Slow Script/Command execution

taliz
Posts: 22
Joined: Tue Feb 03, 2009 10:38 am

Post by taliz »

Hi there,

I have been implementing Cacti at my job and it's becoming increasingly important for our operation.
Lately I have created some graphs for a couple of our SANs, where I'm using check_nrpe (a Nagios program) to poll them and get data.
It seems to work fine, and check_nrpe gets the data within a second. However, and here's the real question, it seems that Cacti only runs one check_nrpe at a time, and I cannot understand why. Since it needs to run check_nrpe a few times for each SAN, it now takes up two thirds of the total Cacti polling time. This is a problem because we are running with a 1-minute poller interval, and there are thousands more items we need to graph. I have tried increasing poller processes and spine threads (we're using spine), but to no avail.
The machine we use, which is dedicated to Cacti, is an 8x 2.66 GHz system with 4 GB RAM and 8x 146 GB 10k rpm SAS drives in RAID 10. The load is negligible, between 0 and 1, so there are no performance issues with the Cacti host. The OS is CentOS 5.3 32-bit.
I have attached relevant parts of "Tech support", and parts of the cacti.log can be found at http://taliz.rtfm.se/cactiout.log (I couldn't attach it here for some reason).

Any pointers and hints much appreciated.
Attachments
cactitech.txt
taliz
Posts: 22
Joined: Tue Feb 03, 2009 10:38 am

Post by taliz »

I have also noticed that lately I have been getting large gaps at night time when the backups run. Extremely frustrating, to say the least. :(
Is there anything one can do about this? It looks like everything breaks because one or two hosts it polls get a bit lagged.

"07/20/2009 01:41:57 AM - SPINE: Poller[0] WARNING: SS[0] The PHP Script Server did not respond in time and will therefore be restarted" <- broken?

I tried upping the threads but that didn't help; it started working again when the backups were done.
I suppose the only thing you can do is try to find which host is lagging and disable that one?
taliz
Posts: 22
Joined: Tue Feb 03, 2009 10:38 am

Post by taliz »

I looked over our scripts and templates and noticed that we're still using the temysql version that isn't using the PHP script server, so I'm going to try to swap to the one thewitness converted.

I'm also going to try and write myself a ss_ script to run check_nrpe through, and see if that will make it spawn more of them.
taliz
Posts: 22
Joined: Tue Feb 03, 2009 10:38 am

Post by taliz »

Running check_nrpe through an ss_ script didn't help at all; it still only executes one at a time.

Output:
07/20/2009 10:43:36 PM - SPINE: Poller[0] Host[191] DS[5703] SS[5] SERVER: /usr/share/cacti/scripts/ss_eva.php ss_eva <censored>.150 FP4 a, output: controller:a portname:FP4 readreqps:0 readmbps:0.00 readlatencyms:0.0 writereqps:0 writembps:0.00 writelatencyms:0.0
07/20/2009 10:43:36 PM - SPINE: Poller[0] Host[188] DS[5697] SS[4] SERVER: /usr/share/cacti/scripts/ss_eva.php ss_eva <censored>.100 FP2 a, output: controller:a portname:FP2 readreqps:14 readmbps:0.90 readlatencyms:2.6 writereqps:13 writembps:0.92 writelatencyms:0.9
07/20/2009 10:43:37 PM - SPINE: Poller[0] Host[191] DS[5708] SS[6] SERVER: /usr/share/cacti/scripts/ss_eva.php ss_eva <censored>.150 FP1 b, output: controller:b portname:FP1 readreqps:0 readmbps:0.00 readlatencyms:0.0 writereqps:44 writembps:0.36 writelatencyms:0.2
07/20/2009 10:43:38 PM - SPINE: Poller[0] Host[188] DS[5704] SS[7] SERVER: /usr/share/cacti/scripts/ss_eva.php ss_eva <censored>.100 FP1 b, output: controller:b portname:FP1 readreqps:0 readmbps:0.00 readlatencyms:0.0 writereqps:0 writembps:0.00 writelatencyms:0.0
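
For anyone debugging similar output: the lines above use the space-separated name:value format that the script returns to the poller. Below is a minimal Python sketch for splitting one such line into fields, which can help when checking by hand that every expected field is present. The function name `parse_script_output` is made up for illustration and is not part of Cacti.

```python
def parse_script_output(line):
    """Split 'name:value name:value ...' into a dict.

    Numeric values become floats; anything else (for example the
    controller letter or the port name) is kept as a string.
    """
    fields = {}
    for token in line.split():
        name, sep, value = token.partition(":")
        if not sep:
            continue  # skip tokens that are not name:value pairs
        try:
            fields[name] = float(value)
        except ValueError:
            fields[name] = value
    return fields


# Sample taken from the log output above.
sample = (
    "controller:a portname:FP2 readreqps:14 readmbps:0.90 "
    "readlatencyms:2.6 writereqps:13 writembps:0.92 writelatencyms:0.9"
)
fields = parse_script_output(sample)
```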
taliz
Posts: 22
Joined: Tue Feb 03, 2009 10:38 am

Post by taliz »

Is there anyone alive in this forum at all? :cry:
Lt_Flash
Posts: 6
Joined: Thu Jul 16, 2009 4:40 am

Post by Lt_Flash »

Have you tried increasing max_execution_time in php.ini? Raise it from 30 seconds to a higher value.
taliz
Posts: 22
Joined: Tue Feb 03, 2009 10:38 am

Post by taliz »

Lt_Flash wrote:Have you tried to increase max execution time in php.ini? Set it from 30 seconds to higher value.
Thanks for your reply, although I don't see how that would make it execute the scripts simultaneously?
taliz
Posts: 22
Joined: Tue Feb 03, 2009 10:38 am

Post by taliz »

Perhaps I should have named this thread "parallel processing issue".
taliz
Posts: 22
Joined: Tue Feb 03, 2009 10:38 am

Post by taliz »

FWIW, I'm looking into using snmptools to poll the SAN appliance server with snmp instead of check_nrpe. If that works it should be a lot faster. It doesn't solve the parallel processing problem, but it would be a workaround.
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

taliz wrote:FWIW, I'm looking into using snmptools to poll the SAN appliance server with snmp instead of check_nrpe. If that works it should be a lot faster. It doesn't solve the parallel processing problem, but it would be a workaround.
It's not a workaround but the recommended solution.
The "parallel processing" issue may be valid, but NRPE will be orders of magnitude slower than spine/SNMP processing.
As a workaround, you may want to pay attention to the NRPE script timeout.
Reinhard
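
To illustrate the timeout point: if checks really do run one at a time, a single lagging host can hold up everything queued behind it, so it may help to put a hard upper bound on each call. The following is a rough Python sketch, not anything shipped with Cacti or NRPE. The helper name `run_check` and the choice of returning "U" (the RRDtool marker for an unknown value) on failure are assumptions made for illustration.

```python
import subprocess


def run_check(cmd, timeout_s=5.0, unknown="U"):
    """Run one check command with a hard timeout.

    Returns the command's stdout, or `unknown` if the command times out,
    cannot be started, or prints nothing. The exit code is deliberately
    ignored, because Nagios-style plugins exit non-zero for WARNING or
    CRITICAL while still printing usable data.
    """
    try:
        done = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except (subprocess.TimeoutExpired, OSError):
        return unknown
    output = done.stdout.strip()
    return output if output else unknown
```

Here `cmd` would be the argument list for whatever check is being wrapped (for example the check_nrpe call already in use), and the timeout should stay well below the poller interval.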
Linegod
Developer
Posts: 1626
Joined: Thu Feb 20, 2003 10:16 am
Location: Canada

Post by Linegod »

Since I'm the king of evil hacks....

Have you considered running check_nrpe from cron, writing the output to a file, and using a script to grab that data? Ugly evil hack, but occasionally useful....
--
Live fast, die young
You're sucking up my bandwidth.

J.P. Pasnak,CD
CCNA, LPIC-1
http://www.warpedsystems.sk.ca
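
A rough Python sketch of the cron idea described above, under some assumptions: a collector started from cron runs all checks in parallel and writes each result to a cache file, and the script that Cacti calls only reads the cached line. The names `collect` and `read_cached`, the `.out` file suffix, and the 120-second staleness limit are all hypothetical choices for illustration.

```python
import os
import subprocess
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def run_one(cmd, timeout_s=10.0):
    """Run one check; return its stdout, or '' on timeout or start failure."""
    try:
        done = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
        return done.stdout.strip()
    except (subprocess.TimeoutExpired, OSError):
        return ""


def collect(checks, cache_dir, workers=8):
    """Run all checks in parallel and cache each output in cache_dir.

    `checks` maps a short name to an argument list. Meant to be run
    from cron, outside the Cacti poller.
    """
    os.makedirs(cache_dir, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps input order, so names and outputs line up.
        outputs = dict(zip(checks, pool.map(run_one, checks.values())))
    for name, text in outputs.items():
        if not text:
            continue  # leave the old cache file alone if the check failed
        # Write to a temp file and rename, so a reader never sees a
        # half-written file. mkstemp creates it readable by the owner
        # only; adjust permissions if the poller runs as another user.
        fd, tmp = tempfile.mkstemp(dir=cache_dir)
        with os.fdopen(fd, "w") as handle:
            handle.write(text + "\n")
        os.replace(tmp, os.path.join(cache_dir, name + ".out"))
    return outputs


def read_cached(name, cache_dir, max_age_s=120.0):
    """Return the cached line for `name`, or 'U' if missing or stale."""
    path = os.path.join(cache_dir, name + ".out")
    try:
        if time.time() - os.path.getmtime(path) > max_age_s:
            return "U"
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return "U"
```

The trade-off is that graphed values lag by up to one cron interval, and stale data has to be detected explicitly, which is what the age check in `read_cached` is for.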