On Monday, my Cacti system (Cacti version 0.8.7.e with cactid as the poller) suddenly began to experience very high CPU load during polling cycles, and most of the graphs that use Perl scripts for data collection began showing no data. I still don't know what is causing this CPU load (this Linux system is dedicated to Cacti). I should also add that I am seeing more "The POPEN timed out" messages in my Cacti logs than before. Could that be a clue to the cause of the problem?
As I couldn't find the root cause, I decided to try newer versions of the poller and Cacti, so I upgraded to Cacti 0.8.7.g and Spine 0.8.7.g. This has not improved anything: the load average is still heavy during polling cycles, and all my Perl scripts return 0 as their result:
SPINE: Poller[0] Host[717] TH[1] DS[4942] SCRIPT: /usr/bin/perl /var/www/cacti/scripts/riverbed/riverbed_traffic-0_0-generic.pl "SNMP_COMMUNITY" a.b.c.d , output: 0
When I run the same command by hand, it works, but much more slowly than normal because of the system load:
time /usr/bin/perl /var/www/cacti/scripts/riverbed/riverbed_traffic-0_0-generic.pl "SNMP_COMMUNITY" a.b.c.d
InLan0_0:3207129987 OutWan0_0:2216490431 OutLan0_0:297665518 InWan0_0:2332069408
real 0m13.068s
user 0m0.487s
sys 0m0.020s
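To check whether these slow runs line up with the "POPEN timed out" messages, the script could be timed against a limit while a polling cycle is in progress. This is only a minimal sketch: `run_timed` is a hypothetical helper, not part of Cacti or Spine, and the limit passed to it is an assumption that should be set to whatever script timeout is configured for the poller.

```shell
# Hypothetical helper (not part of Cacti or Spine): run a command, discard its
# output, and report whether it took longer than a limit given in whole seconds.
run_timed() {
    rt_limit="$1"
    shift
    rt_start=$(date +%s)
    "$@" > /dev/null 2>&1
    rt_status=$?
    rt_end=$(date +%s)
    rt_elapsed=$((rt_end - rt_start))
    if [ "$rt_elapsed" -gt "$rt_limit" ]; then
        echo "SLOW ${rt_elapsed}s exit=${rt_status}"
    else
        echo "OK ${rt_elapsed}s exit=${rt_status}"
    fi
}

# Example call (the limit of 25 is an assumed value, adjust to the poller settings):
# run_timed 25 /usr/bin/perl /var/www/cacti/scripts/riverbed/riverbed_traffic-0_0-generic.pl "SNMP_COMMUNITY" a.b.c.d
```

Running this a few times during a polling cycle and a few times between cycles would show whether the script only exceeds the limit while the poller is busy.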
When I switch to CMD.php as the poller, the load average is much lower than with Spine, and the scripts are executed correctly with normal results. The only drawback with CMD.php is that the polling cannot be completed within the 300 s timeframe (because of the large number of hosts/objects to poll).
Any ideas?
Thanks