It doesn't happen on every poll, but after leaving it overnight I had about 50 poller processes hung, all on the same snmpget to the same device:
cacti 27755 1 0 09:35 ? 00:00:00 /usr/bin/php -q /opt/cacti-0.8.6g/cmd.php 0 5
cacti 27758 27755 0 09:35 ? 00:00:00 /usr/bin/php /opt/cacti-0.8.6g/script_server.php cmd
cacti 27778 27755 0 09:35 ? 00:00:00 /usr/bin/snmpget -O vt -c <comm> -v 2c -t 1 -r 3 <IP>:161 .1.3.6.1.2.1.1.3.0
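In case it helps anyone compare notes, here is a minimal Python sketch for tallying stuck snmpget processes per target from `ps -ef`-style output like the lines above. It is not part of Cacti; the function name `snmpget_targets` is made up for illustration, and it assumes the agent address and the OID are the last two arguments, as in my process list.

```python
from collections import Counter


def snmpget_targets(ps_lines):
    """Count snmpget processes per target from `ps -ef`-style lines.

    Assumes the eight-column layout UID PID PPID C STIME TTY TIME CMD and
    that the agent address is the second-to-last argument of snmpget.
    """
    counts = Counter()
    for line in ps_lines:
        # Split into at most 8 fields so the full command stays in one piece.
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # header, blank, or malformed line
        argv = fields[7].split()
        if not argv or not argv[0].endswith("snmpget"):
            continue  # not an snmpget process
        if len(argv) >= 3:
            counts[argv[-2]] += 1  # e.g. "<IP>:161"
    return counts
```

Feeding it the output of `ps -ef` (one line per list element) gives a count per device, which makes it easy to confirm that every hung process points at the same host.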
I have 1191 data sources and 403 RRDs. A poller run usually takes 40 to 60 seconds.
Looking back through the logs, I see a number of "Maximum runtime exceeded" errors before yesterday's crash. That makes me lean toward a problem with the poller rather than a MySQL bug. The weird thing is that it ran absolutely fine for a week before the problems started, with no changes to the devices being monitored.
Can anyone suggest a direction for further investigation?
Thanks,
Eric