I have had this happen from time to time, maybe once or twice a day, but since upgrading to 1.2.19 it seems to happen multiple times an hour. Whenever any device is slow to answer SNMP queries, or unable to answer because of its own workload, the poller gets stuck for longer than the 300-second polling interval.
The resulting email just states:
WARNING: Cacti Polling Cycle Exceeded Poller Interval by 2.8581240177155 seconds
The Cacti log for that cycle shows (newest entry first):
2022/02/22 12:05:17 - SYSTEM STATS: Time:16.2156 Method:spine Processes:3 Threads:16 Hosts:261 HostsPerProcess:87 DataSources:35107 RRDsProcessed:8603
2022/02/22 12:05:04 - MAILER INFO: Mail successfully sent via SMTP from 'Cacti <cacti@my.domain>', to 'Administrator <admin@my.domain>', cc '', and took 0.07 seconds, Subject 'Cacti System Warning'
2022/02/22 12:05:04 - SYSTEM STATS: Time:302.8541 Method:spine Processes:3 Threads:16 Hosts:261 HostsPerProcess:87 DataSources:35107 RRDsProcessed:8107
2022/02/22 12:03:08 - CMDPHP PHP ERROR Backtrace: (CactiShutdownHandler())
2022/02/22 12:03:08 - ERROR PHP ERROR: Maximum execution time of 299 seconds exceeded in file: /wwwsites/cacti/lib/database.php on line: 287
2022/02/22 12:00:20 - SPINE: Poller[Main Poller] PID[5228] PT[140192606308096] Device[MyRemoteDevice] HT[1] DS[MyRemoteDevice - Traffic - 192.192.192.193 - lan0] Graphs[MyRemoteDevice - Traffic - Port1 ] WARNING: SNMP timeout detected [1500 ms], ignoring host 'name.my.domain'
2022/02/22 12:00:20 - SPINE: Poller[Main Poller] PID[5228] PT[140192606308096] Device[MyRemoteDevice] HT[1] DS[MyRemoteDevice - Traffic - 192.192.192.192 - mgmt0] Graphs[MyRemoteDevice - Traffic - Port0 ] WARNING: SNMP timeout detected [1500 ms], ignoring host 'name.my.domain'
2022/02/22 12:00:20 - SPINE: Poller[Main Poller] PID[5228] PT[140192606308096] Device[MyRemoteDevice] HT[1] DS[MyRemoteDevice - Traffic - 192.192.192.192 - mgmt0] Graphs[MyRemoteDevice - Traffic - Port0 ] WARNING: SNMP timeout detected [1500 ms], ignoring host 'name.my.domain'
2022/02/22 12:00:20 - SPINE: Poller[Main Poller] PID[5228] PT[140192606308096] Device[MyRemoteDevice] HT[1] DS[MyRemoteDevice - Traffic - 8.8.8.8 - wanport1] Graphs[MyRemoteDevice - Traffic - wanport1 ] WARNING: SNMP timeout detected [1500 ms], ignoring host 'name.my.domain'
2022/02/22 12:00:20 - SPINE: Poller[Main Poller] PID[5228] PT[140192606308096] Device[MyRemoteDevice] HT[1] DS[MyRemoteDevice - Traffic - 8.8.8.8 - wanport1] Graphs[MyRemoteDevice - Traffic - wanport1 ] WARNING: SNMP timeout detected [1500 ms], ignoring host 'name.my.domain'
The command line that seems to be hanging, as seen in top, is:
/usr/bin/php -q /www/cacti/poller.php --force
I have had a couple of instances where the poller was stuck for over 1,000 seconds, which helped cause two more polling cycles to throw errors. The "hanging" device is not always the same. I added a remote poller to rule out the possibility that the single Cacti box is overloaded. Where do I go from here? Is anyone else experiencing this, which might point to a bug in this version?
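In case it helps anyone compare their own log against mine, here is a rough Python sketch of how the overrun cycles and the timing-out devices could be pulled out of a Cacti log. It is only a sketch: the regular expressions assume lines formatted exactly like the excerpt above, the 300-second interval is my own setting, and the function name summarize is made up for illustration.

```python
import re
import sys
from collections import Counter

POLL_INTERVAL = 300.0  # seconds; assumed to match the configured polling interval

# Both patterns assume the log line layout shown in the excerpt above.
STATS_RE = re.compile(r"^(\S+ \S+) - SYSTEM STATS: Time:([\d.]+)")
TIMEOUT_RE = re.compile(r"Device\[([^\]]+)\].*SNMP timeout detected")


def summarize(lines, interval=POLL_INTERVAL):
    """Return (overruns, timeouts) for an iterable of log lines.

    overruns: list of (timestamp, seconds) for cycles longer than interval.
    timeouts: Counter of SNMP timeout warnings keyed by device name.
    """
    overruns = []
    timeouts = Counter()
    for line in lines:
        stats = STATS_RE.match(line)
        if stats:
            elapsed = float(stats.group(2))
            if elapsed > interval:
                overruns.append((stats.group(1), elapsed))
            continue
        timeout = TIMEOUT_RE.search(line)
        if timeout:
            timeouts[timeout.group(1)] += 1
    return overruns, timeouts


if __name__ == "__main__":
    # Usage: pass the path to the Cacti log file as the only argument.
    with open(sys.argv[1], errors="replace") as fh:
        slow_cycles, timeout_counts = summarize(fh)
    for stamp, elapsed in slow_cycles:
        print(f"{stamp}: cycle took {elapsed:.1f}s")
    for device, count in timeout_counts.most_common():
        print(f"{device}: {count} SNMP timeouts")
```

Lining up the devices with the most timeouts against the timestamps of the slow cycles should at least show whether the overruns always follow SNMP timeouts or also happen without them.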
Thanks,
--MPJ