About 30% of my Cacti graphs (and their associated RRDs) have not been updated for two days. They stopped at seemingly random times, for no specific reason that I can find.
Looking into the issue, I can see the following for a basic Cisco CPU data collection Data Source in cacti.log:
2021/03/20 16:00:06 - SPINE: Poller[1] Device[64] HT[1] DS[1423] SNMP: v2: a.b.c.d, dsname: 5min_cpu, oid: .1.3.6.1.4.1.9.2.1.58.0, value: U
2021/03/20 16:05:05 - SPINE: Poller[1] WARNING: Invalid Response, Device[64] HT[1] DS[1423] SNMP: v2: a.b.c.d, dsname: 5min_cpu, oid: .1.3.6.1.4.1.9.2.1.58.0, value:
However, I can snmpget that OID without issue:
# snmpget -v 2c -c public a.b.c.d 1.3.6.1.4.1.9.2.1.58.0
SNMPv2-SMI::enterprises.9.2.1.58.0 = INTEGER: 2
Cacti's debug option in 1.2.16 shows the issue is an invalid response received from the host. The RRD, graphs, permissions, etc. are fine.
There's obviously something wrong inside Spine or Cacti. Next steps?
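To see how widespread the problem is, the log lines above could be tallied per device and data source. The following is only a rough sketch: it assumes every relevant entry in cacti.log has the same layout as the two lines quoted above, and the function name `count_failed_samples` is made up for illustration, not part of Cacti or Spine.

```python
import re
from collections import Counter

# Matches the SPINE sample lines quoted above, e.g.
# "... Device[64] HT[1] DS[1423] SNMP: v2: a.b.c.d, dsname: 5min_cpu, oid: ..., value: U"
# Assumption: all sample lines in cacti.log follow this layout.
SAMPLE_RE = re.compile(
    r"Device\[(?P<device>\d+)\].*?DS\[(?P<ds>\d+)\].*?"
    r"dsname: (?P<dsname>[^,]+), oid: (?P<oid>[^,]+), value:\s*(?P<value>.*)$"
)


def count_failed_samples(lines):
    """Count samples per (device id, data source id) that look failed.

    A sample is treated as failed when its value is empty or "U",
    or when the line carries the "Invalid Response" warning.
    """
    failures = Counter()
    for line in lines:
        match = SAMPLE_RE.search(line.rstrip("\n"))
        if match is None:
            continue  # not a per-data-source sample line
        value = match.group("value").strip()
        if value in ("", "U") or "Invalid Response" in line:
            key = (int(match.group("device")), int(match.group("ds")))
            failures[key] += 1
    return failures
```

Passing an open file handle for cacti.log (its path depends on the install) to `count_failed_samples` would give a count per (device, data source) pair, which should make it easier to tell whether the failures cluster on particular hosts.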
Spine suddenly does not collect data for some hosts
Re: Spine suddenly does not collect data for some hosts
Just to be clear, the poller exits cleanly. However, I have about 50 devices, not the 11 reported in the final line of this log:
2021/03/15 16:50:01 - SPINE: Poller[1] DEBUG: In Poller, About to Start Polling of Device for Device ID 5
2021/03/15 16:50:01 - SPINE: Poller[1] Device[5] DEBUG: Entering SNMP Ping
2021/03/15 16:50:01 - SPINE: Poller[1] Updating Full System Information Table
....
2021/03/20 16:51:13 - SPINE: Poller[1] Device[148] HT[1] Total Time: 7.6 Seconds
2021/03/20 16:51:13 - SPINE: Poller[1] DEVDBG: SQL:UPDATE host SET polling_time=1616219473.388 - 1616219465.788 WHERE id=148
2021/03/20 16:51:13 - SPINE: Poller[1] Device[148] HT[1] DEBUG: HOST COMPLETE: About to Exit Device Polling Thread Function
2021/03/20 16:51:13 - SPINE: Poller[1] DEBUG: The Value of Active Threads is 0 for Device ID 148
2021/03/20 16:51:13 - SPINE: Poller[1] POLLER: Active Threads is 0, Pending is 0
2021/03/20 16:51:14 - SPINE: Poller[1] SPINE: The Final Value of Threads is 0
2021/03/20 16:51:14 - SPINE: Poller[1] DEVDBG: SQL:REPLACE INTO settings (name,value) VALUES ('date',NOW())
2021/03/20 16:51:14 - SPINE: Poller[1] DEVDBG: SQL:UPDATE poller_time SET end_time=NOW() WHERE poller_id=1 AND pid=7980
2021/03/20 16:51:14 - SPINE: Poller[1] DEBUG: Thread Cleanup Complete
2021/03/20 16:51:14 - SPINE: Poller[1] DEBUG: SS[0] Script Server Shutdown Started
2021/03/20 16:51:14 - SPINE: Poller[1] DEBUG: SS[1] Script Server Shutdown Started
2021/03/20 16:51:14 - SPINE: Poller[1] DEBUG: SS[2] Script Server Shutdown Started
2021/03/20 16:51:14 - SPINE: Poller[1] DEBUG: SS[3] Script Server Shutdown Started
2021/03/20 16:51:14 - SPINE: Poller[1] DEBUG: SS[4] Script Server Shutdown Started
2021/03/20 16:51:14 - SPINE: Poller[1] DEBUG: SS[5] Script Server Shutdown Started
2021/03/20 16:51:14 - SPINE: Poller[1] DEBUG: SS[6] Script Server Shutdown Started
2021/03/20 16:51:14 - SPINE: Poller[1] DEBUG: SS[7] Script Server Shutdown Started
2021/03/20 16:51:14 - SPINE: Poller[1] DEBUG: SS[8] Script Server Shutdown Started
2021/03/20 16:51:14 - SPINE: Poller[1] DEBUG: SS[9] Script Server Shutdown Started
2021/03/20 16:51:14 - SPINE: Poller[1] DEBUG: PHP Script Server Pipes Closed
2021/03/20 16:51:14 - SPINE: Poller[1] DEBUG: Allocated Variable Memory Freed
2021/03/20 16:51:14 - SPINE: Poller[1] DEBUG: MYSQL Free & Close Completed
2021/03/20 16:51:14 - SPINE: Poller[1] DEBUG: Net-SNMP Close Completed
2021/03/20 16:51:14 - SPINE: Poller[1] Time: 63.2960 s, Threads: 30, Devices: 11
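The mismatch between the reported device count and the real one can be checked automatically on each run. This is a minimal sketch, assuming the final stats line always has the `Time: ..., Threads: ..., Devices: ...` layout shown above and that `Devices` is the number of devices this Spine run handled. The helper names and the expected count of 50 are illustrative only.

```python
import re

# Matches the closing stats line of a Spine run, e.g.
# "... SPINE: Poller[1] Time: 63.2960 s, Threads: 30, Devices: 11"
STATS_RE = re.compile(
    r"Time: (?P<time>[\d.]+) s, Threads: (?P<threads>\d+), Devices: (?P<devices>\d+)"
)


def parse_spine_stats(line):
    """Return the run time, thread count and device count, or None if absent."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    return {
        "time_s": float(match.group("time")),
        "threads": int(match.group("threads")),
        "devices": int(match.group("devices")),
    }


def missing_devices(line, expected_devices):
    """Return how many expected devices the stats line does not account for."""
    stats = parse_spine_stats(line)
    if stats is None:
        return None
    return expected_devices - stats["devices"]
```

For the stats line above and an expected count of 50, `missing_devices` would report 39 devices unaccounted for in this run.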
- TheWitness
- Developer
Re: Spine suddenly does not collect data for some hosts
Test with the 1.2.x branch and ensure that you use the --poller option.
True understanding begins only when we realize how little we truly understand...
Life is an adventure, let yours begin with Cacti!
Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages
For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?