SNMP timeout errors with SPINE

thardie
Posts: 4
Joined: Fri May 21, 2010 12:28 pm

SNMP timeout errors with SPINE

Post by thardie »

I've done quite a bit of digging into this, and I'm 99% sure it is a SPINE/net-snmp bug. It shows up on my network for at least one host every polling interval. I've taken sniffs at the time of the errors, and every single GET request has a response in the sniff. I'd ordinarily post the sniffs, but they contain our production SNMP community string. I've run several tests, and every time the poller complains about a timeout, the sniff shows that every GET received a response.

FYI, this is the error I'm talking about (I'm running 216 hosts, 1 process, 10 threads per process, 10 requests max):


05/28/2010 09:45:15 PM - SPINE: Poller[0] Host[134] DS[2874] WARNING: SNMP timeout detected [500 ms], ignoring host '208.93.137.20'
05/28/2010 09:45:15 PM - SPINE: Poller[0] Host[134] DS[2873] WARNING: SNMP timeout detected [500 ms], ignoring host '208.93.137.20'
05/28/2010 09:45:15 PM - SPINE: Poller[0] Host[134] DS[2872] WARNING: SNMP timeout detected [500 ms], ignoring host '208.93.137.20'
05/28/2010 09:45:15 PM - SPINE: Poller[0] Host[134] DS[2871] WARNING: SNMP timeout detected [500 ms], ignoring host '208.93.137.20'
05/28/2010 09:45:15 PM - SPINE: Poller[0] Host[134] DS[2870] WARNING: SNMP timeout detected [500 ms], ignoring host '208.93.137.20'
05/28/2010 09:45:15 PM - SPINE: Poller[0] Host[134] DS[2869] WARNING: SNMP timeout detected [500 ms], ignoring host '208.93.137.20'
05/28/2010 09:45:15 PM - SPINE: Poller[0] Host[134] DS[2868] WARNING: SNMP timeout detected [500 ms], ignoring host '208.93.137.20'
05/28/2010 09:45:15 PM - SPINE: Poller[0] Host[134] DS[2094] WARNING: SNMP timeout detected [500 ms], ignoring host '208.93.137.20'
05/28/2010 09:45:15 PM - SPINE: Poller[0] Host[134] DS[2094] WARNING: SNMP timeout detected [500 ms], ignoring host '208.93.137.20'
05/28/2010 09:45:15 PM - SPINE: Poller[0] Host[134] DS[2093] WARNING: SNMP timeout detected [500 ms], ignoring host '208.93.137.20'

I'm a C/C++ developer, and will be happy to assist in debugging this with a developer.
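
To see whether the warnings cluster on particular hosts or data sources, the log lines quoted above can be tallied with a short script. This is only a rough sketch: the regular expression is inferred from the lines in this post (it is not taken from the SPINE source), and the function name `summarize_timeouts` is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the warning lines quoted in this thread;
# it is an assumption, not something taken from the SPINE source code.
LINE_RE = re.compile(
    r"Host\[(?P<host_id>\d+)\] DS\[(?P<ds_id>\d+)\] WARNING: "
    r"SNMP timeout detected \[(?P<timeout_ms>\d+) ms\], "
    r"ignoring host '(?P<host>[^']+)'"
)


def summarize_timeouts(lines):
    """Count timeout warnings per host and per (host id, DS id) pair."""
    per_host = Counter()
    per_ds = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a timeout warning, skip it
        per_host[match["host"]] += 1
        per_ds[(int(match["host_id"]), int(match["ds_id"]))] += 1
    return per_host, per_ds


# Three of the lines from the log excerpt above.
SAMPLE = [
    "05/28/2010 09:45:15 PM - SPINE: Poller[0] Host[134] DS[2094] WARNING: SNMP timeout detected [500 ms], ignoring host '208.93.137.20'",
    "05/28/2010 09:45:15 PM - SPINE: Poller[0] Host[134] DS[2094] WARNING: SNMP timeout detected [500 ms], ignoring host '208.93.137.20'",
    "05/28/2010 09:45:15 PM - SPINE: Poller[0] Host[134] DS[2093] WARNING: SNMP timeout detected [500 ms], ignoring host '208.93.137.20'",
]

if __name__ == "__main__":
    hosts, data_sources = summarize_timeouts(SAMPLE)
    print(hosts.most_common())
    print(data_sources.most_common())
```

Run against a full cacti.log, a tally like this would show whether one host accounts for most of the warnings in an interval or whether they are spread across many hosts.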
skol
Posts: 41
Joined: Mon Nov 10, 2003 3:06 pm

Post by skol »

Same thing here, although I've got 10,000 devices. It doesn't seem to affect Cacti's graphing, so might this error be spurious?
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

There are multiple reasons for this:

1) The device's SNMP community is wrong.
2) The SNMP timeout is too low.
3) The device's Max OID's setting is too high, so each request takes longer to gather data.
4) SNMP Ping does not work correctly for the device.

Remediation steps:

1) Increase SNMP Timeout. For busy devices, I have had to go as high as 4 seconds.
2) Reduce Max OID's (remember that for a high-latency WAN device, this will slow polling).
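
One way to judge whether a higher timeout is likely to help is to time a single GET against the device from outside the poller. The sketch below shells out to the Net-SNMP `snmpget` command using its `-t` (timeout in seconds) and `-r` (retries) options. It assumes SNMP v2c and queries sysUpTime.0; neither is stated in this thread. The helper names are hypothetical, and the measured time includes process start-up overhead, so treat it as a rough upper bound.

```python
import subprocess
import time

# sysUpTime.0, chosen only as a cheap OID that most agents answer (assumption).
DEFAULT_OID = "1.3.6.1.2.1.1.3.0"


def build_snmpget_cmd(host, community, oid=DEFAULT_OID, timeout_s=0.5, retries=0):
    """Build an snmpget command line; SNMP v2c is assumed here."""
    return [
        "snmpget",
        "-v", "2c",
        "-c", community,
        "-t", str(timeout_s),  # timeout in seconds, fractions allowed
        "-r", str(retries),    # 0 retries so one slow reply is not hidden
        host,
        oid,
    ]


def time_snmpget(host, community, **kwargs):
    """Run one snmpget and return (succeeded, elapsed milliseconds)."""
    cmd = build_snmpget_cmd(host, community, **kwargs)
    start = time.monotonic()
    result = subprocess.run(cmd, capture_output=True, text=True)
    elapsed_ms = (time.monotonic() - start) * 1000.0
    return result.returncode == 0, elapsed_ms
```

Calling `time_snmpget(host, community, timeout_s=0.5)` and then again with `timeout_s=4` for a busy device gives a rough idea of whether its replies arrive inside the shorter window.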

TheWitness
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages


For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?
skol
Posts: 41
Joined: Mon Nov 10, 2003 3:06 pm

Re:

Post by skol »

TheWitness wrote: Remediation steps:

1) Increase SNMP Timeout. For busy devices, I have had to go as high as 4 seconds.
2) Reduce Max OID's (remember if it's a WAN device aka high latency, this will slow polling).

These settings were already in use before the errors started happening (both as the global setting and as the per-host setting):

- SNMP timeout = 2000
- Max OID's = 1

Spine still shows :

09/27/2010 09:36:56 AM - SPINE: Poller[0] Host[3370] DS[71096] WARNING: SNMP timeout detected [500 ms], ignoring host 'hostname'

The graphs continue to update, however.
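
The mismatch described here (a 2000 ms timeout configured, 500 ms reported in the warning) can be checked across a whole log with a few lines of code. This is a sketch only: the pattern is inferred from the log line above, `configured_ms` is whatever value is set in the device settings, and the function names are hypothetical. It flags the discrepancy but says nothing about its cause.

```python
import re

# Pattern inferred from the warning text quoted in this thread (assumption).
TIMEOUT_RE = re.compile(r"SNMP timeout detected \[(\d+) ms\]")


def reported_timeout_ms(line):
    """Return the timeout value a warning line reports, or None."""
    match = TIMEOUT_RE.search(line)
    return int(match.group(1)) if match else None


def timeout_mismatches(lines, configured_ms):
    """Return the warning lines whose reported timeout differs from the configured one."""
    flagged = []
    for line in lines:
        reported = reported_timeout_ms(line)
        if reported is not None and reported != configured_ms:
            flagged.append(line)
    return flagged


# The line from this post, checked against the configured 2000 ms.
SAMPLE_LINE = (
    "09/27/2010 09:36:56 AM - SPINE: Poller[0] Host[3370] DS[71096] WARNING: "
    "SNMP timeout detected [500 ms], ignoring host 'hostname'"
)

if __name__ == "__main__":
    for line in timeout_mismatches([SAMPLE_LINE], configured_ms=2000):
        print(line)
```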
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Re: SNMP timeout errors with SPINE

Post by TheWitness »

That DS (data source) is likely bad.

TheWitness
skol
Posts: 41
Joined: Mon Nov 10, 2003 3:06 pm

Re: SNMP timeout errors with SPINE

Post by skol »

As of today, the errors have stopped without any human intervention. /shrug