upgraded to 0.8.7 - spine can't query some interfaces

Post support questions that directly relate to Linux/Unix operating systems.

Moderators: Developers, Moderators

TheWitness
Developer
Posts: 17048
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

That really does not make sense. You need to look into snmp.c.

What happens is this:

1) An array of OIDs is passed to snmp_get_multi
2) The array is pushed into a net-snmp PDU
3) A get request is made
4) If an error is returned, the response references the invalid index
5) Cacti marks the invalid index with "U" in the OID array's return value
6) The invalid PDU index is removed using the snmp_fix_pdu function
7) The request is retried
8) This continues until the snmp_get is successful, so loop 3 to 7
9) When the call finally succeeds, we loop through all the responses and populate the return values, but only for those OIDs in the array that do not have a "U" in their value, since the "U" entries were already removed from the PDU due to prior errors.
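The steps above can be sketched roughly as follows. This is a minimal simulation, not the code in snmp.c: `oid_entry`, `toy_get`, `get_multi`, the `owner` map and the ".bad" suffix convention are all made up for illustration, and no net-snmp calls are used. It only shows the mark, remove, retry and populate cycle, including the 1-based PDU index versus the 0-based array index.

```c
#include <stdio.h>
#include <string.h>

#define MAX_OIDS   16
#define RESULT_LEN 64

/* Hypothetical stand-in for one entry of the OID array. */
typedef struct {
    const char *oid;
    char result[RESULT_LEN];
} oid_entry;

/* Toy agent (an assumption, not net-snmp): OIDs ending in ".bad" fail.
 * Returns 0 on success, or the 1-based PDU index of the first bad OID. */
static int toy_get(const char *pdu[], int pdu_len, char values[][RESULT_LEN]) {
    for (int p = 0; p < pdu_len; p++) {
        size_t n = strlen(pdu[p]);
        if (n >= 4 && strcmp(pdu[p] + n - 4, ".bad") == 0) return p + 1;
    }
    for (int p = 0; p < pdu_len; p++)
        snprintf(values[p], RESULT_LEN, "value-of-%s", pdu[p]);
    return 0;
}

/* Build the PDU, retry while the agent reports a bad index, then copy
 * values back only into entries that are not marked "U". */
static void get_multi(oid_entry oids[], int num_oids) {
    const char *pdu[MAX_OIDS];
    int  owner[MAX_OIDS];               /* PDU slot -> array index */
    char values[MAX_OIDS][RESULT_LEN];
    int  pdu_len = 0, err;

    if (num_oids > MAX_OIDS) num_oids = MAX_OIDS;

    for (int i = 0; i < num_oids; i++) {
        memset(oids[i].result, 0, RESULT_LEN);  /* no stale "U" from earlier calls */
        pdu[pdu_len] = oids[i].oid;
        owner[pdu_len++] = i;
    }

    while ((err = toy_get(pdu, pdu_len, values)) != 0) {
        int slot = err - 1;             /* PDU index is 1-based, array is 0-based */
        strcpy(oids[owner[slot]].result, "U");
        for (int p = slot; p < pdu_len - 1; p++) {  /* drop the bad slot */
            pdu[p] = pdu[p + 1];
            owner[p] = owner[p + 1];
        }
        pdu_len--;
    }

    /* Populate: the response cursor p only advances for non-"U" entries. */
    int p = 0;
    for (int i = 0; i < num_oids && p < pdu_len; i++) {
        if (strcmp(oids[i].result, "U") != 0) {
            snprintf(oids[i].result, RESULT_LEN, "%s", values[p]);
            p++;
        }
    }
}
```

Calling `get_multi` on three entries where the middle OID ends in ".bad" leaves "U" in the middle result and fills the other two. Note that the final loop is only correct if every "U" in the array really corresponds to an OID that was removed from the PDU.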

So, if the issue is still happening, the code must somehow be offsetting the index values. This is, of course, complicated by the array starting at index 0 while the SNMP PDU starts at index 1. This always messes me up.

I did look at the code and it appeared to be valid. However, I added a memset because you have so many bad responses, and there is a chance that the same memory area is being reallocated to the same array on subsequent calls. Without initializing that memory to zeros, a stale value from an earlier call can survive.
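A small sketch of why the memset matters, under the assumption that a result buffer is reused between polls. The names `oid_slot`, `is_undefined` and `prepare_slot` are hypothetical and are not taken from spine; the `clear` flag only exists so both behaviours can be compared.

```c
#include <stdio.h>
#include <string.h>

#define RESULT_LEN 64

/* Hypothetical reused slot holding an OID and its last result. */
typedef struct {
    char oid[RESULT_LEN];
    char result[RESULT_LEN];
} oid_slot;

/* Nonzero when the slot carries the "U" (undefined) marker. */
static int is_undefined(const oid_slot *s) {
    return strcmp(s->result, "U") == 0;
}

/* Prepare a reused slot for a new poll. With clear == 0 the old result
 * survives, which mimics memory that was never re-initialized. */
static void prepare_slot(oid_slot *s, const char *oid, int clear) {
    if (clear) memset(s, 0, sizeof(*s));
    snprintf(s->oid, sizeof(s->oid), "%s", oid);
}
```

If a slot still holds "U" from a failed poll and is prepared without clearing, `is_undefined` stays true for the new OID even though that OID was never queried. Preparing it with `clear` set removes the stale marker.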

I guess if it's still going on, we will have to work this out together online.

TheWitness
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of dozens of Cacti plugins and customizations. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages


For those wondering, I'm still here, but lost in the shadows. Yearning for fewer bugs. Who wants a Cacti 1.3/2.0? Streams anyone?
gthe
Cacti User
Posts: 410
Joined: Sat Jul 29, 2006 1:23 pm
Location: RU

Post by gthe »

sorry for my poor english :oops:
1 -------------------
I added some debug lines to snmp.c and compared the log with tcpdump:

Code:

	
.....................................
status = snmp_sess_synch_response(current_host->snmp_session, pdu, &response);

	/* liftoff, successful poll, process it!! */
	if (status == STAT_SUCCESS) {
		if (response == NULL) {
			SPINE_LOG(("ERROR: An internal Net-Snmp error condition detected in Cacti snmp_get_multi\n"));
			status = STAT_ERROR;
		}else{
			vars1 = response->variables;
			for(i = 0; i < num_oids && vars1; i++) {
				SPINE_LOG(("DEBUG --- 0001 = i=[%i], snmp_oids[i].result=[%s], oid=[%s], = [%i]", i, snmp_oids[i].result, snmp_oids[i].oid, (!IS_UNDEFINED(snmp_oids[i].result))));
				vars1 = vars1->next_variable;
			}
			
			if (response->errstat == SNMP_ERR_NOERROR) {
				vars = response->variables;
				for(i = 0; i < num_oids && vars; i++) {
					SPINE_LOG(("DEBUG --- 001 = i=[%i], snmp_oids[i].result=[%s], oid=[%s], = [%i] ", i, snmp_oids[i].result, snmp_oids[i].oid, (!IS_UNDEFINED(snmp_oids[i].result))));
					if (!IS_UNDEFINED(snmp_oids[i].result)) {
						#ifdef USE_NET_SNMP
						snmp_snprint_value(snmp_oids[i].result, sizeof(snmp_oids[i].result), vars->name, vars->name_length, vars);
						#else
						sprint_value(snmp_oids[i].result, vars->name, vars->name_length, vars);
						#endif
						SPINE_LOG(("    DEBUG --- 002 =snmp_oids[i].result=[%s]", snmp_oids[i].result));
					}
					vars = vars->next_variable;
				}
			}else{
..............
tcpdump:
16:54:24.480136 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto 17, length: 246) system.kinel.snmp > system.kinel.42300: [udp sum ok] { SNMPv2c C=pub@li@_CC22c { GetResponse(194) R=1722062306
interfaces.ifTable.ifEntry.ifInDiscards.2=0
interfaces.ifTable.ifEntry.ifInOctets.6=365135639
interfaces.ifTable.ifEntry.ifOutOctets.6=340119873
interfaces.ifTable.ifEntry.ifOutErrors.2=3
interfaces.ifTable.ifEntry.ifInOctets.7=2430124
interfaces.ifTable.ifEntry.ifOutOctets.7=4538620
interfaces.ifTable.ifEntry.ifOutNUcastPkts.2=[noSuchInstance]
interfaces.ifTable.ifEntry.ifOutErrors.8=0
interfaces.ifTable.ifEntry.ifOutDiscards.8=0
interfaces.ifTable.ifEntry.ifInErrors.8=0 } }
Now from the Cacti log:
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 0001 = i=[0], snmp_oids[i].result=[U], oid=[.1.3.6.1.2.1.2.2.1.13.2], = [0]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 0001 = i=[1], snmp_oids[i].result=[U], oid=[.1.3.6.1.2.1.2.2.1.10.6], = [0]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 0001 = i=[2], snmp_oids[i].result=[498550339], oid=[.1.3.6.1.2.1.2.2.1.16.6], = [1]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 0001 = i=[3], snmp_oids[i].result=[0], oid=[.1.3.6.1.2.1.2.2.1.20.2], = [1]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 0001 = i=[4], snmp_oids[i].result=[2079280152], oid=[.1.3.6.1.2.1.2.2.1.10.7], = [1]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 0001 = i=[5], snmp_oids[i].result=[3736508568], oid=[.1.3.6.1.2.1.2.2.1.16.7], = [1]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 0001 = i=[6], snmp_oids[i].result=[444688694], oid=[.1.3.6.1.2.1.2.2.1.18.2], = [1]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 0001 = i=[7], snmp_oids[i].result=[718969462], oid=[.1.3.6.1.2.1.2.2.1.20.8], = [1]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 0001 = i=[8], snmp_oids[i].result=[468424724], oid=[.1.3.6.1.2.1.2.2.1.19.8], = [1]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 0001 = i=[9], snmp_oids[i].result=[2947161269], oid=[.1.3.6.1.2.1.2.2.1.14.8], = [1]

11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 001 = i=[0], snmp_oids[i].result=[U], oid=[.1.3.6.1.2.1.2.2.1.13.2], = [0]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 001 = i=[1], snmp_oids[i].result=[U], oid=[.1.3.6.1.2.1.2.2.1.10.6], = [0]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 001 = i=[2], snmp_oids[i].result=[498550339], oid=[.1.3.6.1.2.1.2.2.1.16.6], = [1]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 002 =snmp_oids[i].result=[0]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 001 = i=[3], snmp_oids[i].result=[0], oid=[.1.3.6.1.2.1.2.2.1.20.2], = [1]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 002 =snmp_oids[i].result=[365135639]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 001 = i=[4], snmp_oids[i].result=[2079280152], oid=[.1.3.6.1.2.1.2.2.1.10.7], = [1]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 002 =snmp_oids[i].result=[340119873]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 001 = i=[5], snmp_oids[i].result=[3736508568], oid=[.1.3.6.1.2.1.2.2.1.16.7], = [1]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 002 =snmp_oids[i].result=[3]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 001 = i=[6], snmp_oids[i].result=[444688694], oid=[.1.3.6.1.2.1.2.2.1.18.2], = [1]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 002 =snmp_oids[i].result=[2430124]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 001 = i=[7], snmp_oids[i].result=[718969462], oid=[.1.3.6.1.2.1.2.2.1.20.8], = [1]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 002 =snmp_oids[i].result=[4538620]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 001 = i=[8], snmp_oids[i].result=[468424724], oid=[.1.3.6.1.2.1.2.2.1.19.8], = [1]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 002 =snmp_oids[i].result=[No Such Instance currently exists at this OID]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 001 = i=[9], snmp_oids[i].result=[2947161269], oid=[.1.3.6.1.2.1.2.2.1.14.8], = [1]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 002 =snmp_oids[i].result=[0]

11/04/2007 04:54:24 PM - SPINE: Poller[0] Host[183] DS[8774] WARNING: Result from SNMP not valid. 2 [0]=[142414188] Partial Result: ...
11/04/2007 04:54:24 PM - SPINE: Poller[0] Host[183] DS[8774] SNMP: v2: 127.0.0.1, dsname: discards_in, oid: .1.3.6.1.2.1.2.2.1.13.2, value: U

11/04/2007 04:54:24 PM - SPINE: Poller[0] Host[183] DS[8770] WARNING: Result from SNMP not valid. 2 [1]=[142415728] Partial Result: ...
11/04/2007 04:54:24 PM - SPINE: Poller[0] Host[183] DS[8770] SNMP: v2: 127.0.0.1, dsname: traffic_in, oid: .1.3.6.1.2.1.2.2.1.10.6, value: U

11/04/2007 04:54:24 PM - SPINE: Poller[0] Host[183] DS[8770] SNMP: v2: 127.0.0.1, dsname: traffic_out, oid: .1.3.6.1.2.1.2.2.1.16.6, value: 0

11/04/2007 04:54:24 PM - SPINE: Poller[0] Host[183] DS[8774] SNMP: v2: 127.0.0.1, dsname: errors_out, oid: .1.3.6.1.2.1.2.2.1.20.2, value: 365135639

11/04/2007 04:54:24 PM - SPINE: Poller[0] Host[183] DS[8771] SNMP: v2: 127.0.0.1, dsname: traffic_in, oid: .1.3.6.1.2.1.2.2.1.10.7, value: 340119873

AND:
Error # 1:
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 001 = i=[0], snmp_oids[i].result=[U], oid=[.1.3.6.1.2.1.2.2.1.13.2], = [0]
11/04/2007 04:54:24 PM - SPINE: Poller[0] DEBUG --- 001 = i=[1], snmp_oids[i].result=[U], oid=[.1.3.6.1.2.1.2.2.1.10.6], = [0]

These rows are not processed because IS_UNDEFINED(snmp_oids[i].result) is true for them, yet the response does contain values for them:
interfaces.ifTable.ifEntry.ifInDiscards.2=0
interfaces.ifTable.ifEntry.ifInOctets.6=365135639

Error # 2: offset
11/04/2007 04:54:24 PM - SPINE: Poller[0] Host[183] DS[8770] SNMP: v2: 127.0.0.1, dsname: traffic_out, oid: .1.3.6.1.2.1.2.2.1.16.6, value: 0

But the value [0] does not belong to OID .1.3.6.1.2.1.2.2.1.16.6; it belongs to .1.3.6.1.2.1.2.2.1.13.2:
interfaces.ifTable.ifEntry.ifInDiscards.2=0
interfaces.ifTable.ifEntry.ifInOctets.6=365135639
interfaces.ifTable.ifEntry.ifOutOctets.6=340119873

To solve the second error, snmp.c needs to be changed.
Before:

Code:

for(i = 0; i < num_oids && vars; i++) {
   if (!IS_UNDEFINED(snmp_oids[i].result)) {
      #ifdef USE_NET_SNMP
      snmp_snprint_value(snmp_oids[i].result, sizeof(snmp_oids[i].result), vars->name, vars->name_length, vars);
      #else
      sprint_value(snmp_oids[i].result, vars->name, vars->name_length, vars);
      #endif
      vars = vars->next_variable;
   }
}
After:

Code:

for(i = 0; i < num_oids && vars; i++) {
   if (!IS_UNDEFINED(snmp_oids[i].result)) {
      #ifdef USE_NET_SNMP
      snmp_snprint_value(snmp_oids[i].result, sizeof(snmp_oids[i].result), vars->name, vars->name_length, vars);
      #else
      sprint_value(snmp_oids[i].result, vars->name, vars->name_length, vars);
      #endif
   }
   vars = vars->next_variable;
}
I do not know how to solve the first problem.
It looks like snmp_get_multi should receive the snmp_oids array with (!IS_UNDEFINED(snmp_oids[i].result)) == true for every entry.
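One possible reading of the log above, sketched in plain C: if the first two result entries are already flagged undefined when the call starts, but the response still carries a value for every OID, then a loop that advances the response cursor only for defined entries shifts every value by two. The function names `fill_skip_undefined` and `fill_lockstep` are made up for this illustration; the values are the first four from the tcpdump output.

```c
#include <stdio.h>
#include <string.h>

#define RESULT_LEN 64

/* "Before" behaviour: the response cursor v advances only for entries
 * that are not marked "U". */
static void fill_skip_undefined(char results[][RESULT_LEN], int n,
                                const char *resp[], int resp_len) {
    int v = 0;
    for (int i = 0; i < n && v < resp_len; i++) {
        if (strcmp(results[i], "U") != 0) {
            snprintf(results[i], RESULT_LEN, "%s", resp[v]);
            v++;
        }
    }
}

/* "After" behaviour: the cursor advances on every iteration, so entry i
 * always reads response i; "U" entries are still left untouched. */
static void fill_lockstep(char results[][RESULT_LEN], int n,
                          const char *resp[], int resp_len) {
    for (int i = 0; i < n && i < resp_len; i++) {
        if (strcmp(results[i], "U") != 0)
            snprintf(results[i], RESULT_LEN, "%s", resp[i]);
    }
}
```

With results starting as "U", "U", "", "" and the response 0, 365135639, 340119873, 3, the first function puts 0 into entry 2 and 365135639 into entry 3, which matches the traffic_out and errors_out lines in the log. The second function puts 340119873 and 3 there, which matches tcpdump, but entries 0 and 1 still stay "U". That is Error # 1, and this sketch does not address it.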
TheWitness
Developer
Posts: 17048
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

Here is the resolution. I was able to reproduce the problem and resolve it on the plane from Detroit to San Francisco. Thanks for sticking with me on this and providing the good details.

In the end, I was not clearing the memory for each pass. The fix was in poller.c like I had thought, but I had to change it in a few more areas as well.

In addition, I cleaned up snmp.c a little to make it a bit more readable.

TheWitness

http://forums.cacti.net/viewtopic.php?t=24110