[SOLVED] Broken graph tops (even with 64-bit counters)

mhakali
Posts: 13
Joined: Thu Oct 26, 2006 3:20 am

Post by mhakali »

Hello!

I have a peculiar issue with Cacti and graph drawing on my router (Linux 2.6.27, 64-bit kernel). Every time traffic goes above a certain limit (~9MB/s), Cacti stops drawing the line. Example:

http://img99.imageshack.us/img99/2520/incomingext.png

(Note: this graph is in Mbit/s, but the result is the same regardless of unit.)

Between roughly 12:00 and 16:00 the outgoing traffic peaked above this limit.

Hosts behind my router, also polled with 64-bit counters, get correct graphs:

http://img99.imageshack.us/img99/3492/t ... nthost.png

(Note: this graph is in MByte/s, but the result is the same regardless of unit.)

I have tried 32-bit counter polling as well as, of course, 64-bit. Both systems are 64-bit. They run the same snmpd version with the same launch settings. The poller (Cacti) is obviously the same server.

I moved my router from an Intel Xeon-based server with e1000e NICs to a socket 775 platform with sky2 NICs, without any difference.

What am I missing here?

Tips are welcome! :-)

Regards,

Mikael
Last edited by mhakali on Mon Apr 13, 2009 4:55 am, edited 1 time in total.
BSOD2600
Cacti Moderator
Posts: 12171
Joined: Sat May 08, 2004 12:44 pm
Location: USA

Post by BSOD2600 »

Have you followed the guide at http://forums.cacti.net/viewtopic.php?t=24526 ?

Have you looked in the poller cache to verify that Cacti is using the HC counters instead of the regular ones for those problem devices?
mhakali
Posts: 13
Joined: Thu Oct 26, 2006 3:20 am

Post by mhakali »

BSOD2600 wrote: Have you followed the guide at http://forums.cacti.net/viewtopic.php?t=24526 ?
Yes, I have tried 64-bit counters.

Besides, this doesn't happen at the 32-bit counter wrap limit. 110Mbit/s is nearly 13MB/s, whereas mine goes bollocks after 8MB/s.
BSOD2600 wrote: Have you looked in the poller cache to verify that Cacti is using the HC counters instead of the regular ones for those problem devices?
I should probably look up how to do this and how to actually verify it myself, but would you be kind enough to summarize it quickly? :)
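
For reference, the 32-bit wrap limit mentioned above can be checked with a little arithmetic. This is a minimal sketch, assuming a 5-minute (300 s) polling interval; the interval and the function names are assumptions for illustration, not taken from Cacti. A Counter32 has 2^32 distinct values, so once it advances by that much between two polls the poller can no longer tell how far it really moved.

```python
# Highest average rate a 32-bit octet counter can carry before it wraps
# a full turn between two polls. Names and intervals are illustrative.
COUNTER32_SPAN = 2 ** 32  # number of distinct values in a Counter32


def wrap_limit_bytes_per_sec(poll_interval_sec):
    """Rate (bytes/s) at which a Counter32 wraps exactly once per interval."""
    return COUNTER32_SPAN / poll_interval_sec


def to_mbit_per_sec(bytes_per_sec):
    """Convert bytes/s to decimal megabits per second."""
    return bytes_per_sec * 8 / 1_000_000


if __name__ == "__main__":
    interval = 300  # assumed 5-minute polling interval, in seconds
    limit = wrap_limit_bytes_per_sec(interval)
    print(f"{interval}s polling: {limit / 1_000_000:.1f} MB/s "
          f"({to_mbit_per_sec(limit):.1f} Mbit/s)")
```

At 300 s this works out to roughly 14.3 MB/s (about 114.5 Mbit/s), well above the 8 to 9 MB/s where the graph in this thread cuts off, which supports the point that a counter wrap is not the cause here.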
BSOD2600
Cacti Moderator
Posts: 12171
Joined: Sat May 08, 2004 12:44 pm
Location: USA

Post by BSOD2600 »

Look in the poller cache to see which OIDs Cacti is using for those problem graphs.

32bit counters:
ifInOctets -> 1.3.6.1.2.1.2.2.1.10
ifOutOctets -> 1.3.6.1.2.1.2.2.1.16

64bit counters:
ifHCInOctets -> 1.3.6.1.2.1.31.1.1.1.6
ifHCOutOctets -> 1.3.6.1.2.1.31.1.1.1.10

(found in cacti\resource\snmp_queries\interface.xml).
mhakali
Posts: 13
Joined: Thu Oct 26, 2006 3:20 am

Post by mhakali »

BSOD2600 wrote: Look in the poller cache to see which OIDs Cacti is using for those problem graphs.

32bit counters:
ifInOctets -> 1.3.6.1.2.1.2.2.1.10
ifOutOctets -> 1.3.6.1.2.1.2.2.1.16

64bit counters:
ifHCInOctets -> 1.3.6.1.2.1.31.1.1.1.6
ifHCOutOctets -> 1.3.6.1.2.1.31.1.1.1.10

(found in cacti\resource\snmp_queries\interface.xml).
The 64-bit counters seem to be defined as stated above in my interface.xml file.

Besides, look at the graph again: Cacti stops drawing the line entirely when traffic goes above 8MB/s. In that Cisco example it continued to draw a line, but the line was incorrect.

Distinct difference.
BSOD2600
Cacti Moderator
Posts: 12171
Joined: Sat May 08, 2004 12:44 pm
Location: USA

Post by BSOD2600 »

mhakali wrote: The 64-bit counters seem to be defined as stated above in my interface.xml file.
That wasn't my question. Look in the poller cache for that device to see which OIDs Cacti is using: regular or HC counters?

Have you followed http://docs.cacti.net/manual:087:4_help ... #debugging yet?
mhakali
Posts: 13
Joined: Thu Oct 26, 2006 3:20 am

Post by mhakali »

BSOD2600 wrote: That wasn't my question. Look in the poller cache for that device to see which OIDs Cacti is using: regular or HC counters?
I am not 100% sure what information you are after, so I will try to fetch as much as possible! :-)

The information in cacti/resource/snmp_queries/interface.xml describes the interfaces as:

Code:

                <ifHCInOctets>
                        <name>Bytes In - 64-bit Counters</name>
                        <method>walk</method>
                        <source>value</source>
                        <direction>output</direction>
                        <oid>.1.3.6.1.2.1.31.1.1.1.6</oid>
                </ifHCInOctets>
                <ifHCOutOctets>
                        <name>Bytes Out - 64-bit Counters</name>
                        <method>walk</method>
                        <source>value</source>
                        <direction>output</direction>
                        <oid>.1.3.6.1.2.1.31.1.1.1.10</oid>
                </ifHCOutOctets>
And I have chosen to create the graph using the 64-bit counters.

Running snmpwalk -c public -v 2c <host> gives:

Code:

IF-MIB::ifHCInOctets.1 = Counter64: 3163711
IF-MIB::ifHCInOctets.2 = Counter64: 1541721923864
IF-MIB::ifHCInOctets.3 = Counter64: 3435999732
IF-MIB::ifHCInOctets.4 = Counter64: 1909967557691
IF-MIB::ifHCInOctets.5 = Counter64: 20298377446
IF-MIB::ifHCInOctets.6 = Counter64: 15795606850
IF-MIB::ifHCInOctets.7 = Counter64: 9319997
IF-MIB::ifHCInOctets.8 = Counter64: 0
IF-MIB::ifHCInOctets.10 = Counter64: 22358862380
IF-MIB::ifHCInOctets.13 = Counter64: 25299022
IF-MIB::ifHCInOctets.14 = Counter64: 72839675292
Among other things.

The interface being graphed is:

Code:

IF-MIB::ifName.2 = STRING: eth0
And it seemingly has all the required properties to get this.
BSOD2600 wrote: Have you followed http://docs.cacti.net/manual:087:4_help.2_debugging#debugging yet?
Poller type: cmd.php / poller.php
Polling interval: every 5 minutes. I have also tried every minute, but the graphs remain unchanged.

I have turned on poller debug.

I found this information relevant to the traffic:

Code:


$ grep -e 1.3.6.1.2.1.31.1.1.1.6.2 -e 1.3.6.1.2.1.31.1.1.1.10.2 cacti_debug.log 
04/12/2009 05:10:02 PM - CMDPHP: Poller[0] Host[16] DS[221] SNMP: v2: <host>, dsname: traffic_in, oid: .1.3.6.1.2.1.31.1.1.1.6.2, output: 1545827146683
04/12/2009 05:10:02 PM - CMDPHP: Poller[0] Host[16] DS[221] SNMP: v2: <host>, dsname: traffic_out, oid: .1.3.6.1.2.1.31.1.1.1.10.2, output: 1156601641255
It also seems to be able to update the RRD properly with these values:

Code:

04/12/2009 05:10:03 PM - POLLER: Poller[0] CACTI2RRD: /usr/bin/rrdtool update /var/lib/cacti/rra/<host>_traffic_in_221.rrd --template traffic_out:traffic_in 1239549002:1156601641255:1545827146683
And I still get these non-drawing graphs. :(

Everything above 8MB/s just disappears.

Regards,

Mikael
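
A note on reading the rrdtool update line in the post above: with --template, the colon-separated values after the timestamp are matched to the data source names in the order the template lists them, so here the first value belongs to traffic_out and the second to traffic_in. A minimal sketch of that pairing (the helper name parse_update is hypothetical, and it only handles plain integer values):

```python
from typing import Dict, Tuple


def parse_update(template: str, update: str) -> Tuple[int, Dict[str, int]]:
    """Pair each value in an 'rrdtool update' argument with its DS name.

    template: the --template string, e.g. "traffic_out:traffic_in"
    update:   the "timestamp:value[:value...]" argument
    """
    names = template.split(":")
    timestamp, *values = update.split(":")
    if len(names) != len(values):
        raise ValueError("template and update carry a different number of fields")
    # Values follow the template order, not the order inside the RRD file.
    return int(timestamp), dict(zip(names, (int(v) for v in values)))


if __name__ == "__main__":
    ts, values = parse_update(
        "traffic_out:traffic_in",
        "1239549002:1156601641255:1545827146683",
    )
    print(ts, values)
```

The parsed values line up with the poller log: traffic_in is 1545827146683 and traffic_out is 1156601641255, so the raw counters are reaching the RRD file as expected.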
mhakali
Posts: 13
Joined: Thu Oct 26, 2006 3:20 am

Post by mhakali »

Updated graph:

http://img230.imageshack.us/img230/2471 ... mage55.png

Controlled experiment: I created an all-new graph, verified the HC counter inputs, and steadily increased traffic by a few MB/s each time. Once traffic goes above 8MB/s you can see the green area stop being drawn entirely.

Mikael
mhakali
Posts: 13
Joined: Thu Oct 26, 2006 3:20 am

Post by mhakali »

The host on "the other side" (the receiver) manages to produce a correct graph:

http://img410.imageshack.us/img410/5841 ... mage88.png (Note: MByte/s, not bits/s).
BSOD2600
Cacti Moderator
Posts: 12171
Joined: Sat May 08, 2004 12:44 pm
Location: USA

Post by BSOD2600 »

All looks good.

Try leaving the Cacti logging level at Medium so you can monitor the values the device returns. With the HC counters there shouldn't be any reason for it to stop returning data completely. The last step would be to verify that the DS fields in the associated RRD files are correctly sized (especially the MAX).
mhakali
Posts: 13
Joined: Thu Oct 26, 2006 3:20 am

Post by mhakali »

BSOD2600 wrote:All looks good.

Try leaving the Cacti logging level at Medium so you can monitor the values the device returns.
Done! Let's let it run for a while and I'll report back with the results.
BSOD2600 wrote: With the HC counters there shouldn't be any reason for it to stop returning data completely. The last step would be to verify that the DS fields in the associated RRD files are correctly sized (especially the MAX).
Let's examine the RRD file.

Code:

# rrdtool info <host>_traffic_in_221.rrd
filename = "<host>_traffic_in_221.rrd"
rrd_version = "0003"
step = 300
last_update = 1239611703
ds[traffic_in].type = "COUNTER"
ds[traffic_in].minimal_heartbeat = 600
ds[traffic_in].min = 0.0000000000e+00
ds[traffic_in].max = 1.0000000000e+07
ds[traffic_in].last_ds = "1624274888643"
ds[traffic_in].value = 3.1011103000e+05
ds[traffic_in].unknown_sec = 0
The same goes for traffic_out. I'm not sure whether the AVERAGE definitions etc. are relevant.

The max, however, is interesting:

1 * 10^7 == 10000000

Checking the RRD file of a graph that works:

Code:

# rrdtool info <another_host>_traffic_in_187.rrd
filename = "<another_host>_traffic_in_187.rrd"
rrd_version = "0003"
step = 300
last_update = 1239612003
ds[traffic_in].type = "COUNTER"
ds[traffic_in].minimal_heartbeat = 600
ds[traffic_in].min = 0.0000000000e+00
ds[traffic_in].max = 1.0000000000e+09
ds[traffic_in].last_ds = "2327382019"
ds[traffic_in].value = 2.0968692000e+05
ds[traffic_in].unknown_sec = 0
Interesting!

Now the question is why it was created like this, and how I can update the RRD file definition.

Mikael
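
Why the max matters: for a COUNTER data source RRDtool stores a per-second rate, and any rate outside the data source's min/max range is stored as unknown, which shows up as a gap in the graph. With octet counters the rate is in bytes per second, so a max of 10000000 is 80 Mbit/s, or about 9.5 MiB/s, which matches where the line disappears. The following is a simplified sketch of that range check, not RRDtool's actual code: the function names and sample rates are made up for illustration, and counter wrap handling is left out.

```python
import math


def counter_rate(prev_value, prev_time, cur_value, cur_time):
    """Per-second rate between two counter samples (no wrap handling)."""
    return (cur_value - prev_value) / (cur_time - prev_time)


def apply_ds_limits(rate, ds_min, ds_max):
    """Return the rate if it lies within [ds_min, ds_max], else NaN (unknown)."""
    if rate < ds_min or rate > ds_max:
        return math.nan
    return rate


if __name__ == "__main__":
    # Hypothetical 5-minute samples: 3.6e9 bytes in 300 s is 12,000,000 bytes/s.
    rate = counter_rate(0, 0, 3_600_000_000, 300)
    for ds_max in (1.0e7, 1.0e9):  # the two max values seen in the thread
        stored = apply_ds_limits(rate, 0.0, ds_max)
        state = "unknown (gap)" if math.isnan(stored) else f"{stored:.0f} bytes/s"
        print(f"max={ds_max:.0e}: {state}")
```

With max at 1e7 the 12 MB/s sample is discarded as unknown, while with 1e9 (the working host's setting) it is kept.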
mhakali
Posts: 13
Joined: Thu Oct 26, 2006 3:20 am

Post by mhakali »

And after a bit of tinkering:

http://img165.imageshack.us/img165/3186/graphimage1.png

Thanks for all the input!

Code:

rrdtool tune <host>_traffic_in_221.rrd -a traffic_in:1000000000000
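
In the command above, -a is the short form of rrdtool tune's --maximum option. It only raises the limit for traffic_in; earlier in the thread the same low max was reported for traffic_out, which lives in the same RRD file. A small sketch that prints one tune command per data source, assuming both should get the same new maximum (the helper name tune_commands is hypothetical, and nothing is executed):

```python
import shlex


def tune_commands(rrd_file, ds_names, new_max):
    """Build one 'rrdtool tune --maximum' argument list per data source."""
    return [
        ["rrdtool", "tune", rrd_file, "--maximum", f"{ds}:{new_max}"]
        for ds in ds_names
    ]


if __name__ == "__main__":
    commands = tune_commands(
        "<host>_traffic_in_221.rrd",      # placeholder file name from the thread
        ["traffic_in", "traffic_out"],    # both data sources in that file
        1000000000000,                    # the maximum used in the thread
    )
    for cmd in commands:
        # Print only; run the commands manually after reviewing them.
        print(shlex.join(cmd))
```

Running rrdtool info on the file afterwards should show the new ds[...].max values.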