Problem reporting high bitrates on cisco switch interfaces

Shaggy1
Posts: 1
Joined: Thu Jan 10, 2013 4:36 am

Post by Shaggy1 »

Hi

I am running on rhel 6:
uname -a
Linux lb-cam-bca-cpecontrol1 2.6.32-358.23.2.el6.x86_64 #1 SMP Sat Sep 14 05:32:37 EDT 2013 x86_64 x86_64 x86_64 GNU/Linux

using cacti version 0.8.8b

I have installed cacti and created graphs for monitoring the bandwidth on the interfaces using a template:
In/Out Bits with Total Bandwidth (64-bit Counters)
which I got from the web (unfortunately I do not remember where I got it from)

The graphs are coming up and I am seeing data being graphed
I am now attempting to check that the graphed data I see is correct by sending traffic down one of the interfaces and looking at the graphical output (using jperf to do this).

The input traffic reported seems to be consistent with what I am sending (matches what is reported by jperf and on the switch) until I reach a high bandwidth (> 100M) at which point I get lower bitrates reported. If I run it fast enough for long enough the graph stops reporting any input data and starts reporting some output data.

This looks like a problem with 64 bit counters, but I think the template is for 64 bit counters (and the data query data sources in cacti are indeed ifHCInOctets and ifHCOutOctets)
Does anyone have any idea what else could cause this problem ?

I have just started using cacti and rrd, so am struggling a bit to know how to start debugging this issue.
Does anyone have any hints about where I can start looking? Is it possible to verify that the correct (64-bit) counters are indeed being collected?
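As a rough sanity check on the counter-width theory, the sketch below estimates how long an octet counter takes to wrap at a constant line rate, and turns two successive counter readings into a bit rate. The function names and sample numbers are invented for illustration and are not part of cacti or spine.

```python
def seconds_to_wrap(rate_bps, counter_bits=32):
    """Seconds for an octet counter of the given width to wrap at a constant bit rate."""
    octets_per_second = rate_bps / 8
    return 2 ** counter_bits / octets_per_second


def rate_from_counters(prev_octets, cur_octets, interval_s, counter_bits=64):
    """Average bit rate between two readings, allowing for at most one wrap."""
    delta = cur_octets - prev_octets
    if delta < 0:  # the counter wrapped once between the two polls
        delta += 2 ** counter_bits
    return delta * 8 / interval_s


if __name__ == "__main__":
    # A 32-bit octet counter at 100 Mbit/s wraps after roughly 344 seconds.
    print(round(seconds_to_wrap(100e6), 1))
    # Two hypothetical 64-bit readings taken 300 seconds apart.
    print(rate_from_counters(0, 3_750_000_000, 300))
```

If the readings come from ifHCInOctets, a wrap within one polling interval is not realistic at these speeds, so a negative delta would point at a device reset or at the 32-bit ifInOctets being polled instead.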

I'd like to be able to look at what is in the rrd database, but am not sure where I can find this. Does someone know how I can tell which rrd database relates to the graph? I notice there are rrd database files in /usr/share/cacti/rra which end in a number - does this relate to the ID listed for the graph? I had a look in the one with the id for my interface graph, but am not convinced it is the right one (the data does not seem to match, though I still have to clarify what the data in the database should actually be).

As a side issue, does anyone know what determines the name of the rrd database files? Those for one of my devices are different to the others and I can't find the place where I typed in the string incorrectly.

While looking into this I also noticed that when I update an interface description on a cisco switch it is not reflected in cacti. Does anyone know how I can reload the description data ?

-------------------------------

Update 12th Jan 2015:
> Does someone know how I can tell which rrd database relates to the graph
Apologies, I made a mistake with the ID while looking at this. The rrd database is in /usr/share/cacti/rra, with a filename that ends in the id displayed in the cacti window.

I have managed to do a little more digging. I can manually run spine to check that the counters it is getting via snmp match those I see on the switch, which they do. This also matches up with the traffic_in counters I see reported in cacti.log and also with those manually extracted using snmp commands, so it seems as if the counters are being retrieved correctly.

Having identified the rrd database, however, I can see that the data in the rrd database is incorrect and eventually ends up as nans. The erroneous data in the database does seem to be reflected in the cacti graph, so it looks like the graphing from the database is still happening correctly.

This is pointing to something happening when the database is being updated.
Does anyone know where I can look at the update commands used to update the rrd database ?

I have also written a python script that uses rrdtool to collect the ifHCInOctets and ifHCOutOctets counters and update an rrd database. If I run this for just a short while with --step=5 and then fetch the data in the database, I also see similar erroneous values, so it looks like there is something about the values rrdtool does not like. Could I be missing a maximum value setting somewhere for this?
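One way to check whether a maximum is set on a data source is to read the header that `rrdtool info` prints, which lists per-data-source fields in the form `ds[name].max = value`. The sketch below parses that output into a dictionary; the helper names are hypothetical, and the line format is assumed to match what rrdtool normally prints.

```python
import re
import subprocess

# Matches header lines such as: ds[traffic_in].max = 1.0000000000e+08
DS_LINE = re.compile(r"^ds\[(?P<name>[^\]]+)\]\.(?P<key>\w+)\s*=\s*(?P<value>.+)$")


def parse_ds_header(info_text):
    """Collect the ds[...] fields from `rrdtool info` output, keyed by data source name."""
    sources = {}
    for line in info_text.splitlines():
        match = DS_LINE.match(line.strip())
        if match:
            fields = sources.setdefault(match["name"], {})
            fields[match["key"]] = match["value"].strip('"')
    return sources


def read_ds_header(rrd_path):
    """Run `rrdtool info` on one file and return its parsed data source fields."""
    result = subprocess.run(
        ["rrdtool", "info", rrd_path],
        check=True, capture_output=True, text=True,
    )
    return parse_ds_header(result.stdout)
```

Calling `read_ds_header()` on one of the files under /usr/share/cacti/rra and looking at the `type`, `min` and `max` entries for each data source shows the limits the database was created with.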

-------------------------------

Update 13th Jan 2015:
Ok I have been a little stupid here. It seems the In/Out Bits with Total Bandwidth (64-bit Counters) template sets quite a low max value on the COUNTER, so whenever there are high traffic rates there is a rule violation and the value is set to nan. I have this working at least at 100M using my script.
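A simplified sketch of that behaviour, assuming rrdtool turns two successive COUNTER readings into a per-second rate and stores unknown (nan) when the rate falls outside the data source's min/max. The limits used here are illustrative, not the template's actual values, and counter wrap handling is left out.

```python
import math


def counter_rate(prev, cur, interval_s, ds_min=0.0, ds_max=None):
    """Per-second rate between two counter readings, or nan if it violates min/max."""
    rate = (cur - prev) / interval_s
    if rate < ds_min or (ds_max is not None and rate > ds_max):
        return math.nan  # out-of-range rates are stored as unknown
    return rate


if __name__ == "__main__":
    # 100 Mbit/s is 12,500,000 octets per second; sample over a 5 second step.
    octets = 12_500_000 * 5
    print(counter_rate(0, octets, 5, ds_max=10_000_000))   # above the limit
    print(counter_rate(0, octets, 5, ds_max=125_000_000))  # within the limit
```

Because the counters are in octets, the stored rate is in bytes per second, so a maximum meant to allow 1 Gbit/s would need to be at least 125,000,000.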

It looks like I can change the max value from cacti in Data Templates. However the databases are already created. Does anyone know if there is a way to update/re-create the databases using the new rule without having to delete everything and start from scratch ?
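One option that avoids re-creating the files is `rrdtool tune`, whose `--maximum ds-name:value` option changes the limit on an existing database. The sketch below only builds and prints the commands as a dry run. The data source names `traffic_in` and `traffic_out` and the new limit are assumptions, so check them against the actual files first.

```python
import glob


def tune_commands(rrd_paths, ds_names, new_max):
    """Build one `rrdtool tune` command per file, raising the maximum of each data source."""
    commands = []
    for path in rrd_paths:
        cmd = ["rrdtool", "tune", path]
        for ds in ds_names:
            cmd += ["--maximum", f"{ds}:{new_max}"]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Directory taken from the post; DS names and limit are assumed values.
    paths = sorted(glob.glob("/usr/share/cacti/rra/*.rrd"))
    for cmd in tune_commands(paths, ["traffic_in", "traffic_out"], 125_000_000):
        print(" ".join(cmd))  # dry run: print the command instead of running it
```

Tuning only changes the limit for future updates; samples already stored as nan stay that way. The Data Template in cacti still needs the new maximum as well, so that graphs created later get the same limit.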

-------------------------------
Update 22nd Jan 2015:

I bit the bullet, deleted all the graphs + databases and re-created them.
All seems to be working now.

I'll ask on the general forum for how to get the description updated