Same graph on two devices doesn't behave the same

DaveClose
Posts: 27
Joined: Tue Apr 08, 2008 5:36 pm

Post by DaveClose »

To create a custom graph, I first create new data templates, then a new graph template, then I associate the graph template with a device, then I add the graph to that device. Usually it works just fine. Now I have one which behaves strangely and I'm stuck trying to debug it. See the two graphs attached.

The good graph is from the first device for which I followed this procedure. The bad graph is from the second device, and it looks like the graphs for all subsequent devices I've tried. (I don't repeat the first two steps above for additional devices.) None of the debug modes reports any errors, but I'm dumbfounded as to the origin of the data in the bad graph. I don't even know what that information could possibly mean.

SNMP queries to both devices respond correctly from the command line. What I'd like to see is what query Cacti is sending to the device and what response it is getting. Perhaps I'll resort to tcpdump. But if the query and response are correct, what can account for this strange behavior?
Attachments
This is a graph with inexplicable data generated from exactly the same data and graph templates.
bad-temp.png (43.56 KiB) Viewed 2454 times
This is a graph with apparently valid data.
good-temp.png (28.65 KiB) Viewed 2454 times
DaveClose
Posts: 27
Joined: Tue Apr 08, 2008 5:36 pm

Post by DaveClose »

Additional information: tcpdump shows both devices getting exactly the same queries and responses.
buck
Posts: 17
Joined: Tue Sep 04, 2007 10:09 am

Post by buck »

You can check System Utilities -> View Poller Cache to see what query information Cacti is sending.

Also setting the Log level to DEBUG will show you the data it's sending/getting back.
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

And please visit both graphs at Graph Management. Switch to DEBUG and compare.
Reinhard
DaveClose
Posts: 27
Joined: Tue Apr 08, 2008 5:36 pm

Post by DaveClose »

1. I already verified the SNMP query and reply using tcpdump. That was faster and easier than Cacti's debug mode. As I wrote above, the query and response were identical for both graphs.

2. The graph management output is attached. I see no significant difference.

3. Still hoping for an explanation of the "u" and "m" units attached to the bad graph.
Attachments
good-gm.txt
Output of graph management debug mode for the good graph.
(914 Bytes) Downloaded 90 times
bad-gm.txt
Output of graph management debug mode for the bad graph.
(914 Bytes) Downloaded 88 times
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

DaveClose wrote:
> 3. Still hoping for an explanation of the "u" and "m" units attached to the bad graph.

u represents micro and m represents milli as per sticky thread in this forum
Reinhard
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

The graph statements are fine. I looked for CDEFs, but there are none.
The next step is to verify the data source type used by both rrd files. To do so, please run

rrdtool info ...
against the rrd files of both graphs. Only the first 30 lines are required. I suspect the failing one uses COUNTER instead of GAUGE. Change this using "rrdtool tune" and verify the data template used.
Reinhard
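
For anyone comparing two rrd files this way, the check can be scripted. The following is only a sketch: it assumes the `ds[name].type = "..."` line layout that `rrdtool info` prints, and the sample text, the file name and the data source name `temp` are invented for illustration.

```python
import re

# Illustrative excerpt in the style of `rrdtool info` output. The file name,
# data source name, and values are made up for this sketch.
SAMPLE_INFO = '''\
filename = "example.rrd"
step = 300
last_update = 1208000000
ds[temp].type = "DERIVE"
ds[temp].minimal_heartbeat = 600
ds[temp].min = 0.0000000000e+00
ds[temp].max = NaN
'''

# Matches lines such as: ds[temp].type = "GAUGE"
DS_TYPE_LINE = re.compile(r'^ds\[(?P<name>[^\]]+)\]\.type = "(?P<type>\w+)"$')


def data_source_types(info_text):
    """Map each data source name to its type (GAUGE, COUNTER, DERIVE, ...)."""
    types = {}
    for line in info_text.splitlines():
        match = DS_TYPE_LINE.match(line.strip())
        if match:
            types[match.group("name")] = match.group("type")
    return types


if __name__ == "__main__":
    print(data_source_types(SAMPLE_INFO))
```

Running the same function over the output for the good and the bad file makes a type mismatch easy to spot. If one file shows an unexpected type, the usual form of the fix is `rrdtool tune <file> --data-source-type <ds-name>:GAUGE`, with the file and data source name taken from the real output.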
DaveClose
Posts: 27
Joined: Tue Apr 08, 2008 5:36 pm

Post by DaveClose »

I can't get to the files at this moment, but I'll verify later today. However, the data for these graphs is actually produced as a string by an SNMP "exec" extension script. I had originally defined the data templates as GAUGE but that produced graphs with NaN data. I then changed the templates to DERIVE and the graphs started working. After that, I created the additional graph(s) and they produced the strange results.

If the additional graphs had also said NaN, I might have suspected the data source type. But what is causing the "u" and "m" data?
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

Again, "u" is "micro" = 1E-06
"m" is "milli"=1E-03
Reinhard
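
To make the prefixes concrete, here is a rough sketch of how a small number ends up with "m" or "u" in a graph legend. It is not RRDtool's actual scaling code; the function name `si_format` and the two-decimal formatting are assumptions for illustration only.

```python
def si_format(value):
    """Format a number with an SI prefix, roughly as a graph legend might."""
    # Largest factor first, so the first match is the best-fitting prefix.
    prefixes = [(1e6, "M"), (1e3, "k"), (1.0, ""), (1e-3, "m"), (1e-6, "u")]
    magnitude = abs(value)
    for factor, prefix in prefixes:
        if magnitude >= factor:
            return f"{value / factor:.2f} {prefix}".rstrip()
    # Zero, or smaller than one micro-unit: express in micro-units anyway.
    return f"{value / 1e-6:.2f} u" if value else "0.00"


if __name__ == "__main__":
    for sample in (42.0, 0.0033, 0.000005):
        print(sample, "->", si_format(sample))
```

A stored value of 0.0033 is shown as "3.30 m", so the prefix only says the number is small; it says nothing about what the underlying unit is.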
DaveClose
Posts: 27
Joined: Tue Apr 08, 2008 5:36 pm

Post by DaveClose »

Reinhard wrote:
> Again, "u" is "micro" = 1E-06, "m" is "milli"=1E-03

I understand the unit-of-measure prefixes. What I don't understand is the units. If the reported data is "micro-units", then what is "units"? Why does Cacti think the reported data is micro- or milli-?

As the data provided through SNMP is actually a string, NaN would make sense. But milli- or micro- does not make sense (to me).
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

RRDTool only stores numbers. Everything is converted to numbers. If conversion fails, rrdtool will not update.
So it's a converted number.
The fact that the second graph only shows very small numbers (in the range of 1E-03) IMHO shows that the rrd file is using a COUNTER. E.g. small changes in temperature stored as a rate (that is: divided by rrd step = 300) will result in very small values. COUNTERs store differences only, not absolute values as GAUGEs do!
I recommend changing the DS type to GAUGE for the failing rrd. Old data will NOT be changed, but new data should work.
Reinhard
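
The arithmetic behind this can be sketched in a few lines. The temperature readings below are hypothetical, and the sketch ignores COUNTER's wrap handling; it only shows the difference between storing a reading as-is and storing its change per second over a 300-second step.

```python
STEP_SECONDS = 300  # rrd step mentioned above


def per_second_rate(previous, current, interval_seconds):
    """Change between two readings, expressed per second."""
    return (current - previous) / interval_seconds


# Hypothetical temperature readings taken one step apart.
readings = [41, 42, 42, 41]

# GAUGE-style storage keeps each new reading as it is.
gauge_values = readings[1:]

# Rate-style storage (COUNTER/DERIVE) keeps the per-second change instead.
rate_values = [
    per_second_rate(prev, cur, STEP_SECONDS)
    for prev, cur in zip(readings, readings[1:])
]

if __name__ == "__main__":
    print("gauge:", gauge_values)
    print("rate: ", rate_values)
```

A one-degree rise over 300 seconds is stored as 1/300, about 0.0033, which a graph would label in the milli range.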
DaveClose
Posts: 27
Joined: Tue Apr 08, 2008 5:36 pm

Post by DaveClose »

Ok, I find that the RRD files for the good graphs were recorded as GAUGE and those for the bad graph as DERIVE. I've changed the bad one and need to wait a while to see the effect. I've also changed the templates. I presume my change to the templates did not affect the already created RRD files, though the change to the graph seemed to occur at the same time.

But that just makes me wonder what happened originally, when I created these templates as GAUGE and the graph reported NaN, then I changed the templates to DERIVE and the graph started working. I've searched all the RRDtool documentation I can find and I don't find any reference to how RRDtool handles string data from SNMP. If it just deletes the quotation marks and accepts an otherwise valid numeric value, great. But in that case, why did I get NaN initially?