Same graph on two devices doesn't behave the same
To create a custom graph, I first create new data templates, then a new graph template, then I associate the graph template with a device, then I add the graph to that device. Usually it works just fine. Now I have one which behaves strangely and I'm stuck trying to debug it. See the two graphs attached.
The good graph is from the first device for which I followed this procedure; the bad graph is from the second, but it is similar to the graphs for all subsequent devices I've tried. (I don't repeat the first two steps above for additional devices.) None of the debug modes report any errors, but I'm dumbfounded as to the origin of the data in the bad graph. I don't even know what that information could possibly mean.
SNMP queries to both devices respond correctly from the command line. What I'd like to see is what query Cacti is sending to the device and what response it is getting. Perhaps I'll resort to tcpdump. But if the query and response are correct, what can account for this strange behavior?
- Attachments
- bad-temp.png (43.56 KiB): This is a graph with inexplicable data generated from exactly the same data and graph templates.
- good-temp.png (28.65 KiB): This is a graph with apparently valid data.
1. I already verified the SNMP query and reply using tcpdump. That was faster and easier than Cacti's debug mode. As I wrote above, the query and response were identical for both graphs.
2. The graph management output is attached. I see no significant difference.
3. Still hoping for an explanation of the "u" and "m" units attached to the bad graph.
- Attachments
- good-gm.txt (914 Bytes): Output of graph management debug mode for the good graph.
- bad-gm.txt (914 Bytes): Output of graph management debug mode for the bad graph.
- gandalf
- Developer
- Posts: 22383
- Joined: Thu Dec 02, 2004 2:46 am
- Location: Muenster, Germany
- Contact:
The graph statements are fine. I headed for CDEFs but there are none.
Next issue is to verify data source type usage of both rrd files. To do so, please run
Code:
rrdtool info ...
against the rrd files of both graphs. Only the first 30 lines are required. I suspect the failing one uses COUNTER instead of GAUGE. Change this using "rrdtool tune" and verify the data template used.
Reinhard
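A minimal sketch of the check and the fix suggested above, under stated assumptions: `ds_types` and `gauge_tune_cmd` are hypothetical helper names, and the file path and the data source name `temp` are placeholders that do not come from this thread. The first helper only filters `rrdtool info` output down to the data source type lines; the second prints the `rrdtool tune` command rather than running it, so it can be reviewed first.

```shell
# Filter `rrdtool info` output (read on stdin) down to the data source type lines,
# which look like: ds[name].type = "GAUGE"
ds_types() {
    grep -E '^ds\[[^]]+\]\.type = '
}

# Print (do not run) the tune command that would switch one data source to GAUGE.
# Arguments: RRD file path, data source name. Both are placeholders here.
gauge_tune_cmd() {
    printf 'rrdtool tune %s --data-source-type %s:GAUGE\n' "$1" "$2"
}

# Hypothetical usage; the path and the name "temp" are assumptions:
#   rrdtool info /path/to/bad.rrd | ds_types
#   gauge_tune_cmd /path/to/bad.rrd temp
```

Comparing the `ds_types` output for the good and the bad rrd file should show directly whether the two files were created with different data source types.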
I can't get to the files at this moment, but I'll verify later today. However, the data for these graphs is actually produced as a string by an SNMP "exec" extension script. I had originally defined the data templates as GAUGE but that produced graphs with NaN data. I then changed the templates to DERIVE and the graphs started working. After that, I created the additional graph(s) and they produced the strange results.
If the additional graphs had also said NaN, I might have suspected the data source type. But what is causing the "u" and "m" data?
Reinhard wrote:
> Again, "u" is "micro" = 1E-06, "m" is "milli"=1E-03
I understand the unit-of-measure prefixes; what I don't understand is the units. If the reported data is "micro-units", then what is "units"? Why does Cacti think the reported data is micro- or milli-?
As the data provided through SNMP is actually a string, NaN would make sense. But milli- or micro- does not make sense (to me).
RRDTool only stores numbers. Everything is converted to numbers. If conversion fails, rrdtool will not update.
So it's a converted number.
The fact that the second graph shows only very small numbers (in the range of 1E-03) IMHO shows that the rrd file is using a COUNTER. E.g. small changes in temperature stored as a rate (that is, divided by the rrd step of 300) will result in very small values. COUNTERs store differences only, not absolute values as GAUGEs do!
I recommend changing DStype to GAUGE for the failing rrd. Old data will NOT be changed, but new data should work.
Reinhard
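To make the arithmetic above concrete, here is a rough sketch of the per-second rate that a rate-based data source would end up with. The helper name `rate` and the sample readings (41 and 42) are illustrative assumptions, not values from this thread, and the sketch ignores details such as counter wraps and the exact time between updates.

```shell
# rate PREV CUR STEP
# Approximate per-second rate stored for a rate-based data source:
# the difference between two consecutive readings divided by the step in seconds.
rate() {
    awk -v prev="$1" -v cur="$2" -v step="$3" \
        'BEGIN { printf "%.6f\n", (cur - prev) / step }'
}

rate 41 42 300   # a 1-degree rise over one 300 s step; prints 0.003333
rate 42 42 300   # no change between readings; prints 0.000000
```

A value of 0.003333 is displayed as roughly 3.3 m, which is consistent with the "m" and "u" prefixes on the bad graph. A GAUGE data source would simply store the reading itself (42).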
OK, I find that the RRD files for the good graphs were recorded as GAUGE and those for the bad graph as DERIVE. I've changed the bad one and need to wait a while to see the effect. I've also changed the templates. I presume my change to the templates did not affect the already created RRD files, though the change to the graph seemed to occur at the same time.
But that just makes me wonder what happened originally, when I created these templates as GAUGE and the graph reported NaN, then I changed the templates to DERIVE and the graph started working. I've searched all the RRDtool documentation I can find and I don't find any reference to how RRDtool handles string data from SNMP. If it just deletes the quotation marks and accepts an otherwise valid numeric value, great. But in that case, why did I get NaN initially?