Need help with graphs showing -nan
Need help with graphs showing -nan
Hello everyone,
I have used CACTI before, but I am new to configuring it and need help troubleshooting a specific graph issue.
I recently installed CACTI on a new Ubuntu server VM. I then configured graphs for some Windows Servers and Network Switches.
Almost everything is working well.
On the Windows Servers (2019) I am using the default templates that came preinstalled with CACTI.
All the graphs are showing properly except the network traffic graph.
I tried multiple options of this graph (bits/sec, bytes/sec, bits/sec 64 bit...).
All of them show -nan in the graph results.
I am not sure how to proceed with troubleshooting this issue and hope someone can help.
It is possible that this is not at all related to CACTI and the issue is on the Windows Servers.
I am happy to hear your thoughts on this.
I ran the troubleshooting option on the specific data source from the CACTI web console.
Every check showed a green checkmark except the "issue" check, which showed a red X with the messages "Data Source returned Bad Results for traffic_in" and "Data Source returned Bad Results for traffic_out".
I am using CACTI Version 1.2.25
Please let me know if you have any ideas on how to resolve this issue.
Thanks,
James
Re: Need help with graphs showing -nan
Don't use 64-bit counters for Windows devices, only 32-bit.
Let the Cacti grow!
Re: Need help with graphs showing -nan
Hi macan,
Thank you for the reply. Appreciated.
I actually tried multiple graphs: the 64-bit one, the 32-bit one, and several others.
None of these show any results.
Any other thoughts on how this can be resolved?
Thanks,
James
- TheWitness
- Developer
- Posts: 17047
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
Re: Need help with graphs showing -nan
This is most likely either a permission problem or a poller that is simply not running. You can tell whether the poller is running by looking for "SYSTEM STATS:" in the Cacti Log. The other possibility is that your poller does not have permission to write to the RRD files. There is a whole troubleshooting section in the documentation pages for diagnosing such things.
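A minimal sketch of that log check follows. The function name `last_poller_stats` and the log path in the usage note are assumptions, not something stated in this thread; adjust the path to wherever your install writes its log.

```shell
#!/bin/sh
# Print the most recent poller summary line from a Cacti log read on stdin.
# The fixed string "SYSTEM STATS:" does not match "SYSTEM MAINT STATS:" lines.
last_poller_stats() {
  grep -F 'SYSTEM STATS:' | tail -n 1
}

# Demo on two log lines in the format Cacti writes (copied from this thread).
last_poller_stats <<'EOF'
2023-12-22 12:30:25 - SYSTEM MAINT STATS: Time:0.01
2023-12-22 12:30:24 - SYSTEM STATS: Time:22.5780 Method:cmd.php Processes:1 Threads:1 Hosts:27 HostsPerProcess:27 DataSources:3427 RRDsProcessed:1372
EOF
```

On a real system you would feed it the log file, for example `last_poller_stats < /var/www/html/cacti/log/cacti.log` (path assumed). If nothing is printed, or the timestamp is much older than one polling interval, the poller is probably not running.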
True understanding begins only when we realize how little we truly understand...
Life is an adventure, let yours begin with Cacti!
Author of dozens of Cacti plugins and customizations. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages
For those wondering, I'm still here, but lost in the shadows. Yearning for fewer bugs. Who wants a Cacti 1.3/2.0? Streams anyone?
Re: Need help with graphs showing -nan
Hi TheWitness,
Thank you for your reply.
The permissions of the rra folder (I assume this is the folder you are referring to) are 0664 for the www-data user, who is also the owner.
I see that new files are created/updated in that folder constantly.
For that reason, I am not sure if this is a permissions issue.
Are there other folders I should check?
On another note, other graphs of the same Windows Server machine are showing with correct information (CPU, RAM...).
Regarding "SYSTEM STATS", the following lines appear in the log every few minutes.
Both rows appear in green, and I do not see any indications of errors in the log.
2023-12-22 12:30:25 - SYSTEM MAINT STATS: Time:0.01
2023-12-22 12:30:24 - SYSTEM STATS: Time:22.5780 Method:cmd.php Processes:1 Threads:1 Hosts:27 HostsPerProcess:27 DataSources:3427 RRDsProcessed:1372
Is there anything else I can try to resolve this?
Thanks,
James
- TheWitness
- Developer
Re: Need help with graphs showing -nan
Are you using SNMPv2 or SNMPv3? Please advise. Also, for a few of the files, inspect the RRDfile with the command rrdtool info filename. There you will see a few interesting details.
There are a few things to look for:
1. The ".max" row defines the maximum processed rate that will be accepted; above it, RRDtool discards the sample and stores it as unknown (NaN). This can happen if you accidentally set up the interface with SNMPv1, or if the interface does not report a correct ifHighSpeed number. You might have to rrdtool tune the RRDfiles.
2. The ".value" row is the processed rate that was last updated. You always need two successive inserts to get a good value here. This is what appears on the Graph. If it shows zero or NaN, the cause is either the issue above or a heartbeat (next item) that is too small.
3. The ".minimal_heartbeat" row, combined with the xfiles factor, defines how much time may pass between successive polls before RRDtool stores a NaN in the timeslot. It should be at least 2x your poller interval. I have one case where it's about 3x due to periodic poller overruns (pretty much a thing of the past).
4. The ".last_ds" row is the last raw value that came directly from Net-SNMP.
After reading this and spending more time with RRDtool, you should be a bit more of an expert.
Code:
[root@vmhost5 rra]# rrdtool info /var/www/html/cacti/rra/64/16/1423.rrd
filename = "/var/www/html/cacti/rra/64/16/1423.rrd"
rrd_version = "0003"
step = 60
last_update = 1703269561
header_size = 5216
ds[traffic_in].index = 0
ds[traffic_in].type = "COUNTER"
ds[traffic_in].minimal_heartbeat = 600
ds[traffic_in].min = 0.0000000000e+00
ds[traffic_in].max = 1.0000000000e+09
ds[traffic_in].last_ds = "11255972213908"
ds[traffic_in].value = 4.7016949153e+03
ds[traffic_in].unknown_sec = 0
ds[traffic_out].index = 1
ds[traffic_out].type = "COUNTER"
ds[traffic_out].minimal_heartbeat = 600
ds[traffic_out].min = 0.0000000000e+00
ds[traffic_out].max = 1.0000000000e+09
ds[traffic_out].last_ds = "756265488519"
ds[traffic_out].value = 3.0872542373e+03
ds[traffic_out].unknown_sec = 0
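To pull just those four rows out of a long `rrdtool info` dump, something like the following may help. It is only a sketch: both function names are hypothetical, and the demo input is copied from the output above.

```shell
#!/bin/sh
# Keep only the .max, .value, .minimal_heartbeat and .last_ds rows
# from `rrdtool info` output supplied on stdin.
rrd_key_rows() {
  grep -E '^ds\[[^]]+]\.(max|value|minimal_heartbeat|last_ds) = '
}

# Print the name of every data source whose last raw value is "U",
# i.e. the most recent update carried no usable number for it.
rrd_unknown_ds() {
  sed -n 's/^ds\[\([^]]*\)]\.last_ds = "U"$/\1/p'
}

# Demo on rows copied from the healthy file shown above.
rrd_key_rows <<'EOF'
step = 60
ds[traffic_in].type = "COUNTER"
ds[traffic_in].minimal_heartbeat = 600
ds[traffic_in].min = 0.0000000000e+00
ds[traffic_in].max = 1.0000000000e+09
ds[traffic_in].last_ds = "11255972213908"
ds[traffic_in].value = 4.7016949153e+03
EOF
```

Against a real file the usage would be `rrdtool info /path/to/file.rrd | rrd_key_rows`, and `rrdtool info /path/to/file.rrd | rrd_unknown_ds` lists the data sources that are receiving nothing.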
Re: Need help with graphs showing -nan
Hi TheWitness,
Thank you again for the in-depth explanation.
Regarding the version, I am using SNMPv2 for all the devices.
I ran the info command and looked for the values you pointed out (most are below).
The rra[X].cdp_prep value goes from 1 to 15 and all have either -inf or NaN.
ds[traffic_in].max = 1.0000000000e+09
ds[traffic_out].max = 1.0000000000e+09
ds[traffic_in].value = NaN
ds[traffic_out].value = NaN
rra[0].cdp_prep[0].value = NaN
rra[1].cdp_prep[0].value = 0.0000000000e+00
rra[11].cdp_prep[1].value = -inf
rra[15].cdp_prep[1].value = NaN
ds[traffic_in].minimal_heartbeat = 600
ds[traffic_out].minimal_heartbeat = 600
ds[traffic_in].last_ds = "U"
ds[traffic_out].last_ds = "U"
As my knowledge of this is quite limited, I do not fully understand the meaning of these values.
Your point (3) about ".minimal_heartbeat" is interesting.
I would like to increase this number. How would I go about doing that?
Also, does any of the above information help in getting closer to a solution?
Thanks,
James
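For reference, the heartbeat is changed with `rrdtool tune` and its `--heartbeat ds-name:seconds` option. The sketch below only prints the commands rather than running them; the helper name, the placeholder file path and the 1200-second value are illustrative assumptions. Note that with `last_ds` at "U" no samples are reaching the file at all, so a longer heartbeat on its own is unlikely to change the graph.

```shell
#!/bin/sh
# Print (do not execute) one `rrdtool tune` command per data source that
# would set its heartbeat to the given number of seconds.
# Usage: heartbeat_cmds FILE SECONDS DS_NAME...
heartbeat_cmds() {
  file=$1
  seconds=$2
  shift 2
  for ds in "$@"; do
    printf 'rrdtool tune %s --heartbeat %s:%s\n' "$file" "$ds" "$seconds"
  done
}

# Demo: the two data sources named in this thread, hypothetical file path.
heartbeat_cmds /path/to/file.rrd 1200 traffic_in traffic_out
```

Review the printed commands before running them, ideally against a copy of the RRD file first and as the user that owns the rra folder.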
- TheWitness
- Developer
Re: Need help with graphs showing -nan
Nothing is going into that RRDfile. Check the poller cache for that RRDfile and make sure it is present. If it is not, consider editing the Device and clicking the "Repopulate Poller Cache" link on its Edit page.
Re: Need help with graphs showing -nan
Hi TheWitness,
Thank you for the quick reply.
Searching for the file name in the Poller Cache Items yields 2 results.
Both have the same file name with different OIDs.
Here is the relevant partial information for those rows.
SNMP Version: 2, Community: public, OID: .1.3.6.1.2.1.31.1.1.1.6.3
SNMP Version: 2, Community: public, OID: .1.3.6.1.2.1.31.1.1.1.10.3
Is it normal to have 2 rows with the same file?
Should I proceed with the "Repopulate Poller Cache" option?
Thanks,
James
Re: Need help with graphs showing -nan
Yes, there are two gets, and they are done in bulk. Yes, repopulate. You can also check whether the data source is orphaned by listing the data sources. Preferably do that on 1.2.25 or, even better, 1.2.26.
Before history, there was a paradise, now dust.
Re: Need help with graphs showing -nan
Run spine by hand for that device too. See if data is coming back.
Code:
./spine -V 3 -R -S -H device_id
Re: Need help with graphs showing -nan
Hi Osiris,
Thank you for the reply and suggestion.
I tried repopulating and waited to see whether anything changed, but nothing did.
Regarding spine, I am not sure whether I even have it installed.
As I mentioned, my knowledge of these tools is limited.
On the other hand, I now see network graphs with numbers on some of the servers.
But others are still showing NaN.
Could anyone think of anything else I can try?
Can this be caused by something on the Windows side of things?
Thanks,
James
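Whether spine is in use can be read from the poller log rather than guessed: the SYSTEM STATS line posted earlier in this thread reports Method:cmd.php, which indicates the PHP poller rather than spine. A small sketch for extracting that field (the function name is hypothetical):

```shell
#!/bin/sh
# Extract the value of the "Method:" field from SYSTEM STATS log lines on stdin.
poller_method() {
  sed -n 's/.*SYSTEM STATS:.* Method:\([^ ]*\).*/\1/p'
}

# Demo on the line quoted earlier in the thread; prints: cmd.php
echo '2023-12-22 12:30:24 - SYSTEM STATS: Time:22.5780 Method:cmd.php Processes:1 Threads:1 Hosts:27 HostsPerProcess:27 DataSources:3427 RRDsProcessed:1372' | poller_method
```

If the field reads cmd.php, the spine command suggested above does not apply as-is, and running cmd.php by hand is the equivalent check.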
Re: Need help with graphs showing -nan
Try disabling your poller in cron and running it by hand:
php cmd.php --poller=1 --first=1 --last=1000 --debug
In the debug messages, look for the data returned for your problem devices. Is it a number or 'U'?
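The debug output can be long, so a filter that keeps only the failing SNMP items may save some scrolling. This is a sketch under an assumption: it expects lines in the "SNMP: v2: host, dsname: ..., oid: ..., output: ..." layout that cmd.php printed for the poster in this thread, and the function name is hypothetical.

```shell
#!/bin/sh
# Read poller debug output on stdin and print "host dsname oid" for every
# SNMP line whose returned value is U (unknown).
failed_snmp_items() {
  sed -n 's/.*SNMP: v[0-9a-z]*: \([^,]*\), dsname: \([^,]*\), oid: \([^,]*\), output: U$/\1 \2 \3/p'
}

# Demo on one working and one failing line copied from this thread.
failed_snmp_items <<'EOF'
Total[0.7664] Device[4] DS[50] TT[0.17] SNMP: v2: 10.10.10.2, dsname: traffic_in, oid: .1.3.6.1.2.1.2.2.1.10.2, output: 3883168623
Total[0.8198] Device[5] DS[60] TT[0.53] SNMP: v2: 10.10.10.3, dsname: traffic_in, oid: .1.3.6.1.2.1.31.1.1.1.6.3, output: U
EOF
```

Typical usage would be to pipe the command above into it: `php cmd.php --poller=1 --first=1 --last=1000 --debug | failed_snmp_items`.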
Re: Need help with graphs showing -nan
Hi macan,
Thank you for the suggestion.
I ran the command and copied some of the output below.
I noticed that some of the graphs show numbers - these are working.
Others show U - these are the ones showing NAN.
Do you have any suggestions on how to correct the ones that are not working?
Working:
Total[0.3343] Device[2] DS[18] TT[0.29] SNMP: v2: 10.10.10.1, dsname: traffic_in, oid: .1.3.6.1.2.1.2.2.1.10.12, output: 573323646
Total[0.3347] Device[2] DS[18] TT[0.37] SNMP: v2: 10.10.10.1, dsname: traffic_out, oid: .1.3.6.1.2.1.2.2.1.16.12, output: 420581658
Total[0.7664] Device[4] DS[50] TT[0.17] SNMP: v2: 10.10.10.2, dsname: traffic_in, oid: .1.3.6.1.2.1.2.2.1.10.2, output: 3883168623
Total[0.7666] Device[4] DS[50] TT[0.18] SNMP: v2: 10.10.10.2, dsname: traffic_out, oid: .1.3.6.1.2.1.2.2.1.16.2, output: 4113324233
Not working:
Total[0.8198] Device[5] DS[60] TT[0.53] SNMP: v2: 10.10.10.3, dsname: traffic_in, oid: .1.3.6.1.2.1.31.1.1.1.6.3, output: U
Total[0.8202] Device[5] DS[60] TT[0.39] SNMP: v2: 10.10.10.3, dsname: traffic_out, oid: .1.3.6.1.2.1.31.1.1.1.10.3, output: U
Total[0.9156] Device[6] DS[73] TT[0.31] SNMP: v2: 10.10.10.4, dsname: traffic_in, oid: .1.3.6.1.2.1.31.1.1.1.6.3, output: U
Total[0.9159] Device[6] DS[73] TT[0.29] SNMP: v2: 10.10.10.4, dsname: traffic_out, oid: .1.3.6.1.2.1.31.1.1.1.10.3, output: U
Total[1.1141] Device[8] DS[96] TT[0.53] SNMP: v2: 10.10.10.5, dsname: traffic_in, oid: .1.3.6.1.2.1.31.1.1.1.6.2, output: U
Total[1.1146] Device[8] DS[96] TT[0.46] SNMP: v2: 10.10.10.5, dsname: traffic_out, oid: .1.3.6.1.2.1.31.1.1.1.10.2, output: U
Thanks,
James
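One pattern in the output above is worth checking. The working lines poll OIDs under .1.3.6.1.2.1.2.2.1.10 and .1.3.6.1.2.1.2.2.1.16 (IF-MIB ifInOctets and ifOutOctets, 32-bit counters), while every line returning U polls .1.3.6.1.2.1.31.1.1.1.6 and .1.3.6.1.2.1.31.1.1.1.10 (ifHCInOctets and ifHCOutOctets, 64-bit counters). The sketch below labels an OID accordingly; the function name is hypothetical, and the pattern by itself does not prove the cause.

```shell
#!/bin/sh
# Label an interface traffic OID as a 32-bit or 64-bit IF-MIB octet counter.
counter_kind() {
  case $1 in
    .1.3.6.1.2.1.2.2.1.10.*)    echo 'ifInOctets (32-bit)' ;;
    .1.3.6.1.2.1.2.2.1.16.*)    echo 'ifOutOctets (32-bit)' ;;
    .1.3.6.1.2.1.31.1.1.1.6.*)  echo 'ifHCInOctets (64-bit)' ;;
    .1.3.6.1.2.1.31.1.1.1.10.*) echo 'ifHCOutOctets (64-bit)' ;;
    *)                          echo 'unknown' ;;
  esac
}

# Demo: one OID from the working group and one from the failing group.
for oid in .1.3.6.1.2.1.2.2.1.10.12 .1.3.6.1.2.1.31.1.1.1.6.3; do
  printf '%s  %s\n' "$oid" "$(counter_kind "$oid")"
done
```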
Re: Need help with graphs showing -nan
Try snmpwalk for these OIDs:
.1.3.6.1.2.1.31.1.1.1.6.3
.1.3.6.1.2.1.31.1.1.1.10.3
snmpwalk -v 2c -c your_community device_ip .1.3.6.1.2.1.31.1.1.1.6.3
What is returned?
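When reading the reply, the text after the OID is what matters. Below is a rough sketch for sorting the common Net-SNMP replies into cases. The function name and the short labels it prints are assumptions, and the reply used in the demo is illustrative, not output from the poster's servers.

```shell
#!/bin/sh
# Classify one line of snmpwalk/snmpget output.
#   ok               the agent returned a counter value
#   no-such-object   the agent does not implement that object at all
#   no-such-instance the object exists, but not for that interface index
#   timeout          no answer (wrong community, firewall, agent down)
classify_snmp_reply() {
  case $1 in
    *'Counter64: '*|*'Counter32: '*) echo 'ok' ;;
    *'No Such Object'*)              echo 'no-such-object' ;;
    *'No Such Instance'*)            echo 'no-such-instance' ;;
    *Timeout*)                       echo 'timeout' ;;
    *)                               echo 'other' ;;
  esac
}

# Demo with an illustrative reply line.
classify_snmp_reply 'IF-MIB::ifHCInOctets.3 = No Such Object available on this agent at this OID'
```

A "no-such-object" or "no-such-instance" result for the .31.1.1.1 OIDs would point at the device side rather than at Cacti.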