Disk Graphs show totals as "nan" now
I just noticed that all of my disk totals (total current, total average, and total maximum) now show "nan". The used values are still reporting accurately.
I haven't done any updates so I can't think of any reason why this started.
I also have the realtime plugin installed and it accurately gets the totals but nothing shows up on my old graphs
I hadn't been paying attention to totals as we use nagios for monitoring but this is kind of bothersome and would like to know if there is a way to figure out why it's not graphing.
Thanks in advance
snmp walk outputs:
HOST-RESOURCES-MIB::hrStorageDescr.1 = STRING: A:\
HOST-RESOURCES-MIB::hrStorageDescr.2 = STRING: C:\ Label: Serial Number 9a29de6c
HOST-RESOURCES-MIB::hrStorageDescr.3 = STRING: E:\ Label:data Serial Number ba8ef2c9
HOST-RESOURCES-MIB::hrStorageDescr.4 = STRING: Virtual Memory
HOST-RESOURCES-MIB::hrStorageDescr.5 = STRING: Physical Memory
HOST-RESOURCES-MIB::hrStorageAllocationUnits.1 = INTEGER: 0 Bytes
HOST-RESOURCES-MIB::hrStorageAllocationUnits.2 = INTEGER: 4096 Bytes
HOST-RESOURCES-MIB::hrStorageAllocationUnits.3 = INTEGER: 4096 Bytes
HOST-RESOURCES-MIB::hrStorageAllocationUnits.4 = INTEGER: 65536 Bytes
HOST-RESOURCES-MIB::hrStorageAllocationUnits.5 = INTEGER: 65536 Bytes
HOST-RESOURCES-MIB::hrStorageSize.1 = INTEGER: 0
HOST-RESOURCES-MIB::hrStorageSize.2 = INTEGER: 5243206
HOST-RESOURCES-MIB::hrStorageSize.3 = INTEGER: 10486420
HOST-RESOURCES-MIB::hrStorageSize.4 = INTEGER: 128449
HOST-RESOURCES-MIB::hrStorageSize.5 = INTEGER: 65531
HOST-RESOURCES-MIB::hrStorageUsed.1 = INTEGER: 0
HOST-RESOURCES-MIB::hrStorageUsed.2 = INTEGER: 3538536
HOST-RESOURCES-MIB::hrStorageUsed.3 = INTEGER: 4446042
HOST-RESOURCES-MIB::hrStorageUsed.4 = INTEGER: 43307
HOST-RESOURCES-MIB::hrStorageUsed.5 = INTEGER: 35974
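For anyone reading the walk above: hrStorageSize and hrStorageUsed are counts of allocation units, not bytes, so each value has to be multiplied by the matching hrStorageAllocationUnits entry to get a byte figure. Here is a minimal Python sketch of that arithmetic using the numbers from the walk. The dictionary and function names are my own and are not part of Cacti or net-snmp.

```python
# Values copied from the snmpwalk output above (storage index -> value).
alloc_units = {1: 0, 2: 4096, 3: 4096, 4: 65536, 5: 65536}
size_units = {1: 0, 2: 5243206, 3: 10486420, 4: 128449, 5: 65531}
used_units = {1: 0, 2: 3538536, 3: 4446042, 4: 43307, 5: 35974}


def to_bytes(units, index):
    """Convert an hrStorage count of allocation units into bytes."""
    return units[index] * alloc_units[index]


# Byte totals per storage index (hypothetical helper names, illustration only).
totals = {i: to_bytes(size_units, i) for i in alloc_units}
used = {i: to_bytes(used_units, i) for i in alloc_units}

for i in sorted(alloc_units):
    print(i, used[i], totals[i])
```

For index 2 (C:\) this works out to 21476171776 bytes total, so the agent itself is returning a sane total at the time of the walk.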
Windows snmp agent for disks is notoriously slow -- try increasing the snmp timeout for that device past 3000 ms.
| Scripts: Monitor processes | RFC1213 MIB | DOCSIS Stats | Dell PowerEdge | Speedfan | APC UPS | DOCSIS CMTS | 3ware | Motorola Canopy |
| Guides: Windows Install | [HOWTO] Debug Windows NTFS permission problems |
| Tools: Windows All-in-one Installer |
BSOD2600 wrote: Windows snmp agent for disks is notoriously slow -- try increasing the snmp timeout for that device past 3000 ms.
I did that ... It's weird because when I look back at my graphs, they all stopped giving the totals but they all stopped at different times. Some stopped graphing the totals last week and some stopped in March.
Trying to connect the dots but I can't find anything or any reason why they would all just randomly stop graphing.
Another thing that is weird is that I deleted all the graphs (by deleting the whole box) for a test box and then added them all again. It's returning "0" for usage on that disk and nan on totals.
rrdtool info ssimsapp00_hdd_used_19.rrd
filename = "ssimsapp00_hdd_used_19.rrd"
rrd_version = "0003"
step = 60
last_update = 1272394503
ds[hdd_used].type = "GAUGE"
ds[hdd_used].minimal_heartbeat = 600
ds[hdd_used].min = 0.0000000000e+00
ds[hdd_used].max = NaN
ds[hdd_used].last_ds = "2839019520"
ds[hdd_used].value = 8.5170585600e+09
ds[hdd_used].unknown_sec = 0
ds[hdd_total].type = "GAUGE"
ds[hdd_total].minimal_heartbeat = 120
ds[hdd_total].min = 0.0000000000e+00
ds[hdd_total].max = NaN
ds[hdd_total].last_ds = "8418033664"
ds[hdd_total].value = NaN
ds[hdd_total].unknown_sec = 3
Should I increase the heartbeat to 600 on the total?
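To check whether other RRD files have the same heartbeat mismatch between data sources, the `rrdtool info` text can be scanned with a few lines of Python. This is only a sketch: it assumes the text has the same `ds[name].minimal_heartbeat = N` layout as the output above, and the function names are hypothetical. A mismatch is a hint worth checking, not proof of the cause.

```python
import re

# Trimmed copy of the "rrdtool info" output posted above.
INFO = """\
step = 60
ds[hdd_used].minimal_heartbeat = 600
ds[hdd_used].value = 8.5170585600e+09
ds[hdd_used].unknown_sec = 0
ds[hdd_total].minimal_heartbeat = 120
ds[hdd_total].value = NaN
ds[hdd_total].unknown_sec = 3
"""


def heartbeats(info_text):
    """Return {ds_name: minimal_heartbeat} parsed from rrdtool info text."""
    pattern = re.compile(r"^ds\[(\w+)\]\.minimal_heartbeat = (\d+)$", re.M)
    return {name: int(value) for name, value in pattern.findall(info_text)}


def mismatched(info_text):
    """Return the heartbeats if the data sources disagree, else an empty dict."""
    found = heartbeats(info_text)
    return found if len(set(found.values())) > 1 else {}


print(mismatched(INFO))
```

In practice the `INFO` string would be replaced by the captured output of `rrdtool info` for each file under the rra directory.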
What's the device timeout set to? Make it higher.
Using spine or cmd? Try going back to cmd.php.
BSOD2600 wrote: What's the device timeout set to? Make it higher. Using spine or cmd? Try going back to cmd.php.
04/27/2010 06:56:56 PM - SYSTEM STATS: Time:4.2404 Method:cmd.php Processes:1 Threads:N/A Hosts:13 HostsPerProcess:13 DataSources:142 RRDsProcessed:84
SNMP Timeout = 5000
Is there a different "Device Timeout" I can set or is that the one you are talking about? I set it to 5000 5 hours ago and still same thing ... nan on all "totals" for disks.
- TheWitness
- Developer
- Posts: 17007
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
Please test the scripts using the script_server test procedure documented at the docs site to check what is stored in the poller cache. You may have either a bad script or a bad XML file in the resource directory. However, that issue normally only appears after a miscued upgrade.
TheWitness
True understanding begins only when we realize how little we truly understand...
Life is an adventure, let yours begin with Cacti!
Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages
For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?
TheWitness wrote: Please test the scripts using the script_server test procedure documented at the docs site to test what is stored in the poller cache. You may have either a bad script, or bad XML file in the resource directory. However, that issue only appears after a miscued upgrade.
Thanks ... here are my results:
# php script_server.php
PHP Script Server has Started - Parent is cmd
/var/www/html/cacti/scripts/ss_host_disk.php ss_host_disk 192.168.167.229 22 1:161:500:1:10:public:::MD5::DES: get used 2
14544162816
/var/www/html/cacti/scripts/ss_host_disk.php ss_host_disk 192.168.167.229 22 1:161:500:1:10:public:::MD5::DES: get total 2
21476171776
04/28/2010 07:00:48 AM - PHPSVR: Poller[0] Maximum runtime of 52 seconds exceeded for the Script Server. Exiting.
(And in regard to upgrades, I haven't upgraded the system in over 6 months and the problem started occurring on the first server around 3 months ago ... now they all return nan ... each started reporting nan at a different time/date.)
Post the graph debug output.
Have you been messing with the rrdtool db sizes? The minimal_heartbeat is different for each DS...
BSOD2600 wrote: Post the graph debug output. Have you been messing with the rrdtool db sizes? The minimal_heartbeat is different for each DS...
I didn't change any db sizes knowingly. The only tweaking I have done is adding 1-minute polling RRAs. I followed a post on this forum for setting that up. Everything seemed to work fine for a year. Could that have filled up some database that I can increase the size of?
debug on graph:
RRDTool Command:
/usr/bin/rrdtool graph - \
--imgformat=PNG \
--start=-86400 \
--end=-60 \
--title="ssimsapp00 - Used Space - C: Label: Seri" \
--rigid \
--base=1024 \
--height=120 \
--width=500 \
--alt-autoscale-max \
--lower-limit=0 \
--vertical-label="bytes" \
--slope-mode \
--font TITLE:12: \
--font AXIS:8: \
--font LEGEND:10: \
--font UNIT:8: \
DEF:a="/var/www/html/cacti/rra/ssimsapp00_hdd_used_104.rrd":hdd_total:AVERAGE \
DEF:b="/var/www/html/cacti/rra/ssimsapp00_hdd_used_104.rrd":hdd_used:AVERAGE \
AREA:a#002A97FF:"Total" \
GPRINT:a:LAST:"Current\:%8.2lf %s" \
GPRINT:a:AVERAGE:"Average\:%8.2lf %s" \
GPRINT:a:MAX:"Maximum\:%8.2lf %s\n" \
AREA:b#F51D30FF:"Used" \
GPRINT:b:LAST:" Current\:%8.2lf %s" \
GPRINT:b:AVERAGE:"Average\:%8.2lf %s" \
GPRINT:b:MAX:"Maximum\:%8.2lf %s\n"
RRDTool Says:
OK
Your max value in the RRDfile is likely clipping the Total column. Try running
rrdtool info <rrdfile>
and seeing if that's the case. If it is, use the "tune" command to correct that behavior.
TheWitness
Sorry, I'm pretty much a rookie here ...
If you could point me somewhere to read up on "clipping", I will ... otherwise here is my output. I can see "max = NaN" but I don't know what to tune it to. (I could make it the disk size, but shouldn't it be pulling that info on its own? I don't want to have to set all the disks manually.)
Thanks for all the help again ...
#rrdtool info ssimsapp00_hdd_used_104.rrd
filename = "ssimsapp00_hdd_used_104.rrd"
rrd_version = "0003"
step = 60
last_update = 1272493208
ds[hdd_used].type = "GAUGE"
ds[hdd_used].minimal_heartbeat = 600
ds[hdd_used].min = 0.0000000000e+00
ds[hdd_used].max = NaN
ds[hdd_used].last_ds = "14544171008"
ds[hdd_used].value = 1.1635336806e+11
ds[hdd_used].unknown_sec = 0
ds[hdd_total].type = "GAUGE"
ds[hdd_total].minimal_heartbeat = 120
ds[hdd_total].min = 0.0000000000e+00
ds[hdd_total].max = NaN
ds[hdd_total].last_ds = "21476171776"
ds[hdd_total].value = NaN
ds[hdd_total].unknown_sec = 8
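A note on the output above: as far as I understand rrdtool, a max of NaN means no upper limit is set for that data source, so that field on its own should not clip anything. What does stand out is that hdd_total has a minimal_heartbeat of 120 while hdd_used has 600, on a 60-second step. If that mismatch turns out to be the cause (an assumption, nothing in this thread confirms it), matching the two heartbeats would be done with `rrdtool tune` and its `--heartbeat` option. The Python sketch below only builds the command line so it can be reviewed before running. The helper name is hypothetical, and the RRD file should be backed up first.

```python
import shlex


def tune_heartbeat_cmd(rrd_path, ds_name, heartbeat):
    """Build (but do not run) an 'rrdtool tune' command that sets one DS heartbeat."""
    return ["rrdtool", "tune", rrd_path, "--heartbeat", f"{ds_name}:{heartbeat}"]


# File and data source names taken from the rrdtool info output above.
cmd = tune_heartbeat_cmd("ssimsapp00_hdd_used_104.rrd", "hdd_total", 600)
print(shlex.join(cmd))
```

Running the printed command from the rra directory, then checking `rrdtool info` again after a few polling cycles, would show whether hdd_total stops going unknown.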