Disk Graphs show totals as "nan" now

trent1980
Posts: 21
Joined: Fri Apr 10, 2009 3:08 pm

Post by trent1980 »

I just noticed that all of my disk totals (total current, total average, and total maximum) now show "nan". The used values are still reported accurately.

I haven't done any updates, so I can't think of any reason why this started.

I also have the realtime plugin installed, and it gets the totals accurately, but nothing shows up on my old graphs.

I hadn't been paying attention to the totals since we use Nagios for monitoring, but this is bothersome, and I would like to know if there is a way to figure out why the totals are not being graphed.

Thanks in advance

Post by trent1980 »

snmpwalk output:

HOST-RESOURCES-MIB::hrStorageDescr.1 = STRING: A:\
HOST-RESOURCES-MIB::hrStorageDescr.2 = STRING: C:\ Label: Serial Number 9a29de6c
HOST-RESOURCES-MIB::hrStorageDescr.3 = STRING: E:\ Label:data Serial Number ba8ef2c9
HOST-RESOURCES-MIB::hrStorageDescr.4 = STRING: Virtual Memory
HOST-RESOURCES-MIB::hrStorageDescr.5 = STRING: Physical Memory
HOST-RESOURCES-MIB::hrStorageAllocationUnits.1 = INTEGER: 0 Bytes
HOST-RESOURCES-MIB::hrStorageAllocationUnits.2 = INTEGER: 4096 Bytes
HOST-RESOURCES-MIB::hrStorageAllocationUnits.3 = INTEGER: 4096 Bytes
HOST-RESOURCES-MIB::hrStorageAllocationUnits.4 = INTEGER: 65536 Bytes
HOST-RESOURCES-MIB::hrStorageAllocationUnits.5 = INTEGER: 65536 Bytes
HOST-RESOURCES-MIB::hrStorageSize.1 = INTEGER: 0
HOST-RESOURCES-MIB::hrStorageSize.2 = INTEGER: 5243206
HOST-RESOURCES-MIB::hrStorageSize.3 = INTEGER: 10486420
HOST-RESOURCES-MIB::hrStorageSize.4 = INTEGER: 128449
HOST-RESOURCES-MIB::hrStorageSize.5 = INTEGER: 65531
HOST-RESOURCES-MIB::hrStorageUsed.1 = INTEGER: 0
HOST-RESOURCES-MIB::hrStorageUsed.2 = INTEGER: 3538536
HOST-RESOURCES-MIB::hrStorageUsed.3 = INTEGER: 4446042
HOST-RESOURCES-MIB::hrStorageUsed.4 = INTEGER: 43307
HOST-RESOURCES-MIB::hrStorageUsed.5 = INTEGER: 35974
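
For readers following along: HOST-RESOURCES-MIB reports hrStorageSize and hrStorageUsed in allocation units, not bytes, so each has to be multiplied by hrStorageAllocationUnits to get a byte count. The following is a minimal sketch of that arithmetic using the values from the walk above; the variable and function names are made up for illustration and are not part of Cacti.

```python
# Values copied from the snmpwalk output above (index 2 is C:\, index 3 is E:\).
alloc_units = {2: 4096, 3: 4096}          # hrStorageAllocationUnits, bytes per unit
size_units = {2: 5243206, 3: 10486420}    # hrStorageSize, in allocation units
used_units = {2: 3538536, 3: 4446042}     # hrStorageUsed, in allocation units


def to_bytes(units, unit_size):
    """Convert a count of allocation units to bytes."""
    return units * unit_size


totals = {i: to_bytes(size_units[i], alloc_units[i]) for i in size_units}
used = {i: to_bytes(used_units[i], alloc_units[i]) for i in used_units}

for i in sorted(totals):
    print(f"index {i}: total={totals[i]} bytes, used={used[i]} bytes")
```

Non-zero totals here suggest the SNMP agent itself is returning usable size data, so the "nan" is more likely introduced somewhere between the poller and the RRD file.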
BSOD2600
Cacti Moderator
Posts: 12171
Joined: Sat May 08, 2004 12:44 pm
Location: USA

Post by BSOD2600 »

The Windows SNMP agent is notoriously slow for disks -- try increasing the SNMP timeout for that device past 3000 ms.

Post by trent1980 »

BSOD2600 wrote: The Windows SNMP agent is notoriously slow for disks -- try increasing the SNMP timeout for that device past 3000 ms.

I did that ... It's weird because when I look back at my graphs, they all stopped showing the totals, but at different times. Some stopped graphing the totals last week and some stopped in March.

Trying to connect the dots but I can't find anything or any reason why they would all just randomly stop graphing.

Another thing that is weird is that I deleted all the graphs (by deleting the whole box) for a test box and then added them all again. It's returning "0" for usage on that disk and nan on totals.

Post by trent1980 »

rrdtool info ssimsapp00_hdd_used_19.rrd


filename = "ssimsapp00_hdd_used_19.rrd"
rrd_version = "0003"
step = 60
last_update = 1272394503
ds[hdd_used].type = "GAUGE"
ds[hdd_used].minimal_heartbeat = 600
ds[hdd_used].min = 0.0000000000e+00
ds[hdd_used].max = NaN
ds[hdd_used].last_ds = "2839019520"
ds[hdd_used].value = 8.5170585600e+09
ds[hdd_used].unknown_sec = 0
ds[hdd_total].type = "GAUGE"
ds[hdd_total].minimal_heartbeat = 120
ds[hdd_total].min = 0.0000000000e+00
ds[hdd_total].max = NaN
ds[hdd_total].last_ds = "8418033664"
ds[hdd_total].value = NaN
ds[hdd_total].unknown_sec = 3


Should I increase the heartbeat to 600 on the total?
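
Some background on that question: RRDtool stores a data source as unknown (NaN) whenever the gap between two updates is longer than that source's minimal_heartbeat. The sketch below parses the relevant lines of the `rrdtool info` output above and lists the data sources whose heartbeat is shorter than a given update gap. The `update_interval=300` value is a hypothetical figure chosen for illustration; the thread does not state how far apart the updates really are, and the function names are invented for this example.

```python
import re

# Trimmed copy of the `rrdtool info` output above.
INFO = """\
step = 60
ds[hdd_used].minimal_heartbeat = 600
ds[hdd_total].minimal_heartbeat = 120
"""


def parse_info(text):
    """Return (step, {ds_name: minimal_heartbeat}) from rrdtool info text."""
    step = None
    heartbeats = {}
    for line in text.splitlines():
        m = re.match(r"step = (\d+)", line)
        if m:
            step = int(m.group(1))
            continue
        m = re.match(r"ds\[(\w+)\]\.minimal_heartbeat = (\d+)", line)
        if m:
            heartbeats[m.group(1)] = int(m.group(2))
    return step, heartbeats


def stale_sources(heartbeats, update_interval):
    """Data sources whose heartbeat is shorter than the gap between updates."""
    return sorted(ds for ds, hb in heartbeats.items() if hb < update_interval)


step, heartbeats = parse_info(INFO)
# ASSUMPTION: updates arrive 300 s apart; substitute the real gap for your poller.
print(stale_sources(heartbeats, update_interval=300))
```

If the real gap between updates is anywhere between 120 and 600 seconds, only hdd_total would go unknown, which would match the symptom of correct "used" values and "nan" totals.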

Post by trent1980 »

I erased the .rrd files associated with that box and re-added it as if it were a brand new box.

The "used" are correct but all the "total" are still showing "nan" even for a brand new one ...

snmpwalk and realtime monitor all return the right values ... frustrating

Post by BSOD2600 »

What's the device timeout set to? Make it higher.
Using spine or cmd? Try going back to cmd.php.

Post by trent1980 »

BSOD2600 wrote: What's the device timeout set to? Make it higher.
Using spine or cmd? Try going back to cmd.php.

04/27/2010 06:56:56 PM - SYSTEM STATS: Time:4.2404 Method:cmd.php Processes:1 Threads:N/A Hosts:13 HostsPerProcess:13 DataSources:142 RRDsProcessed:84

SNMP Timeout = 5000

Is there a different "Device Timeout" I can set, or is that the one you are talking about? I set it to 5000 five hours ago and it's still the same thing ... nan on all "totals" for disks.
TheWitness
Developer
Posts: 17047
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

Please test the scripts using the script_server test procedure documented on the docs site to check what is stored in the poller cache. You may have either a bad script or a bad XML file in the resource directory. However, that issue only appears after a miscued upgrade.

TheWitness
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of dozens of Cacti plugins and customizations. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages


For those wondering, I'm still here, but lost in the shadows. Yearning for fewer bugs. Who wants a Cacti 1.3/2.0? Streams anyone?

Post by trent1980 »

TheWitness wrote: Please test the scripts using the script_server test procedure documented on the docs site to check what is stored in the poller cache. You may have either a bad script or a bad XML file in the resource directory. However, that issue only appears after a miscued upgrade.

Thanks ... here are my results:

# php script_server.php

PHP Script Server has Started - Parent is cmd
/var/www/html/cacti/scripts/ss_host_disk.php ss_host_disk 192.168.167.229 22 1:161:500:1:10:public:::MD5::DES: get used 2
14544162816
/var/www/html/cacti/scripts/ss_host_disk.php ss_host_disk 192.168.167.229 22 1:161:500:1:10:public:::MD5::DES: get total 2
21476171776
04/28/2010 07:00:48 AM - PHPSVR: Poller[0] Maximum runtime of 52 seconds exceeded for the Script Server. Exiting.


(Regarding upgrades: I haven't upgraded the system in over 6 months, and the problem started occurring on the first server around 3 months ago ... now they all return nan, and each started reporting nan at a different time/date.)

Post by trent1980 »

It seems that after I ran the script manually, the total (20 GB) now shows up in the totals on the graph, but the total itself (in blue) still isn't being drawn.

I've attached a picture to show what I mean.

Attachment: 201004180844-ssimsapp00.PNG (19.21 KiB), captioned "not showing nan but not graphing total"

Post by BSOD2600 »

Post the graph debug output.

Have you been messing with the rrdtool db sizes? The minimal_heartbeat is different for each DS...

Post by trent1980 »

BSOD2600 wrote: Post the graph debug output.
Have you been messing with the rrdtool db sizes? The minimal_heartbeat is different for each DS...

I didn't knowingly change any db sizes. The only tweaking I have done is setting up 1-minute polling RRAs, following a post on this forum. Everything seemed to work fine for a year. Could that have filled up some database whose size I can increase?


debug on graph:

RRDTool Command:

/usr/bin/rrdtool graph - \
--imgformat=PNG \
--start=-86400 \
--end=-60 \
--title="ssimsapp00 - Used Space - C: Label: Seri" \
--rigid \
--base=1024 \
--height=120 \
--width=500 \
--alt-autoscale-max \
--lower-limit=0 \
--vertical-label="bytes" \
--slope-mode \
--font TITLE:12: \
--font AXIS:8: \
--font LEGEND:10: \
--font UNIT:8: \
DEF:a="/var/www/html/cacti/rra/ssimsapp00_hdd_used_104.rrd":hdd_total:AVERAGE \
DEF:b="/var/www/html/cacti/rra/ssimsapp00_hdd_used_104.rrd":hdd_used:AVERAGE \
AREA:a#002A97FF:"Total" \
GPRINT:a:LAST:"Current\:%8.2lf %s" \
GPRINT:a:AVERAGE:"Average\:%8.2lf %s" \
GPRINT:a:MAX:"Maximum\:%8.2lf %s\n" \
AREA:b#F51D30FF:"Used" \
GPRINT:b:LAST:" Current\:%8.2lf %s" \
GPRINT:b:AVERAGE:"Average\:%8.2lf %s" \
GPRINT:b:MAX:"Maximum\:%8.2lf %s\n"


RRDTool Says:

OK

Post by TheWitness »

Your max value in the RRD file is likely clipping the Total column. Try using

rrdtool info <rrdfile>

and seeing if that's the case. If it is, use the "tune" command to correct that behavior.
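
As a rough illustration of what such a "tune" call can look like: `rrdtool tune` accepts `--maximum ds-name:max` and `--heartbeat ds-name:heartbeat`. The sketch below only builds and prints the command line without running it. The helper name `tune_command` is made up, and the heartbeat of 600 is an assumed value (it simply mirrors the hdd_used data source); back up the RRD file before running anything like this.

```python
import shlex


def tune_command(rrd_path, ds_name, heartbeat=None, maximum=None):
    """Build an `rrdtool tune` argument list; nothing is executed here."""
    cmd = ["rrdtool", "tune", rrd_path]
    if heartbeat is not None:
        cmd += ["--heartbeat", f"{ds_name}:{heartbeat}"]
    if maximum is not None:
        cmd += ["--maximum", f"{ds_name}:{maximum}"]
    return cmd


# ASSUMPTION: 600 is an illustrative heartbeat, not a value confirmed in this thread.
cmd = tune_command(
    "/var/www/html/cacti/rra/ssimsapp00_hdd_used_104.rrd",
    "hdd_total",
    heartbeat=600,
)
print(" ".join(shlex.quote(part) for part in cmd))
```

Re-running `rrdtool info` on the file afterwards shows whether the change took effect.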

TheWitness

Post by trent1980 »

Sorry, I'm pretty much a rookie here ...

If you could point me somewhere to read up on "clipping", I will ... otherwise here is my output. I can see "max=NaN" but I don't know what to tune it to. (I could make it the disk size, but shouldn't it be pulling that info on its own? I don't want to have to set all the disks manually.)

Thanks for all the help again ...

#rrdtool info ssimsapp00_hdd_used_104.rrd


filename = "ssimsapp00_hdd_used_104.rrd"
rrd_version = "0003"
step = 60
last_update = 1272493208
ds[hdd_used].type = "GAUGE"
ds[hdd_used].minimal_heartbeat = 600
ds[hdd_used].min = 0.0000000000e+00
ds[hdd_used].max = NaN
ds[hdd_used].last_ds = "14544171008"
ds[hdd_used].value = 1.1635336806e+11
ds[hdd_used].unknown_sec = 0
ds[hdd_total].type = "GAUGE"
ds[hdd_total].minimal_heartbeat = 120
ds[hdd_total].min = 0.0000000000e+00
ds[hdd_total].max = NaN
ds[hdd_total].last_ds = "21476171776"
ds[hdd_total].value = NaN
ds[hdd_total].unknown_sec = 8
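
One possible reading of the two dumps in this thread, offered as an interpretation rather than a confirmed diagnosis: for a GAUGE data source, `ds[...].value` appears to hold the last reading multiplied by the number of seconds accumulated so far in the current step. Under that assumption, the arithmetic below recovers how many seconds of known data hdd_used has, for comparison with the unknown_sec of hdd_total. The function name is invented for this sketch.

```python
# (last_ds of hdd_used, ds.value of hdd_used, unknown_sec of hdd_total),
# copied from the two `rrdtool info` dumps in this thread.
dumps = {
    "ssimsapp00_hdd_used_19.rrd": (2839019520, 8.5170585600e+09, 3),
    "ssimsapp00_hdd_used_104.rrd": (14544171008, 1.1635336806e+11, 8),
}


def seconds_accumulated(last_ds, pdp_value):
    """Seconds of known data in the current step, ASSUMING value = reading * seconds."""
    return round(pdp_value / last_ds)


for name, (last_ds, pdp_value, unknown_sec) in dumps.items():
    known = seconds_accumulated(last_ds, pdp_value)
    print(f"{name}: hdd_used known for {known}s, hdd_total unknown for {unknown_sec}s")
```

In both files hdd_used has accumulated exactly as many seconds as hdd_total has been unknown. That is consistent with both data sources receiving the same update and only hdd_total, with its shorter minimal_heartbeat of 120, treating the interval as unknown.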