All graphs stopped updating, showing NAN values

mxljosh
Posts: 17
Joined: Fri Aug 09, 2013 4:58 pm

All graphs stopped updating, showing NAN values

Post by mxljosh »

Last week our Cacti graphs stopped updating. We are now getting NaN values on ALL of our graphs (over 100). I've gone through the "Debug NaN's in your graphs" guide linked below and still have had no luck. As a test, I created a simple device, configured it as a ucd/net node, and was able to poll its memory, CPU, etc.

http://forums.cacti.net/about15136.html

1. With the log level set to DEBUG, I don't see anything useful in our logs, and no errors. I should mention this is a custom install that a previous employee installed and configured. We are basically monitoring license servers only. We also write to an NFS mount on our file server, not to the local /var/www/html/cacti/rrd directory. Here is the only thing I see in the logs with DEBUG enabled:

08/18/2015 11:20:40 AM - WEBLOG: Poller[0] CACTI2RRD: /usr/bin/rrdtool graph - --imgformat=PNG --start=-86400 --end=-300 --title="bda_3x_BDA_TOKEN" --base=1000 --height=240 --width=1000 --alt-autoscale-max --lower-limit=0 --vertical-label="" --slope-mode --font TITLE:12: --font AXIS:8: --font LEGEND:10: --font UNIT:8: DEF:a="/tools/cb/license/rrd/bda_3x_BDA_TOKEN.rrd":inuse:MAX DEF:b="/tools/cb/license/rrd/bda_3x_BDA_TOKEN.rrd":capacity:MAX AREA:a#00FF00FF:"inuse" GPRINT:a:LAST:" Current\:%8.0lf" GPRINT:a:MAX:"Maximum\:%8.0lf\n" LINE2:b#FF0000FF:"capacity" GPRINT:b:LAST:"Current\:%8.0lf"
08/18/2015 11:20:40 AM - WEBLOG: Poller[0] CACTI2RRD: /usr/bin/rrdtool graph - --imgformat=PNG --start=-86400 --end=-300 --title="bda_3x_BDA_TOKEN" --base=1000 --height=240 --width=1000 --alt-autoscale-max --lower-limit=0 --vertical-label="" --slope-mode --font TITLE:12: --font AXIS:8: --font LEGEND:10: --font UNIT:8: DEF:a="/tools/cb/license/rrd/bda_3x_BDA_TOKEN.rrd":inuse:MAX DEF:b="/tools/cb/license/rrd/bda_3x_BDA_TOKEN.rrd":capacity:MAX AREA:a#00FF00FF:"inuse" GPRINT:a:LAST:" Current\:%8.0lf" GPRINT:a:MAX:"Maximum\:%8.0lf\n" LINE2:b#FF0000FF:"capacity" GPRINT:b:LAST:"Current\:%8.0lf"


2. We are not using SNMP.

3. Debug mode under Graph Management shows that RRDTool says OK:

RRDTool Command:

/usr/bin/rrdtool graph - \
--imgformat=PNG \
--start=-86400 \
--end=-300 \
--title="bda_3x_BDA_TOKEN" \
--base=1000 \
--height=240 \
--width=1000 \
--alt-autoscale-max \
--lower-limit=0 \
--vertical-label="" \
--slope-mode \
--font TITLE:12: \
--font AXIS:8: \
--font LEGEND:10: \
--font UNIT:8: \
DEF:a="/tools/cb/license/rrd/bda_3x_BDA_TOKEN.rrd":inuse:MAX \
DEF:b="/tools/cb/license/rrd/bda_3x_BDA_TOKEN.rrd":capacity:MAX \
AREA:a#00FF00FF:"inuse" \
GPRINT:a:LAST:" Current\:%8.0lf" \
GPRINT:a:MAX:"Maximum\:%8.0lf\n" \
LINE2:b#FF0000FF:"capacity" \
GPRINT:b:LAST:"Current\:%8.0lf"

RRDTool Says:

OK

4. The poller_output table in MySQL has 0 rows:

mysql> select count(*) from poller_output;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.00 sec)



5. I changed the php.ini memory_limit to 256MB, then 512MB, then 1024MB, with no luck.

6. I ran rrdtool fetch and see NaN values after the earlier data:

rrdtool fetch /tools/cb/license/rrd/bda_3x_BDA_TOKEN.rrd MAX
inuse capacity

1439838300: nan nan
1439838600: nan nan
1439838900: nan nan
1439839200: nan nan
1439839500: nan nan
1439839800: nan nan
1439840100: nan nan
1439840400: nan nan

7. I ran an RRDTOOL info and got the following after changing the maximum from 999 to 9999, then 999999 (we aren't hitting the minimum or maximum):

rrdtool info /tools/cb/license/rrd/bda_3x_BDA_TOKEN.rrd
filename = "/tools/cb/license/rrd/bda_3x_BDA_TOKEN.rrd"
rrd_version = "0003"
step = 300
last_update = 1439230566
ds[inuse].type = "GAUGE"
ds[inuse].minimal_heartbeat = 600
ds[inuse].min = 0.0000000000e+00
ds[inuse].max = 9.9900000000e+02
ds[inuse].last_ds = "24"
ds[inuse].value = 1.5931428960e+03
ds[inuse].unknown_sec = 0
ds[capacity].type = "GAUGE"
ds[capacity].minimal_heartbeat = 600
ds[capacity].min = 0.0000000000e+00
ds[capacity].max = 9.9900000000e+02
ds[capacity].last_ds = "60"
ds[capacity].value = 3.9828572400e+03
ds[capacity].unknown_sec = 0
rra[0].cf = "MAX"
rra[0].rows = 210379
rra[0].pdp_per_row = 1
rra[0].xff = 5.0000000000e-01
rra[0].cdp_prep[0].value = NaN
rra[0].cdp_prep[0].unknown_datapoints = 0
rra[0].cdp_prep[1].value = NaN
rra[0].cdp_prep[1].unknown_datapoints = 0

8. I confirmed I can write to the NFS mount; the directory is now mode 777 (we are on a private network, so I'm not overly concerned about permissions).

9. This is NOT an rpm install and only one crontab exists:

crontab -l
*/5 * * * * cactiuser /usr/bin/php /var/www/html/cacti/poller.php > /dev/null 2>&1
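
To put the `rrdtool info` output from step 7 in context, here is a minimal Python sketch that reads the `last_update` and `minimal_heartbeat` fields and reports how stale the file is. The helper names (`parse_info`, `staleness`) are made up for illustration, the input is only an excerpt of the output quoted above, and the "now" value is simply the first timestamp from the `rrdtool fetch` output in step 6.

```python
from datetime import datetime, timezone

# Excerpt of the `rrdtool info` output quoted in step 7 (not the full output).
INFO_TEXT = """\
step = 300
last_update = 1439230566
ds[inuse].minimal_heartbeat = 600
ds[capacity].minimal_heartbeat = 600
"""


def parse_info(text):
    """Turn 'key = value' lines into a dict of strings (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    return fields


def staleness(fields, now_epoch):
    """Return (last update in UTC, age in seconds, whether age exceeds the heartbeat)."""
    last = int(fields["last_update"])
    heartbeat = max(
        int(value)
        for key, value in fields.items()
        if key.endswith(".minimal_heartbeat")
    )
    age = now_epoch - last
    return datetime.fromtimestamp(last, tz=timezone.utc), age, age > heartbeat


fields = parse_info(INFO_TEXT)
# 1439838300 is the first timestamp in the `rrdtool fetch` output from step 6.
last_dt, age, stale = staleness(fields, 1439838300)
print(last_dt.isoformat(), age, stale)
```

With these numbers the last update falls on 10 August 2015 (UTC), about a week before the fetched rows. A `last_update` that has not moved since then would be consistent with nothing writing to the file at all, rather than with values being rejected by the min/max limits, though the thread does not confirm that.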

System details:
CentOS 5.5
php 5.1.6
mysql 5.0.95
cacti version 0.8.7e
rrdtool 1.2.x


Again, this has been running for years without issues. No changes were made that I am aware of. I am running out of ideas. Any help is greatly appreciated!!
Attachments
Capture2.PNG (27.1 KiB): This is the one week view after it stopped working showing NaN's.
Capture1.PNG (39.83 KiB): This is the 2 week view showing it was working.
BSOD2600
Cacti Moderator
Posts: 12171
Joined: Sat May 08, 2004 12:44 pm
Location: USA

Re: All graphs stopped updating, showing NAN values

Post by BSOD2600 »

That's quite the ancient system ;) Cacti 0.8.8f is the current release, btw.

Change the cacti.log logging level back down to low or medium. I'd stop logging the WEBLOG entries too. Now, is the poller running every 5 minutes? Any errors?
mxljosh
Posts: 17
Joined: Fri Aug 09, 2013 4:58 pm

Re: All graphs stopped updating, showing NAN values

Post by mxljosh »

Yes, I inherited this install, and it's pretty customized as well as important to the company, so I didn't want to touch it unless I had to. I guess now is that time. I changed the logging to medium and disabled the WEBLOG entries. Nothing shows up in the logs unless I "su - cactiuser" and then run "/usr/bin/php /var/www/html/cacti/log/poller.php", but I do see that the poller is being run every 5 minutes via cron:

Aug 18 13:35:01 cacti crond[4386]: (root) CMD (cactiuser /usr/bin/php /var/www/html/cacti/poller.php > /dev/null 2>&1)
Aug 18 13:40:01 cacti crond[4405]: (root) CMD (cactiuser /usr/bin/php /var/www/html/cacti/poller.php > /dev/null 2>&1)
Aug 18 13:45:01 cacti crond[4424]: (root) CMD (cactiuser /usr/bin/php /var/www/html/cacti/poller.php > /dev/null 2>&1)
Aug 18 13:50:01 cacti crond[4443]: (root) CMD (cactiuser /usr/bin/php /var/www/html/cacti/poller.php > /dev/null 2>&1)
Aug 18 13:55:01 cacti crond[4465]: (root) CMD (cactiuser /usr/bin/php /var/www/html/cacti/poller.php > /dev/null 2>&1)
Aug 18 14:00:01 cacti crond[4484]: (root) CMD (cactiuser /usr/bin/php /var/www/html/cacti/poller.php > /dev/null 2>&1)
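
For anyone wanting to check the same thing from a cron log, the following is a rough Python sketch that pulls the timestamp, the account, and the command string out of lines like the ones above and computes the gap between runs. The regular expression and the `parse` helper are illustrative assumptions, and the year 2015 is taken from the thread dates because the log lines do not carry one.

```python
import re
from datetime import datetime

# Three of the cron log lines quoted above.
LOG_LINES = [
    "Aug 18 13:35:01 cacti crond[4386]: (root) CMD (cactiuser /usr/bin/php /var/www/html/cacti/poller.php > /dev/null 2>&1)",
    "Aug 18 13:40:01 cacti crond[4405]: (root) CMD (cactiuser /usr/bin/php /var/www/html/cacti/poller.php > /dev/null 2>&1)",
    "Aug 18 13:45:01 cacti crond[4424]: (root) CMD (cactiuser /usr/bin/php /var/www/html/cacti/poller.php > /dev/null 2>&1)",
]

# Assumed layout: "<Mon DD HH:MM:SS> <host> crond[<pid>]: (<user>) CMD (<command>)"
PATTERN = re.compile(
    r"^(\w{3} +\d+ \d\d:\d\d:\d\d) \S+ crond\[\d+\]: \((\S+)\) CMD \((.*)\)$"
)


def parse(line, year=2015):
    """Return (timestamp, account cron ran as, command string) for one log line."""
    match = PATTERN.match(line)
    if match is None:
        raise ValueError(f"unrecognised cron log line: {line!r}")
    stamp, user, command = match.groups()
    when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return when, user, command


entries = [parse(line) for line in LOG_LINES]
# Seconds between consecutive runs.
gaps = [int((b[0] - a[0]).total_seconds()) for a, b in zip(entries, entries[1:])]
users = {entry[1] for entry in entries}
first_words = {entry[2].split()[0] for entry in entries}
print(gaps, users, first_words)
```

The gaps come out at 300 seconds, which matches the 5-minute schedule. The account in parentheses is `root`, and the command string cron logged begins with the word `cactiuser`. In a per-user crontab (the kind `crontab -l` prints) there is no user column, so the word after the five time fields is treated as part of the command; whether that matters for this install is not established in the thread, but it may be worth ruling out.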

I am really at a loss here. Any ideas/input is greatly appreciated! I really doubt upgrading to the latest version will fix this issue, but if that is what is recommended I'll take a snapshot and give that a try.
mxljosh
Posts: 17
Joined: Fri Aug 09, 2013 4:58 pm

Re: All graphs stopped updating, showing NAN values

Post by mxljosh »

OK, I took a snapshot of the VM and upgraded to the latest version, and I am having the same problem. Here is something odd: when I click on "System Utilities" > "Technical Support", it shows only 18 data sources (I just added 2 devices with 9 data sources each). It's almost as if it's not seeing the other 1700 that exist. I'm really losing my mind on this one!!!
BSOD2600
Cacti Moderator
Posts: 12171
Joined: Sat May 08, 2004 12:44 pm
Location: USA

Re: All graphs stopped updating, showing NAN values

Post by BSOD2600 »

See if there are any DB issues by running: /var/www/html/cacti/cli/repair_database.php
Could it be an SELinux permission issue?

If you manually run /usr/bin/php /var/www/html/cacti/poller.php, you should see the output on the console. Does it run? Any errors?
mxljosh
Posts: 17
Joined: Fri Aug 09, 2013 4:58 pm

Re: All graphs stopped updating, showing NAN values

Post by mxljosh »

Thanks for the reply. SELinux is disabled.

I ran the repair db script and poller manually:

[root@cacti ~]# /usr/bin/php /var/www/html/cacti/cli/repair_database.php
Repairing All Cacti Database Tables
Repairing Table -> 'cdef' Successful
Repairing Table -> 'cdef_items' Successful
Repairing Table -> 'colors' Successful
Repairing Table -> 'data_input' Successful
Repairing Table -> 'data_input_data' Successful
Repairing Table -> 'data_input_fields' Successful
Repairing Table -> 'data_local' Successful
Repairing Table -> 'data_template' Successful
Repairing Table -> 'data_template_data' Successful
Repairing Table -> 'data_template_data_rra' Successful
Repairing Table -> 'data_template_rrd' Successful
Repairing Table -> 'graph_local' Successful
Repairing Table -> 'graph_template_input' Successful
Repairing Table -> 'graph_template_input_defs' Successful
Repairing Table -> 'graph_templates' Successful
Repairing Table -> 'graph_templates_gprint' Successful
Repairing Table -> 'graph_templates_graph' Successful
Repairing Table -> 'graph_templates_item' Successful
Repairing Table -> 'graph_tree' Successful
Repairing Table -> 'graph_tree_items' Successful
Repairing Table -> 'host' Successful
Repairing Table -> 'host_graph' Successful
Repairing Table -> 'host_snmp_cache' Successful
Repairing Table -> 'host_snmp_query' Successful
Repairing Table -> 'host_template' Successful
Repairing Table -> 'host_template_graph' Successful
Repairing Table -> 'host_template_snmp_query' Successful
Repairing Table -> 'poller' Successful
Repairing Table -> 'poller_command' Successful
Repairing Table -> 'poller_item' Successful
Repairing Table -> 'poller_output' Successful
Repairing Table -> 'poller_reindex' Successful
Repairing Table -> 'poller_time' Successful
Repairing Table -> 'rra' Successful
Repairing Table -> 'rra_cf' Successful
Repairing Table -> 'settings' Successful
Repairing Table -> 'settings_graphs' Successful
Repairing Table -> 'settings_tree' Successful
Repairing Table -> 'snmp_query' Successful
Repairing Table -> 'snmp_query_graph' Successful
Repairing Table -> 'snmp_query_graph_rrd' Successful
Repairing Table -> 'snmp_query_graph_rrd_sv' Successful
Repairing Table -> 'snmp_query_graph_sv' Successful
Repairing Table -> 'user_auth' Successful
Repairing Table -> 'user_auth_perms' Successful
Repairing Table -> 'user_auth_realm' Successful
Repairing Table -> 'user_log' Successful
Repairing Table -> 'version' Successful
[root@cacti ~]#

[root@cacti ~]# /usr/bin/php /var/www/html/cacti/poller.php
08/19/2015 09:07:57 AM - POLLER: Poller[0] NOTE: Poller Int: '300', Cron Int: '300', Time Since Last: '176', Max Runtime '298', Poller Runs: '1'
08/19/2015 09:07:57 AM - POLLER: Poller[0] NOTE: Cron is configured to run too often! The Poller Interval is '300' seconds, with a minimum Cron period of '300' seconds, but only 176 seconds have passed since the poller last ran.


Still no updates to the RRDs.
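
As a side note on the poller NOTE above, the arithmetic behind "Time Since Last" can be sketched in a few lines of Python: subtracting that figure from the log timestamp gives the moment the poller believes it last ran. This is only an illustration of the subtraction, not Cacti's own code, and `previous_run` is a hypothetical helper name.

```python
import re
from datetime import datetime, timedelta

# The first NOTE line from the manual poller run quoted above.
NOTE = (
    "08/19/2015 09:07:57 AM - POLLER: Poller[0] NOTE: Poller Int: '300', "
    "Cron Int: '300', Time Since Last: '176', Max Runtime '298', Poller Runs: '1'"
)


def previous_run(note):
    """Return the log timestamp minus the 'Time Since Last' seconds."""
    stamp = note.split(" - ", 1)[0]
    logged_at = datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")
    since = int(re.search(r"Time Since Last: '(\d+)'", note).group(1))
    return logged_at - timedelta(seconds=since)


prev = previous_run(NOTE)
print(prev)
```

The result is 09:05:01, which falls on a 5-minute boundary. That would fit a scheduled run having started the poller at that time, although it does not show what that run did or whether it wrote to any RRD file.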
BSOD2600
Cacti Moderator
Posts: 12171
Joined: Sat May 08, 2004 12:44 pm
Location: USA

Re: All graphs stopped updating, showing NAN values

Post by BSOD2600 »

Moving to the linux forum for more expert troubleshooting...
mxljosh
Posts: 17
Joined: Fri Aug 09, 2013 4:58 pm

Re: All graphs stopped updating, showing NAN values

Post by mxljosh »

Help!! :(
mxljosh
Posts: 17
Joined: Fri Aug 09, 2013 4:58 pm

Re: All graphs stopped updating, showing NAN values

Post by mxljosh »

Anybody?! :cry: