Hi All,
Before people start telling me to go through the "Debug NAN Documentation", I already have.
Here is a quick explanation of what is happening, followed by a list of the troubleshooting steps I have already taken.
I am running Cacti 0.8.7a and spine 0.8.7b.
I have about 190 hosts being polled every 5 min, and it has been working fine for at least 2 months. I add new hosts daily, and yesterday when I added 5 new hosts, 2 of them worked fine, like the other 100+ I have added, but 3 of them just showed NaNs on the graph.
I even tried adding hosts that I had already added (with different names), that I knew worked, and still nothing.
I add hosts using templates. The only stats I graph from hosts are Interface In/Out bits (supplied template), and I also run a "smoke ping" Perl script I wrote, with a graph template I found here on the site.
All of this works great, and it is working on all the other hosts without a problem. Every host I try to add now does not work. I have not changed anything on the server, and as you can see below I have done a lot of troubleshooting to try and sort this out, but have yet to find a solution.
If anyone has any suggestions, please let me know.
Here is what I have done so far:
First off, I went through http://docs.cacti.net/node/283
1. No timeouts or warnings relating to the hosts in question.
2. Data gathering seems to be working fine. I am using the supplied Data Queries for the interface stats, and they work fine on all the other hosts. I have confirmed that the SNMP community is correct and that the OID is available on the host.
3. The poller is running fine; I am running poller.php. I changed the logs to debug and captured a few runs, and everything seems to be OK there. The hosts in question report the same lines as the ones that are working fine. (I will attach this log to the post.)
4. I have checked that the update lines in the debug log work. To do this, I copied the lines from the logs and pasted them into phpMyAdmin, and they ran without any errors.
5. I have checked that the RRD files are being updated. All of the RRDs are being updated, and the ones in question seem to be updating properly too. The timestamp on the RRD files changes, and the info inside the RRD files also changes.
6. The RRD files are all owned by the correct user and are all the same (the ones that work and the ones that don't).
7. The ds.max and ds.min both seem to be fine. I am not 100% sure how to read them, but I have not changed any settings in the Cacti GUI, and 3 of the 5 hosts that I added didn't work, even though they were all added within about 2 min of each other.
8. The graphs seem to be getting the correct values and relate to the correct RRD file. The graphs draw; they just don't display any values.
9. poller_output is cleared each run in this version of Cacti.
I have PHP memory set to 128 MB, so that should not be the problem.
I have looked at the other suggestions at the bottom of that page, and nothing seems to help.
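For anyone comparing a working and a non-working RRD file the way I did in steps 5 and 7, here is a minimal Python sketch that pulls the interesting fields out of saved `rrdtool info` output. It relies on the fact that rrdtool stores a reading as unknown when it falls outside the data source's min/max range. The sample text, the file name and the helper names (`parse_rrd_info`, `describe`) are made up for illustration; they are not taken from the attached dumps.

```python
import re

# Hypothetical excerpt in the layout `rrdtool info` prints; the values are
# invented for illustration and are NOT copied from the attached dumps.
SAMPLE_INFO = """\
filename = "example_traffic_in.rrd"
step = 300
last_update = 1207008300
ds[traffic_in].type = "COUNTER"
ds[traffic_in].minimal_heartbeat = 600
ds[traffic_in].min = 0.0000000000e+00
ds[traffic_in].max = 1.0000000000e+08
ds[traffic_in].last_ds = "UNKN"
ds[traffic_in].value = NaN
"""

# Matches lines such as: ds[traffic_in].max = 1.0000000000e+08
DS_LINE = re.compile(r'^ds\[(?P<name>[^\]]+)\]\.(?P<field>\w+) = (?P<value>.*)$')


def parse_rrd_info(text):
    """Return (last_update, {ds_name: {field: value}}) from `rrdtool info` text."""
    last_update = None
    sources = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("last_update = "):
            last_update = int(line.split(" = ", 1)[1])
            continue
        match = DS_LINE.match(line)
        if match:
            value = match.group("value").strip('"')
            sources.setdefault(match.group("name"), {})[match.group("field")] = value
    return last_update, sources


def describe(text):
    """Summarise each data source as (name, min, max, last_reading_unknown)."""
    last_update, sources = parse_rrd_info(text)
    report = []
    for name, fields in sources.items():
        # Depending on the rrdtool version, an unknown reading may be
        # printed as "U" or "UNKN"; both are treated as unknown here.
        unknown = fields.get("last_ds") in ("U", "UNKN")
        report.append((name, fields.get("min"), fields.get("max"), unknown))
    return last_update, report


if __name__ == "__main__":
    print(describe(SAMPLE_INFO))
```

Running the same function over the working and non-working dumps and comparing the two results side by side should show quickly whether min/max differ or whether the last reading was stored as unknown.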
I have attached some related logs etc to the post.
Does anyone know what the problem could be??
Thanks
-Hurgh-
Getting NAN's for newly created hosts [SOLVED MAYBE]
Attachments
- rrd_info_notworking.txt: rrdtool info dump from an RRD file that is not working (3.11 KiB)
- rrd_info_working.txt: rrdtool info dump from an RRD file that is working (3.09 KiB)
- cacti-1run.zip: Debug log from 1 run of my Cacti (199.04 KiB)
Last edited by hurgh on Tue Apr 01, 2008 12:10 am, edited 1 time in total.
Ok,
I think I might know what the problem is.
It seems that it could be a Daylight Savings issue.
Here in Melbourne, Australia, our DST was extended by 1 week this year, and it seems that part of my server knew about the change but the other part didn't.
If you look at my log files, every line relating to SPINE has the correct time, whereas all other log messages are 1 hour behind (non-DST time).
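A rough way to measure that gap instead of eyeballing it is sketched below in Python. The timestamp layout in `TIME_FORMAT`, the sample lines and the function names are all assumptions made for illustration; adjust the format to whatever your own log actually contains.

```python
from datetime import datetime

# Assumed timestamp layout at the start of each log line; change as needed.
TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"

# Hypothetical log lines (invented, not copied from the attached log).
SAMPLE_LOG = [
    "04/01/2008 10:05:03 AM - SPINE: ...",
    "04/01/2008 09:05:05 AM - POLLER: ...",
]


def parse_line(line, time_format=TIME_FORMAT):
    """Split 'timestamp - message' and return (datetime, message)."""
    stamp, _, rest = line.partition(" - ")
    return datetime.strptime(stamp, time_format), rest


def spine_skew_hours(lines):
    """Whole hours by which SPINE lines run ahead of all other lines.

    Compares the latest SPINE timestamp with the latest non-SPINE
    timestamp, so it is only meaningful for lines from the same poller run.
    Returns None if either group is empty.
    """
    spine, other = [], []
    for line in lines:
        when, rest = parse_line(line)
        (spine if rest.startswith("SPINE") else other).append(when)
    if not spine or not other:
        return None
    delta = max(spine) - max(other)
    return round(delta.total_seconds() / 3600)


if __name__ == "__main__":
    print(spine_skew_hours(SAMPLE_LOG))
```

A result of 0 would mean both parts agree on the time; a non-zero result of a whole hour points at a timezone or DST mismatch rather than ordinary clock drift.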
The reason I think this is that I added some new hosts about 1 hour ago, and they have only just started to work.
The system time is correct and so is spine, but the rest of Cacti seems to be 1 hour behind, and I have no idea why.
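To show what I mean by "1 hour behind", here is a small Python sketch that renders one and the same instant with the DST offset (UTC+11) and with the standard offset (UTC+10). It only illustrates the symptom, on the assumption that the lagging part of the stack is still applying the non-DST offset; it does not show where Cacti gets its time from. The epoch value and the function name are picked for illustration.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for Melbourne: with daylight saving and without it.
AEDT = timezone(timedelta(hours=11), "AEDT")
AEST = timezone(timedelta(hours=10), "AEST")

FMT = "%Y-%m-%d %H:%M:%S"


def render_both(epoch_seconds):
    """Return the same instant formatted with the DST and non-DST offsets."""
    with_dst = datetime.fromtimestamp(epoch_seconds, AEDT).strftime(FMT)
    without_dst = datetime.fromtimestamp(epoch_seconds, AEST).strftime(FMT)
    return with_dst, without_dst


if __name__ == "__main__":
    # 1207008000 is 2008-04-01 00:00:00 UTC, chosen only as an example.
    print(render_both(1207008000))
```

The second string comes out exactly one hour earlier than the first, which is the same pattern as the SPINE lines versus the other lines in my log.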
Any help?
-Hurgh-