Graphs have stopped generating
Hi All
I've tried running the poller from the command line with "php -q /var/www/cacti/poller.php --force --debug" as the www-data user (to which I have chowned the whole of /var/www/cacti/), and I get:
12/06/2011 06:35:41 PM - POLLER: Poller[0] Maximum runtime of 298 seconds exceeded. Exiting.
12/06/2011 06:35:41 PM - SYSTEM STATS: Time:298.8314 Method:spine Processes:5 Threads:6 Hosts:9 HostsPerProcess:2 DataSources:36 RRDsProcessed:0
Loop Time is: 298.83
Sleep Time is: 1.16
Total Time is: 298.84
The Cacti log, at debug level, shows lots of:
12/07/2011 12:40:00 PM - POLLER: Poller[0] DEBUG: About to Spawn a Remote Process [CMD: /usr/sbin/spine, ARGS: 7 9]
12/07/2011 12:40:00 PM - POLLER: Poller[0] DEBUG: About to Spawn a Remote Process [CMD: /usr/sbin/spine, ARGS: 3 6]
12/07/2011 12:40:00 PM - POLLER: Poller[0] DEBUG: About to Spawn a Remote Process [CMD: /usr/sbin/spine, ARGS: 0 2]
12/07/2011 12:40:00 PM - POLLER: Poller[0] NOTE: Poller Int: '300', Cron Int: '300', Time Since Last: '299', Max Runtime '298', Poller Runs: '1'
12/07/2011 12:40:00 PM - SYSTEM THOLD STATS: Time:0.0135 Tholds:0 DownHosts:0
12/07/2011 12:40:00 PM - SYSTEM STATS: Time:298.6647 Method:spine Processes:5 Threads:6 Hosts:9 HostsPerProcess:2 DataSources:36 RRDsProcessed:0
12/07/2011 12:40:00 PM - POLLER: Poller[0] Maximum runtime of 298 seconds exceeded. Exiting.
12/07/2011 12:35:01 PM - POLLER: Poller[0] DEBUG: About to Spawn a Remote Process [CMD: /usr/sbin/spine, ARGS: 7 9]
12/07/2011 12:35:01 PM - POLLER: Poller[0] DEBUG: About to Spawn a Remote Process [CMD: /usr/sbin/spine, ARGS: 3 6]
12/07/2011 12:35:01 PM - POLLER: Poller[0] DEBUG: About to Spawn a Remote Process [CMD: /usr/sbin/spine, ARGS: 0 2]
It has to be a permissions thing, because if I run it as root it works. Yet I've done "chown -Rv www-data:www-data /var/www/cacti/*".
I'm using the www-data user because this is running on Ubuntu 10.04 under Apache2. The cron job is also set to run as this user. It was all working. I was doing things like adding new devices and some templates, and boom, it stopped working. Confused.
Thanks
Michelle
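The telling symptom in the SYSTEM STATS lines above is a runtime pinned at the maximum together with RRDsProcessed:0. A minimal Python sketch for pulling those fields out of such a line follows; the function names and the "stalled" rule of thumb are illustrative assumptions, not part of Cacti.

```python
import re


def parse_system_stats(line):
    """Return the key:value pairs that follow 'SYSTEM STATS:' as a dict."""
    _, sep, tail = line.partition("SYSTEM STATS:")
    if not sep:
        return {}
    stats = {}
    for key, value in re.findall(r"(\w+):(\S+)", tail):
        try:
            # Numbers become int or float; anything else (e.g. 'spine') stays text.
            stats[key] = float(value) if "." in value else int(value)
        except ValueError:
            stats[key] = value
    return stats


def looks_stalled(stats, max_runtime=298):
    # Hypothetical rule: the poller hit its runtime limit and wrote no RRD updates.
    return stats.get("RRDsProcessed") == 0 and stats.get("Time", 0) >= max_runtime


line = ("12/06/2011 06:35:41 PM - SYSTEM STATS: Time:298.8314 Method:spine "
        "Processes:5 Threads:6 Hosts:9 HostsPerProcess:2 DataSources:36 "
        "RRDsProcessed:0")
stats = parse_system_stats(line)
print(stats["Time"], stats["RRDsProcessed"], looks_stalled(stats))
```

The default max_runtime of 298 is taken from the "Maximum runtime of 298 seconds exceeded" message; adjust it if the poller interval differs.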
Re: Graphs have stopped generating
Also spotted via top that, while the cron job is running, there is a PHP process at about 30% CPU usage running as the www-data user.
- TheWitness
- Developer
- Posts: 17007
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
Re: Graphs have stopped generating
Please make sure that spine runs well from the command line first. Run it in 'read-only' mode to prevent any issues. Also, please run the latest SVN (branches/0.8.7), as the 'h' release had some issues.
TheWitness
True understanding begins only when we realize how little we truly understand...
Life is an adventure, let yours begin with Cacti!
Author of dozens of Cacti plugins and customizations. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages
For those wondering, I'm still here, but lost in the shadows. Yearning for fewer bugs. Who wants a Cacti 1.3/2.0? Streams anyone?
Re: Graphs have stopped generating
It's happened again. I spent ages scratching my head, gave up and posted, then quickly worked out the problem!
Permissions on /etc/cacti/ were wrong, so it could not read spine.conf. I worked this out after running the PHP poller in debug mode again, but channelling the output to a text file. I then spotted a few lines right at the top with the clue. They had been scrolling off the screen instantly when run from the console, as it flooded the screen with the likes of "Waiting on 3 of 3 pollers." for a while before showing what I pasted (sorry, I should have mentioned that as well).
I'll look at getting the latest SVN down - I've already run across the time display/span bug.
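A check like the one that found this can be scripted. The sketch below is a minimal Python example; the path /etc/cacti/spine.conf is pieced together from the post above, and find_unreadable is a made-up helper name. It reports every directory on the way to the file that the current user cannot enter, plus the file itself if it cannot be read.

```python
import os


def find_unreadable(path):
    """Return the path components the current user cannot traverse or read.

    Directories need execute (search) permission; the file itself needs read
    permission. An empty list means the file is reachable and readable.
    """
    path = os.path.abspath(path)
    # Collect every ancestor directory, from the file's parent up to the root.
    ancestors = []
    parent = os.path.dirname(path)
    while True:
        ancestors.append(parent)
        up = os.path.dirname(parent)
        if up == parent:
            break
        parent = up
    problems = [d for d in reversed(ancestors) if not os.access(d, os.X_OK)]
    if not os.access(path, os.R_OK):
        problems.append(path)
    return problems


if __name__ == "__main__":
    # Assumed location, based on the thread; adjust for your install.
    print(find_unreadable("/etc/cacti/spine.conf"))
```

Note that os.access tests the permissions of the user running the script, so it only says something useful when run as the poller's user (www-data here), not as root.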
Re: Graphs have stopped generating
Not all as good as I thought. Some graphs, but not all of them, are just blank with lots of NaNs, unless I expand the time span back to when it was working before all this. Yet things like Thold are still working on some of them (it correctly shows the current value in the 'Current' column on the thold page) and alerting me when values go below/above alert levels, so it is reading the values. Some don't have current values though. I've rebuilt the poller cache, no change.
Plenty are working fine!
I've checked the 'Data Source Debug' - no errors.
'Graph Debug Mode' - No errors: 'RRDTool Says: OK'
Looked at the Cacti log (still on debug!). No errors.
Looking at the rra directory, I can see the files' last-modified timestamps are current, in line with the working graphs.
Looking at the types that are not working: none of the ones using the 'Advanced Ping ALT' template/script are. Then there are a few others that are only used once, but with related ones working fine (like just some of the router statistics but not all of them, from the same template).
Re: Graphs have stopped generating
Looking through the logs, it's definitely pulling the data OK:
12/08/2011 02:45:46 PM - POLLER: Poller[0] Parsed MULTI output field 'min:26.1200' [map min->min]
12/08/2011 02:45:46 PM - POLLER: Poller[0] Parsed MULTI output field 'avg:27.8240' [map avg->avg]
12/08/2011 02:45:46 PM - POLLER: Poller[0] Parsed MULTI output field 'max:29.0300' [map max->max]
12/08/2011 02:45:46 PM - POLLER: Poller[0] Parsed MULTI output field 'dev:0.7031' [map dev->dev]
12/08/2011 02:45:46 PM - POLLER: Poller[0] Parsed MULTI output field 'loss:0.0000' [map loss->loss]
12/08/2011 02:45:46 PM - POLLER: Poller[0] CACTI2RRD: /usr/bin/rrdtool update /var/www/cacti/rra/www_website_co_uk_dev_43.rrd --template min:avg:max:dev:loss 1323355545:26.1
OK u:0.01 s:0.00 r:1.32
12/08/2011 02:45:46 PM - POLLER: Poller[0] CACTI2RRD: /usr/bin/rrdtool update /var/www/cacti/rra/adsl_router_down_snr_29.rrd --template down_snr 1323355545:90
OK u:0.01 s:0.00 r:1.46
So we've got the data, and the RRD appears to be updating, or at least being processed when it should, as the files' modified times update.
Re: Graphs have stopped generating
If I set the graph time span to, say, a week, the graphs do show the data from before the problem kicked off.
Why isn't it adding data now, even though it is being read OK?!
Re: Graphs have stopped generating
Thanks a lot in advance to anyone who comes to my rescue.
I've created new graphs with the Advanced Ping ALT template, both on devices that never had one and on one device where I deleted the graph that was not working and added a new one. Both are working fine.
There must be something corrupt in the current RRD files. What can I do? I really don't want to lose the already collected data, please.
Thank you!
Shell
Re: Graphs have stopped generating
Using rrdtool dump I've dumped one of the RRD files in question, and I see the following in it.
It has a value in last_ds but not in value. Why?
Code: Select all
<ds>
<name> avg </name>
<type> GAUGE </type>
<minimal_heartbeat> 120 </minimal_heartbeat>
<min> 0.0000000000e+00 </min>
<max> 5.0000000000e+02 </max>
<!-- PDP Status -->
<last_ds> 3.4857 </last_ds>
<value> NaN </value>
<unknown_sec> 2 </unknown_sec>
</ds>
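As far as I understand the rrdtool dump format, last_ds holds the most recent raw reading handed to the update, value is the running accumulator for the data point currently being built, and unknown_sec counts the seconds of the current step that are treated as unknown. A small Python sketch that reads exactly the fragment above and reports whether the accumulator is unknown (ds_status is a made-up helper name, not an rrdtool feature):

```python
import math
import xml.etree.ElementTree as ET

# The <ds> fragment exactly as posted above.
DS_XML = """
<ds>
<name> avg </name>
<type> GAUGE </type>
<minimal_heartbeat> 120 </minimal_heartbeat>
<min> 0.0000000000e+00 </min>
<max> 5.0000000000e+02 </max>
<!-- PDP Status -->
<last_ds> 3.4857 </last_ds>
<value> NaN </value>
<unknown_sec> 2 </unknown_sec>
</ds>
"""


def ds_status(ds_xml):
    """Summarise one <ds> element from 'rrdtool dump' output."""
    ds = ET.fromstring(ds_xml.strip())

    def field(tag):
        return ds.findtext(tag).strip()

    value = float(field("value"))  # float("NaN") parses to nan
    return {
        "name": field("name"),
        "heartbeat": int(field("minimal_heartbeat")),
        "last_ds": field("last_ds"),
        "value": value,
        "unknown_sec": int(field("unknown_sec")),
        "value_is_unknown": math.isnan(value),
    }


info = ds_status(DS_XML)
print(info["name"], info["heartbeat"], info["last_ds"], info["value_is_unknown"])
```

So a reading did arrive (last_ds is 3.4857), but the time since the last step boundary is being counted as unknown, which points at the update interval or heartbeat rather than at a failed poll.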
Re: Graphs have stopped generating
Hmm, I've not fixed the Advanced Ping graphs, but I have fixed the router graphs. I realised that I had changed the poller interval from 1 minute to 5 during the original problem, to see if giving it more time helped. I've put it back to 1 now, and the router SNR/attenuation etc. graphs are working.
I thought you just had to do 'Rebuild Poller Cache' from the Utilities menu after changing the poller time?
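A likely explanation, offered as an assumption rather than a confirmed diagnosis: the dump posted earlier in the thread shows minimal_heartbeat 120, which suits 1-minute polling. RRDtool treats an update as unknown when it arrives later than the heartbeat after the previous one, so with a 5-minute poller every update lands 300 seconds apart and is stored as NaN even though the value was read correctly. As far as I know, rebuilding the poller cache does not change the heartbeat stored inside existing RRD files. The comparison can be sketched in Python (updates_survive is an illustrative name):

```python
def updates_survive(heartbeat_s, poll_interval_s):
    """True if updates spaced poll_interval_s apart stay within the heartbeat."""
    return poll_interval_s <= heartbeat_s


# Heartbeat of 120 seconds, as in the dumped RRD file.
for interval in (60, 300):
    print(interval, updates_survive(120, interval))
# prints:
# 60 True
# 300 False
```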
Re: Graphs have stopped generating
I have just this moment upgraded to the latest on SVN btw as per TheWitness's advice. Realised I hadn't!
Re: Graphs have stopped generating
shelluk wrote:
Using rrdtool dump I've dumped one of the RRD files in question... and see the following in it:
It has the value in last_ds but not value, why?
Code: Select all
<ds>
<name> avg </name>
<type> GAUGE </type>
<minimal_heartbeat> 120 </minimal_heartbeat>
<min> 0.0000000000e+00 </min>
<max> 5.0000000000e+02 </max>
<!-- PDP Status -->
<last_ds> 3.4857 </last_ds>
<value> NaN </value>
<unknown_sec> 2 </unknown_sec>
</ds>
Perhaps that value couldn't be retrieved within the heartbeat time since the last poll.
- TheWitness
Re: Graphs have stopped generating
Generally, when you have lossy devices, assuming that's what it is, increasing the heartbeat is a way to insulate against gaps in your graphs. You can do that using the rrdtool tune command.
TheWitness
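For illustration, here is a Python sketch that only builds (and prints) an rrdtool tune command line and does not run it. The file path and data source names are taken from the CACTI2RRD log line earlier in the thread; the new heartbeat of 600 seconds and the helper name build_tune_command are assumptions made for the example. Copy the .rrd file somewhere safe before tuning it.

```python
import shlex


def build_tune_command(rrd_path, ds_names, heartbeat_s, rrdtool="/usr/bin/rrdtool"):
    """Build an 'rrdtool tune' argument list that sets one heartbeat per data source."""
    cmd = [rrdtool, "tune", rrd_path]
    for name in ds_names:
        # rrdtool tune takes --heartbeat ds-name:seconds, repeated per data source.
        cmd += ["--heartbeat", f"{name}:{heartbeat_s}"]
    return cmd


cmd = build_tune_command(
    "/var/www/cacti/rra/www_website_co_uk_dev_43.rrd",
    ["min", "avg", "max", "dev", "loss"],
    600,  # assumed value: twice a 300-second polling interval
)
print(shlex.join(cmd))
```

Once the printed command has been checked by eye, it can be run as the user that owns the rra directory.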