Problems with Poller on 0.

alinux
Cacti User
Posts: 56
Joined: Mon Jul 16, 2007 10:02 am

Post by alinux »

Hi
I have version 0.8.6h running and wanted to test 0.8.7b, so I installed it on the same server alongside my existing Cacti installation. The existing installation runs fine. However, the new installation is not polling or graphing. In the logs I found the output below.

Please help.

Waiting on 1/1 pollers.
(the line above appears 27 times in total)

09/26/2008 08:24:34 AM - POLLER: Poller[0] Maximum runtime of 298 seconds exceeded. Exiting.
09/26/2008 08:24:34 AM - SYSTEM STATS: Time:299.3425 Method:cmd.php Processes:1 Threads:N/A Hosts:4 HostsPerProcess:4 DataSources:23 RRDsProcessed:0

Warning: pclose(): 70 is not a valid stream resource in /usr/local/apache2/htdocs/cacti_new/lib/rrd.php on line 57
Loop Time is: 299.344939947
Sleep Time is: 0.650599956512
Total Time is: 299.349400043
You have new mail in /var/spool/mail/root
rtorti19
Posts: 48
Joined: Wed May 07, 2008 1:20 pm

Post by rtorti19 »

Have you completely separated your 2 installations? i.e. separate users, db's, paths, cronjobs, etc.?
alinux
Cacti User
Posts: 56
Joined: Mon Jul 16, 2007 10:02 am

Post by alinux »

Hi
Yes, I have totally separated the two installations. Any hints, please?
fireman949
Posts: 11
Joined: Fri Apr 14, 2006 10:11 am
Location: MS

Post by fireman949 »

I have the exact same issue, only mine seems to have broken without warning.

In my poller log:

Waiting on 1/1 pollers.
Waiting on 1/1 pollers.
Waiting on 1/1 pollers.
Waiting on 1/1 pollers.

In my debug log:

11/01/2008 03:18:33 PM - CMDPHP: Poller[0] DEBUG: SQL Assoc: "select poller_id,end_time from poller_time where poller_id=0"
11/01/2008 03:18:33 PM - CMDPHP: Poller[0] DEBUG: SQL Assoc: "select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"

Using spine or cmd.php makes no difference (other than that spine beats the crap out of the box).

It eventually hits the max runtime and starts over again. I'm running 0.8.7b on a 64-bit CentOS 5 install. I'd be glad to try anything or offer up more information.
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

The SQL is quite normal (busy waiting). Your assumption may be correct. Run in DEBUG mode and/or from the command line for more detail.
Reinhard
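
For readers wondering what "busy waiting" means here, the following is a minimal Python sketch of the pattern, not Cacti's actual poller.php code. The callbacks `count_finished` and `process_output` are hypothetical stand-ins for the two queries in the debug log (the one against `poller_time` and the one against `poller_output`), and the one-second interval is an assumption. The loop keeps asking the same two questions until every child poller has reported in or the maximum runtime is reached, which is why the same SQL shows up over and over.

```python
import time


def wait_for_pollers(count_finished, process_output, total, max_runtime,
                     sleep=time.sleep, clock=time.monotonic, interval=1.0):
    """Busy-wait until all child pollers report in or max_runtime elapses.

    count_finished and process_output are hypothetical callbacks, not Cacti
    functions. Returns True if every poller finished, False on timeout.
    """
    start = clock()
    while True:
        finished = count_finished()   # stands in for the poller_time query
        process_output()              # stands in for the poller_output query
        if finished >= total:
            return True
        if clock() - start >= max_runtime:
            print(f"Maximum runtime of {max_runtime} seconds exceeded. Exiting.")
            return False
        print(f"Waiting on {total - finished}/{total} pollers.")
        sleep(interval)


# Demo: one child poller that reports completion on the third check.
reports = iter([0, 0, 1])
done = wait_for_pollers(lambda: next(reports), lambda: None,
                        total=1, max_runtime=58, interval=0)
print("all pollers finished:", done)
```

If a child poller hangs and never reports in, the loop can only end through the timeout branch, which matches the "Maximum runtime ... exceeded" line in the logs above.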
fireman949
Posts: 11
Joined: Fri Apr 14, 2006 10:11 am
Location: MS

Post by fireman949 »

Is it possible something is wrong with the MySQL server config? I had some indications that it had too many connections about a week ago.

I did run in debug from the command line and saw the same behavior.

Cacti is configured to poll every minute via cron.

Using cmd.php with 5 concurrent processes, it appears to poll normally for about 30 seconds. For the last 30 seconds, the exact same thing is logged every second:

poller.log (added to the cron):

Waiting on 1/5 pollers.
Waiting on 1/5 pollers.
Waiting on 1/5 pol  (repeats until )
11/03/2008 06:40:00 PM - POLLER: Poller[0] Maximum runtime of 58 seconds exceeded. Exiting.
11/03/2008 06:40:00 PM - SYSTEM STATS: Time:58.7843 Method:cmd.php Processes:5 Threads:N/A Hosts:47 HostsPerProcess:10 DataSources:1919 RRDsProcessed:1093
PHP Warning:  pclose(): 78 is not a valid stream resource in /var/www/cacti/lib/rrd.php on line 57
at which point it starts over.
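
The SYSTEM STATS line already shows how much of the run completed: compare `RRDsProcessed` with `DataSources`. As a rough illustration, the Python sketch below parses such a line and computes that ratio. The function names are made up for this example, and the parsing assumes the `Key:Value` layout seen in the log lines in this thread.

```python
import re


def parse_system_stats(line):
    """Return the Key:Value pairs after 'SYSTEM STATS:' as a dict of strings."""
    _, _, payload = line.partition("SYSTEM STATS:")
    return dict(re.findall(r"(\w+):(\S+)", payload))


def processed_fraction(stats):
    """Fraction of data sources whose RRD files were updated in this run."""
    return int(stats["RRDsProcessed"]) / int(stats["DataSources"])


line = ("11/03/2008 06:40:00 PM - SYSTEM STATS: Time:58.7843 Method:cmd.php "
        "Processes:5 Threads:N/A Hosts:47 HostsPerProcess:10 "
        "DataSources:1919 RRDsProcessed:1093")
stats = parse_system_stats(line)
print(f"{processed_fraction(stats):.0%} of data sources were written")
```

A ratio well below 100% on a run that also hit the maximum runtime is consistent with the gappy graphs described in this post.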


Cacti log exhibits what I would call normal behavior until the same spot (about 30 seconds in):

11/03/2008 06:41:30 PM - CMDPHP: Poller[0] DEBUG: SQL Assoc: "select poller_id,end_time from poller_time where poller_id=0"
11/03/2008 06:41:30 PM - CMDPHP: Poller[0] DEBUG: SQL Assoc: "select  poller_output.output,  poller_output.time,  poller_output.local_data_id,  poller_item.rrd_path,  poller_item.rrd_name,  poller_item.rrd_num  from (poller_output,poller_item)  where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name)  LIMIT 10000"
This repeats for 30 seconds until it times out and starts the polling process again.

I do have some defunct processes from 10 minutes ago:

cacti    24147 24145  0 18:35 ?        00:00:02 [php] <defunct>
cacti    24184     1  2 18:36 ?        00:00:10 /usr/bin/php -q /var/www/cacti/cmd.php 0 58
cacti    24186 24184  0 18:36 ?        00:00:02 [php] <defunct>
And I do have some script_server.php cmd processes from a few days ago. I stopped the poller and killed off all cacti processes but the same behavior is happening. Some data is captured but there are 'gappy' graphs.
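
A defunct (zombie) process has already exited and is only waiting for its parent to collect its exit status, so the parent PID is the one worth looking at. The Python sketch below pulls the PID, parent PID and start time out of `ps -ef` style text. It assumes the standard `ps -ef` column order (UID, PID, PPID, C, STIME, TTY, TIME, CMD), and `find_defunct` is a name made up for this example.

```python
def find_defunct(ps_output):
    """Return (pid, ppid, start) for each <defunct> entry in `ps -ef` style text."""
    zombies = []
    for line in ps_output.splitlines():
        if "<defunct>" not in line:
            continue
        fields = line.split()
        # Assumed ps -ef columns: UID PID PPID C STIME TTY TIME CMD
        zombies.append((int(fields[1]), int(fields[2]), fields[4]))
    return zombies


sample = """\
cacti    24147 24145  0 18:35 ?        00:00:02 [php] <defunct>
cacti    24184     1  2 18:36 ?        00:00:10 /usr/bin/php -q /var/www/cacti/cmd.php 0 58
cacti    24186 24184  0 18:36 ?        00:00:02 [php] <defunct>
"""
for pid, ppid, start in find_defunct(sample):
    print(f"defunct pid={pid} parent={ppid} started={start}")
```

In the listing above, the parent of defunct process 24186 is 24184, the cmd.php process that is still running.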

I'm not sure which direction to go, because a fresh Cacti install that I set up for testing, polling some of the same switches, looks fine.
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

Sometimes the default number of connections is too low. Increase it and restart MySQL.
Reinhard
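
One way to judge whether the limit is actually the bottleneck is to compare MySQL's `max_connections` system variable with its `Max_used_connections` status variable. The Python sketch below only does the arithmetic; the function name is made up for this example, and the numbers in the demo are hypothetical readings, not values from this thread.

```python
def connection_headroom(max_connections, max_used_connections):
    """Fraction of the connection limit that has not been used since server start."""
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    return 1 - max_used_connections / max_connections


# Hypothetical readings, as if copied from:
#   SHOW VARIABLES LIKE 'max_connections';
#   SHOW GLOBAL STATUS LIKE 'Max_used_connections';
headroom = connection_headroom(max_connections=500, max_used_connections=120)
print(f"unused share of the connection limit: {headroom:.0%}")
```

If the headroom stays large, raising the limit further is unlikely to change the poller's behavior.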
fireman949
Posts: 11
Joined: Fri Apr 14, 2006 10:11 am
Location: MS

Post by fireman949 »

gandalf wrote: Sometimes the default number of connections is too low. Increase it and restart MySQL.
Reinhard

Connections set to 500. Still a problem.

I turned on mysql query logging and here is what I see repeated over and over again for 30 seconds (when I see Waiting on 1/1 pollers.). This is 30 seconds into what appears to be a normal run of the poller and continues for 30 seconds until the run time is exceeded.

081111 20:46:55	      2 Query       select poller_id,end_time from poller_time where poller_id=0
		      2 Query       select  poller_output.output,  poller_output.time,  poller_output.local_data_id,  poller_item.rrd_path,  poller_item.rrd_name,  poller_item.rrd_num  from (poller_output,poller_item)  where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name)  LIMIT 10000

This corresponds to the Waiting on 1/1 pollers in the poller.log.

I have quite a few processes (181+) hanging out there:

/usr/bin/php -q /var/www/cacti/cmd.php 21 31
/usr/bin/php -q /var/www/cacti/script_server.php cmd
[php] <defunct>

These were from Nov 3 and Nov 4, so I killed them all. I am back to using spine from 0.8.7c beta 2, as that provides the most consistent results, even though the graphs are still very choppy.

Why would it repeat the exact same mysql query over and over for 30 seconds until it times out?

I'll add some more info, hopefully not too much.

Info from one run:
cacti.log:

11/11/2008 08:58:14 PM - SPINE: Poller[0] Host[24] DEBUG: HOST COMPLETE: About to Exit Host Polling Thread Function
11/11/2008 08:58:14 PM - SPINE: Poller[0] DEBUG: The Value of Active Threads is 1
11/11/2008 08:59:01 PM - CMDPHP: Poller[0] DEBUG: SQL Cell: "select count(*) from poller_item where rrd_next_step<=0"
11/11/2008 08:59:01 PM - POLLER: Poller[0] NOTE: Poller Int: '60', Cron Int: '60', Time Since Last: '60', Max Runtime '58', Poller Runs: '1'
11/11/2008 08:59:01 PM - CMDPHP: Poller[0] DEBUG: SQL Exec: "replace into settings (name,value) values ('poller_lastrun',1226458741)"
11/11/2008 08:59:01 PM - CMDPHP: Poller[0] DEBUG: SQL Assoc: "select id from host where disabled = '' order by id"
11/11/2008 08:59:01 PM - CMDPHP: Poller[0] DEBUG: SQL Exec: "replace into settings (name,value) values ('path_webroot','/var/www/cacti')"
11/11/2008 08:59:01 PM - CMDPHP: Poller[0] DEBUG: SQL Exec: "TRUNCATE TABLE poller_time"
11/11/2008 08:59:01 PM - CMDPHP: Poller[0] DEBUG: SQL Assoc: "SELECT local_data_id, rrd_name FROM poller_output"
11/11/2008 08:59:01 PM - SPINE: Poller[0] ERROR: Spine Timed Out While Processing Hosts Internal
poller.log:

11/11/2008 08:59:00 PM - POLLER: Poller[0] Maximum runtime of 58 seconds exceeded. Exiting.
11/11/2008 08:59:00 PM - SYSTEM STATS: Time:58.4323 Method:spine Processes:1 Threads:30 Hosts:47 HostsPerProcess:47 DataSources:1921 RRDsProcessed:1007
PHP Warning:  pclose(): 63 is not a valid stream resource in /var/www/cacti/lib/rrd.php on line 57
mysql:

mysql> select poller_id,end_time from poller_time where poller_id=0 ;
+-----------+---------------------+
| poller_id | end_time            |
+-----------+---------------------+
|         0 | 2008-11-11 20:58:01 | 
+-----------+---------------------+
Shouldn't the end_time be later than the actual time? It seems like the end time is when the poller processes start, but what do I know.

Sorry for spewing so much info, but I think this is all related and could be useful in pinpointing my problem.
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

If you run the following:

./spine -V 3 -R
How long does it run?
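
To get a repeatable number rather than watching the clock, the run can be wrapped in a small timer. This is only a sketch: `time_command` is a name made up for this example, and the demo times a trivial Python child process so that it runs anywhere. To time spine, pass `["./spine", "-V", "3", "-R"]` from the directory that contains the binary.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    result = subprocess.run(argv, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode, time.monotonic() - start


# Demo with a harmless command; swap in ["./spine", "-V", "3", "-R"] for real use.
code, elapsed = time_command([sys.executable, "-c", "pass"])
print(f"exit={code} elapsed={elapsed:.2f}s")
```

The output is discarded here to keep the timing clean; drop the two `DEVNULL` arguments to see spine's verbose output while it runs.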

TheWitness
fireman949
Posts: 11
Joined: Fri Apr 14, 2006 10:11 am
Location: MS

Post by fireman949 »

9 seconds.

This was probably not the best idea on my part, but I had quite a few Linux hosts graphed for performance on the same Cacti server we use for critical switch data. After disabling all the Linux hosts, the spine poller finished in 9 seconds. I could probably tune it and get better performance.

I'm going to migrate all my Linux hosts over to a non-critical Cacti install and find out which data source is causing the problems.

Everything looks good now, and the spine 0.8.7c beta 2 that I built works nicely.