Cacti stops polling/generating graphs for new hosts

peppermint
Cacti User
Posts: 58
Joined: Fri May 11, 2007 1:11 pm
Location: NY

Post by peppermint »

I am using cacti-0.8.7b-2 and spine-0.8.7a-1. For whatever reason, it has stopped polling/generating graphs for newly added hosts. All I see in the debug log is:

04/06/2008 12:27:21 AM - SPINE: Poller[0] Host[70] RECACHE: Processing 2 items in the auto reindex cache for 'xxxxxx'
04/06/2008 12:28:20 AM - SPINE: Poller[0] Host[70] RECACHE: Processing 2 items in the auto reindex cache for 'xxxxxx'

The hosts' status in Cacti is "alive".

Does anyone know what might be happening?

Thanks!

Pepper
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

Either this is not full debug mode (--verbosity=5), or the log is truncated, or something _really_ evil is going on.
Reinhard
SexyBoy
Posts: 7
Joined: Fri Mar 28, 2008 11:31 am

Hi

Post by SexyBoy »

I have the same problem. Cacti doesn't work for new hosts. Actually, it seems SNMP is not in use: it behaves as if the target host has no SNMP server enabled, but it does!

THX

Post by peppermint »

Looking further, I noticed this happens after a Data Template/Graph Template is added (IPMI, downloaded from http://forums.cacti.net/about11593.html). As soon as I create the IPMI graph, it stops polling all data sources for this host. All you see in cacti.log for that host (in DEBUG mode) is the one line from my previous post.

I deleted the IPMI templates and recreated them manually with the bare minimum, but Cacti still doesn't like it, although all the results can be retrieved properly from the command line. The funny thing is that when I apply the same change on a test machine (same version), it works.

I have a feeling that, after the IPMI graphs are added, Cacti somehow disables the host and stops polling it. I have to remove the host and re-add it in order to get the other graphs generated.

Hmmm... how can I troubleshoot further? :roll:

Pepper

Post by gandalf »

Thanks for the pointer, will have to investigate
Reinhard

Post by peppermint »

I think I pinpointed what went wrong; it looks like a poller issue. My problem machine runs spine, but my test machine runs cmd.php. I found this after importing the cacti directory and the database from the problem machine to the test one. In fact, once I switched the poller from spine to cmd.php on the problem machine, everything returned to normal, and it is able to generate the graphs now.

Soon after I switched the poller, the following was logged in the debug log:

04/11/2008 04:07:22 PM - CMDPHP: Poller[0] ASSERT: 'U<' failed. Recaching host 'dc003.xxx.com', data query #15
04/11/2008 04:06:20 PM - SPINE: Poller[0] Host[76] RECACHE: Processing 1 items in the auto reindex cache for 'dc003.xxx.com'

One possibility, I think, is that spine fails to get a result over SNMPv3 (which is used for the target host; the community name is more or less a dummy, as it is used in my IPMI graphs), then disables the host and stops polling it. This may apply to SexyBoy's case as well.

Thanks!

PS: I will try the nightly spine to see if it fixes this.

Pepper

Post by peppermint »

Unfortunately, the nightly version doesn't seem to fix the problem; I still see the same thing happening. I am switching to cmd.php for the time being.

Pepper

Post by gandalf »

Both of you, please provide me with complete output of

spine --verbosity=5 <host> <host>
where <host> is the id of the failing one.
Reinhard
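
For anyone following along: the full debug output is long, so it can help to cut it down to the lines for the failing host before reading or posting it. Here is a minimal Python sketch, assuming the log line format seen in this thread (`Host[<id>]` tags, and `host_id=<id>` inside the logged SQL). The function name `lines_for_host` is made up for illustration and is not part of Cacti or spine.

```python
import re


def lines_for_host(log_lines, host_id):
    """Return the log lines that mention the given Cacti host id.

    Assumption: lines carry either a "Host[<id>]" tag or a
    "host_id=<id>" / "host_id='<id>'" fragment in logged SQL.
    """
    hid = re.escape(str(host_id))
    # (?!\d) keeps host 7 from matching "host_id=76".
    pattern = re.compile(rf"Host\[{hid}\]|host_id='?{hid}(?!\d)")
    return [line for line in log_lines if pattern.search(line)]


# Sample lines copied from the kind of output spine prints in debug mode.
sample = [
    "04/12/2008 10:09:30 AM - SPINE: Poller[0] Host[74] No Host Availability Method Selected",
    "04/12/2008 10:09:30 AM - SPINE: Poller[0] Host[76] PING: Result UDP: Host is Alive",
    "04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'UPDATE poller_item SET rrd_next_step=rrd_next_step-60 WHERE host_id=76'",
    "04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: The Value of Active Threads is 0",
]

host_76 = lines_for_host(sample, 76)
for line in host_76:
    print(line)
```

Untagged lines (thread counters, settings lookups) are dropped on purpose, so keep the unfiltered output around in case a developer asks for it.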

Post by peppermint »

Here is the output of command:

#spine --verbosity 74 76
SPINE: Using spine config file [/etc/spine.conf]
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT value FROM settings WHERE name = 'path_webroot''
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT value FROM settings WHERE name = 'path_cactilog''
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The path_php_server variable is /var/www/cacti/script_server.php
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The path_cactilog variable is /var/www/cacti/log/cacti.log
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT value FROM settings WHERE name = 'log_destination''
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The log_destination variable is 1 (FILE)
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT value FROM settings WHERE name = 'path_php_binary''
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The path_php variable is /usr/bin/php
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT value FROM settings WHERE name = 'availability_method''
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The availability_method variable is 2
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT value FROM settings WHERE name = 'ping_recovery_count''
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The ping_recovery_count variable is 3
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT value FROM settings WHERE name = 'ping_failure_count''
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The ping_failure_count variable is 2
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT value FROM settings WHERE name = 'ping_method''
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The ping_method variable is 2
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT value FROM settings WHERE name = 'ping_retries''
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The ping_retries variable is 1
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT value FROM settings WHERE name = 'ping_timeout''
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The ping_timeout variable is 400
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT value FROM settings WHERE name = 'log_perror''
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The log_perror variable is 1
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT value FROM settings WHERE name = 'log_pwarn''
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The log_pwarn variable is 1
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT value FROM settings WHERE name = 'log_pstats''
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The log_pstats variable is 1
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT value FROM settings WHERE name = 'max_threads''
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The threads variable is 1
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT value FROM settings WHERE name = 'poller_interval''
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The polling interval is 60 seconds
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT value FROM settings WHERE name = 'concurrent_processes''
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The number of concurrent processes is 1
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT value FROM settings WHERE name = 'script_timeout''
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The script timeout is 25
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT value FROM settings WHERE name = 'php_servers''
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The number of php script servers to run is 1
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT action FROM poller_item WHERE action=2 AND host_id BETWEEN 74 AND 76 LIMIT 1'
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: StartHost='74', EndHost='76', TotalPHPScripts='0'
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The PHP Script Server is Not Required
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT value FROM settings WHERE name = 'max_get_size''
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: The Maximum SNMP OID Get Size is 10
04/12/2008 10:09:29 AM - SPINE: Poller[0] Version 0.8.7 starting
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: MySQL is Thread Safe!
04/12/2008 10:09:29 AM - SPINE: Poller[0] SPINE: Initializing Net-SNMP API
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SNMP Header Version is 5.3.1
04/12/2008 10:09:29 AM - SPINE: Poller[0] DEBUG: SNMP Library Version is 5.3.1
04/12/2008 10:09:30 AM - SPINE: Poller[0] SPINE: Initializing PHP Script Server(s)
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT id FROM host WHERE disabled='' AND id BETWEEN 74 AND 76 ORDER BY id'
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: Initial Value of Active Threads is 0
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: Valid Thread to be Created
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: The Value of Active Threads is 1
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: In Poller, About to Start Polling of Host
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT snmp_port, count(snmp_port) FROM poller_item WHERE host_id=0 AND rrd_next_step < 0 GROUP BY snmp_port'
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT action, hostname, snmp_community, snmp_version, snmp_username, snmp_password, rrd_name, rrd_path, arg1, arg2, arg3, local_data_id, rrd_num, snmp_port, snmp_timeout, snmp_auth_protocol, snmp_priv_passphrase, snmp_priv_protocol, snmp_context FROM poller_item WHERE host_id=0 and rrd_next_step <=0 ORDER by snmp_port'
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'UPDATE poller_item SET rrd_next_step=rrd_next_step-60 WHERE host_id=0'
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'UPDATE poller_item SET rrd_next_step=rrd_step-60 WHERE rrd_next_step < 0 and host_id=0'
04/12/2008 10:09:30 AM - SPINE: Poller[0] Host[0] DEBUG: HOST COMPLETE: About to Exit Host Polling Thread Function
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: The Value of Active Threads is 0
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: Valid Thread to be Created
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: The Value of Active Threads is 1
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: In Poller, About to Start Polling of Host
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT id, hostname, snmp_community, snmp_version, snmp_username, snmp_password, snmp_auth_protocol, snmp_priv_passphrase, snmp_priv_protocol, snmp_context, snmp_port, snmp_timeout, max_oids, availability_method, ping_method, ping_port, ping_timeout, ping_retries, status, status_event_count, status_fail_date, status_rec_date, status_last_error, min_time, max_time, cur_time, avg_time, total_polls, failed_polls, availability FROM host WHERE id=74'
04/12/2008 10:09:30 AM - SPINE: Poller[0] Host[74] No Host Availability Method Selected
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'UPDATE host SET status='3', status_event_count='0', status_fail_date='0000-00-00 00:00:00', status_rec_date='0000-00-00 00:00:00', status_last_error='', min_time='0.000000', max_time='0.000000', cur_time='0.000000', avg_time='0.000000', total_polls='4077', failed_polls='0', availability='100.0000' WHERE id='74''
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT data_query_id, action, op, assert_value, arg1 FROM poller_reindex WHERE host_id=74'
04/12/2008 10:09:30 AM - SPINE: Poller[0] Host[74] RECACHE: Processing 1 items in the auto reindex cache for 'dc002.xxx.com'
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'UPDATE poller_reindex SET assert_value='U' WHERE host_id='74' AND data_query_id='17' and arg1='.1.3.6.1.2.1.1.3.0''
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT snmp_port, count(snmp_port) FROM poller_item WHERE host_id=74 AND rrd_next_step < 0 GROUP BY snmp_port'
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT action, hostname, snmp_community, snmp_version, snmp_username, snmp_password, rrd_name, rrd_path, arg1, arg2, arg3, local_data_id, rrd_num, snmp_port, snmp_timeout, snmp_auth_protocol, snmp_priv_passphrase, snmp_priv_protocol, snmp_context FROM poller_item WHERE host_id=74 and rrd_next_step <=0 ORDER by snmp_port'
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'UPDATE poller_item SET rrd_next_step=rrd_next_step-60 WHERE host_id=74'
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'UPDATE poller_item SET rrd_next_step=rrd_step-60 WHERE rrd_next_step < 0 and host_id=74'
04/12/2008 10:09:30 AM - SPINE: Poller[0] Host[74] DEBUG: HOST COMPLETE: About to Exit Host Polling Thread Function
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: The Value of Active Threads is 0
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: Valid Thread to be Created
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: The Value of Active Threads is 1
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: In Poller, About to Start Polling of Host
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT id, hostname, snmp_community, snmp_version, snmp_username, snmp_password, snmp_auth_protocol, snmp_priv_passphrase, snmp_priv_protocol, snmp_context, snmp_port, snmp_timeout, max_oids, availability_method, ping_method, ping_port, ping_timeout, ping_retries, status, status_event_count, status_fail_date, status_rec_date, status_last_error, min_time, max_time, cur_time, avg_time, total_polls, failed_polls, availability FROM host WHERE id=76'
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: UDP Ping return_code was -1, errno was 111, total_time was 470.8767
04/12/2008 10:09:30 AM - SPINE: Poller[0] Host[76] PING: Result UDP: Host is Alive
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'UPDATE host SET status='3', status_event_count='0', status_fail_date='0000-00-00 00:00:00', status_rec_date='0000-00-00 00:00:00', status_last_error='', min_time='0.393150', max_time='40.802000', cur_time='0.470880', avg_time='0.599154', total_polls='1957', failed_polls='0', availability='100.0000' WHERE id='76''
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT data_query_id, action, op, assert_value, arg1 FROM poller_reindex WHERE host_id=76'
04/12/2008 10:09:30 AM - SPINE: Poller[0] Host[76] RECACHE: Processing 1 items in the auto reindex cache for 'dc003.xxx.com'
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'UPDATE poller_reindex SET assert_value='U' WHERE host_id='76' AND data_query_id='15' and arg1='.1.3.6.1.2.1.1.3.0''
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT snmp_port, count(snmp_port) FROM poller_item WHERE host_id=76 AND rrd_next_step < 0 GROUP BY snmp_port'
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'SELECT action, hostname, snmp_community, snmp_version, snmp_username, snmp_password, rrd_name, rrd_path, arg1, arg2, arg3, local_data_id, rrd_num, snmp_port, snmp_timeout, snmp_auth_protocol, snmp_priv_passphrase, snmp_priv_protocol, snmp_context FROM poller_item WHERE host_id=76 and rrd_next_step <=0 ORDER by snmp_port'
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'UPDATE poller_item SET rrd_next_step=rrd_next_step-60 WHERE host_id=76'
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'UPDATE poller_item SET rrd_next_step=rrd_step-60 WHERE rrd_next_step < 0 and host_id=76'
04/12/2008 10:09:30 AM - SPINE: Poller[0] Host[76] DEBUG: HOST COMPLETE: About to Exit Host Polling Thread Function
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: The Value of Active Threads is 0
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'replace into settings (name,value) values ('date',NOW())'
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: SQL:'insert into poller_time (poller_id, start_time, end_time) values (0, NOW(), NOW())'
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: Thread Cleanup Complete
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: PHP Script Server Pipes Closed
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: Allocated Variable Memory Freed
04/12/2008 10:09:30 AM - SPINE: Poller[0] DEBUG: MYSQL Free & Close Completed
04/12/2008 10:09:30 AM - SPINE: Poller[0] Time: 0.1880 s, Threads: 1, Hosts: 3
Thanks!

Pepper

Post by gandalf »

Please perform this query manually

SELECT action, hostname, snmp_community, snmp_version, snmp_username, snmp_password, rrd_name, rrd_path, arg1, arg2, arg3, local_data_id, rrd_num, snmp_port, snmp_timeout, snmp_auth_protocol, snmp_priv_passphrase, snmp_priv_protocol, snmp_context FROM poller_item WHERE host_id=74 and rrd_next_step <=0 ORDER by snmp_port
Mask all security-related data and post the results.
Reinhard
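
A small aid for the masking step: a Python sketch that blanks the credential columns named in the query above before a row is pasted into a post. It assumes the row is already available as a dict keyed by column name; `SENSITIVE_COLUMNS`, `redact_row` and the sample values are illustrative only. Note that it does not look inside `arg1`, which can also carry credentials when a script takes them as arguments, so that column still needs a manual check.

```python
# Columns from the poller_item query that hold SNMP credentials.
SENSITIVE_COLUMNS = (
    "snmp_community",
    "snmp_username",
    "snmp_password",
    "snmp_priv_passphrase",
)


def redact_row(row, mask="<redacted>"):
    """Return a copy of the row with credential columns masked.

    NULL (None) values are left alone so it stays visible that the
    column was empty rather than hidden.
    """
    return {
        column: (mask if column in SENSITIVE_COLUMNS and value is not None else value)
        for column, value in row.items()
    }


# Hypothetical row; the values are placeholders, not real credentials.
row = {
    "action": 1,
    "hostname": "dc002.xxx.com",
    "snmp_community": "example-community",
    "snmp_version": 3,
    "snmp_username": "example-user",
    "snmp_password": "example-password",
    "snmp_priv_passphrase": "example-passphrase",
    "snmp_context": None,
}

clean = redact_row(row)
for column, value in clean.items():
    print(f"{column}: {value}")
```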

Post by peppermint »

Here is the result:

|      1 | dc002.xxx.coml | public         |            3 | cacti         | Nandemoii     | sensorReading | /var/www/cacti/rra/dc002_sensorreading_1669.rrd | /usr/bin/php -q /var/www/cacti/scripts/ipmi_sensors.php fan dc002xxx.com, <user>, <password>, get sensorReading 1  | NULL | NULL |          1669 |       1 |       161 |          500 | MD5                | <passphrase>            | DES                | NULL         |

|      1 | dc002.xxx.com | public         |            3 | cacti         | Nandemoii     | sensorReading | /var/www/cacti/rra/dc002_sensorreading_1670.rrd | /usr/bin/php -q /var/www/cacti/scripts/ipmi_sensors.php fan dc002xxx.com, <user>, <password>, get sensorReading 2  | NULL | NULL |          1670 |       1 |       161 |          500 | MD5                | <passphrase>            | DES                | NULL         |
NOTE: I modified the snmp_username/snmp_password/snmp_priv_passphrase fields.

Thanks very much!

Pepper

Post by gandalf »

Are the commas part of the Data Input Method command?
Reinhard

Post by peppermint »

Yes it is.

Post by peppermint »

Not directly related to this topic, but something about spine in a 64-bit environment.

I tried switching back to spine because of the large number of data sources to poll, and it now gets a buffer overflow after querying poller_reindex. If I use the latest spine compiled from the nightly source, I get:

04/14/2008 01:49:47 PM - SPINE: Poller[0] Host[2] DS[279] SNMP: v2: sw2xxx.com, dsname: traffic_in, oid: .1.3.6.1.2.1.31.1.1.1.6.432, value: U

for all SNMP devices, even though snmpget retrieves the values properly. cmd.php runs fine.

Pepper

Post by gandalf »

peppermint wrote: Yes it is.
Please delete those commas, then. I wonder what the CLI does with commas, but I suppose it won't work as expected.
Reinhard
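
On the question of what the command line does with the commas: a rough way to see it is to split the command string the way a POSIX shell would. The Python sketch below uses `shlex.split` as an approximation of that word splitting. Whether spine and cmd.php split the string exactly like this is an assumption, and `user` / `password` are hypothetical stand-ins for the redacted values in the posted row.

```python
import shlex

# Command line modeled on the arg1 value posted earlier in the thread;
# "user" and "password" are placeholders for the redacted credentials.
command = (
    "/usr/bin/php -q /var/www/cacti/scripts/ipmi_sensors.php "
    "fan dc002xxx.com, user, password, get sensorReading 1"
)

argv = shlex.split(command)

# Everything after the script path is what the script receives.
script_args = argv[3:]
print(script_args)

# Commas are ordinary characters for the shell, so they stay glued to
# the preceding word. Stripping them gives the arguments that were
# presumably intended.
cleaned_args = [arg.rstrip(",") for arg in script_args]
print(cleaned_args)
```

With this splitting, the script would see `dc002xxx.com,` (trailing comma included) as its hostname argument, which fits the advice to remove the commas from the Data Input Method. How ipmi_sensors.php itself treats such arguments is not shown in this thread.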