How can I resolve the message "Waiting on 1 of 1 pollers."?

oliversalzburg
Posts: 8
Joined: Fri Jun 29, 2012 7:25 am

Post by oliversalzburg »

I installed Cacti on a fresh Debian server yesterday (from the official APT repository). I configured a few more data sources for the local system and created graphs for them. Sadly, no values are being graphed.

So I started debugging and quickly spotted this in my cacti.log:

POLLER: Poller[0] Maximum runtime of 298 seconds exceeded. Exiting.

When I run poller.php manually, it looks fine at first from what I can tell, but at some point it just repeats this message over and over:

Waiting on 1 of 1 pollers.

So, obviously some poller never finishes, but which one? I could not find out directly, so I went into poller.php to add debug output.
At line 368, I found where the poller checks whether other pollers are still running, so I added my debug output there:

                $rrds_processed = 0;
                while (1) {
                        $finished_processes = db_fetch_cell("SELECT count(*) FROM poller_time WHERE poller_id=0 AND end_time>'0000-00-00 00:00:00'");
                        print "Finished: " . $finished_processes . " - Started: " . $started_processes . "\n";

                        if ($finished_processes >= $started_processes) {
                                $rrds_processed = $rrds_processed + process_poller_output($rrdtool_pipe, TRUE);

                                log_cacti_stats($loop_start, $method, $concurrent_processes, $max_threads,

When I now start the poller manually, it outputs the following pretty early:

Finished: 0 - Started: 1
Waiting on 1 of 1 pollers.
Finished: 1 - Started: 1

Great! We know it works.

But when it starts looping the message, the output changes:

Finished:  - Started: 1
Waiting on 1 of 1 pollers.

Now $finished_processes is empty? That makes no sense. I added a var_dump() call, and sure enough:

NULL
Finished:  - Started: 1
Waiting on 1 of 1 pollers.

How can that call return NULL? When I execute the same SQL query on another client while that loop runs, it returns a valid value (usually 2).

So, what's going on? Is this a known issue with the outdated Debian package?

Post by oliversalzburg »

After looking further, the source seems to be within db_fetch_cell:

        if (($query) || ($db_conn->ErrorNo() == 1032)) {
                if (!$query->EOF) {
                        if ($col_name != '') {
                                $column = $query->fields[$col_name];
                        }else{
                                $column = $query->fields[0];
                        }

                        $query->close();

                        return($column);
                }
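        }else if (($db_conn->ErrorNo() == 1049) || ($db_conn->ErrorNo() == 1051)) {

The path it takes when the error occurs is the (non-existent) else branch of "if (!$query->EOF) {". In other words, the query object exists but reports EOF (no rows), so the function never reaches its return statement in that block, which would explain the NULL.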
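
To illustrate the fall-through, here is a minimal sketch. The function fetch_cell_sketch and its parameters are made up for this example and are not Cacti's code; it only mirrors the branch structure quoted above and shows what PHP does when no return statement is reached.

```php
<?php
// Hypothetical stand-in for the quoted branch structure; NOT Cacti's db_fetch_cell.
function fetch_cell_sketch(array $rows, bool $queryOk)
{
    if ($queryOk) {
        if (count($rows) > 0) {   // plays the role of !$query->EOF
            return $rows[0];
        }
        // no else branch: an empty result falls through
    }
    // no return statement is reached here, so PHP yields NULL
}

$finished = fetch_cell_sketch([], true);
var_dump($finished);                                  // NULL
print "Finished: " . $finished . " - Started: 1\n";   // NULL concatenates as an empty string
var_dump($finished >= 1);                             // bool(false)
```

This matches the debug output: the count prints as an empty string, and the comparison against the number of started processes is false.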

So, why does it not get any results for that query even though the query seems to work just fine when I execute it?
jmnielsen
Posts: 9
Joined: Mon Mar 19, 2012 3:09 pm

Post by jmnielsen »

See if this thread helps at all: http://forums.cacti.net/about22488.html. Most of these problems (from what I can tell from Google searches) stem from version mismatches between Cacti (the letter at the end of the version matters) and the poller. It may also matter whether you are using the standard poller or SPINE.

As suggested in that thread the Cacti debugging page may come in handy (which mentions poller debugging): http://docs.cacti.net/manual:087:4_help ... #debugging.

It is possible that something is wrong with your Debian package version, in which case you will have to install Cacti from source (not that difficult). See here for that: http://www.installationwiki.org/Install ... stallation. I hope that helps somewhat.

Post by oliversalzburg »

Well, something is definitely broken here. I tried to execute several different SQL queries against the database (at the location where I determined the failure), and they always return empty result sets. Which makes no sense at all.

Some research into the "Maximum runtime exceeded" message suggested that it is actually a timeout and could be resolved by either increasing the threshold or switching to spine.

I have only about 20 data sources, all on localhost. It is impossible that a run takes 300 seconds. In fact, I'm pretty sure the whole thing completes within a second, so this is not the issue.
Nevertheless, I still tried to increase the timeout (which did not resolve the issue) and I tried switching to spine (which also didn't resolve the issue).
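
A stripped-down model of the wait loop shows why the timeout fires no matter how fast the polling itself is. This is only a sketch: wait_for_pollers, its parameters, and the short timeout are placeholders chosen for illustration, not code from poller.php.

```php
<?php
// Simplified model of the wait loop; names and the short timeout are
// placeholders, not values taken from Cacti.
function wait_for_pollers(callable $countFinished, int $started, float $maxRuntime): string
{
    $start = microtime(true);
    while (true) {
        $finished = $countFinished();
        if ($finished >= $started) {
            return "done";
        }
        // Same idea as the MAX_POLLER_RUNTIME check: give up after a fixed time.
        if (($start + $maxRuntime) < microtime(true)) {
            return "Maximum runtime exceeded";
        }
        usleep(10000); // the real loop sleeps a full second here
    }
}

// A count query that works lets the loop exit on the first pass.
print wait_for_pollers(function () { return 1; }, 1, 0.05) . "\n";
// A count query that returns NULL can never satisfy the check,
// so only the timeout ends the loop.
print wait_for_pollers(function () { return null; }, 1, 0.05) . "\n";
```

So raising the threshold only makes the loop wait longer; it does not change the outcome as long as the count comes back as NULL.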

Debugging the poller yields no results. The mentioned debugging approaches usually revolve around manually performing the steps Cacti would take to see where it fails.
I did that and I found out where it fails. Which is the reason for this thread.

I also tried installing the latest version from source. For that I created a copy of my database, supplied it to the new installation and upgraded it. I still didn't get any data graphed. The behavior observed in the log files and the poller was slightly different, but I concluded that the differences were caused by other fixes/optimizations unrelated to my issue.

For the time being, I switched back to the original, Debian-supplied package.

Post by oliversalzburg »

I can't help but feel like there is an issue with the database abstraction (at least that's what it looks like right now).

To determine why db_fetch_cell would not return the correct number of finished processes, I extended poller.php further:

                while (1) {
                        $finished_processes = db_fetch_cell("SELECT count(*) FROM poller_time WHERE poller_id=0 AND end_time>'0000-00-00 00:00:00'");
                        print "Finished: " . $finished_processes . " - Started: " . $started_processes . "\n";

                        $mysqli = new mysqli("localhost","cacti","cacti","cacti");
                        $result = $mysqli->query("SELECT COUNT(*) FROM poller_time WHERE poller_id=0 AND end_time>'0000-00-00 00:00:00';");
                        $row = $result->fetch_assoc();
                        var_dump( $row );

                        if ($finished_processes >= $started_processes) {
                                $rrds_processed = $rrds_processed + process_poller_output($rrdtool_pipe, TRUE);

                                log_cacti_stats($loop_start, $method, $concurrent_processes, $max_threads,
                                        sizeof($polling_hosts), $hosts_per_process, $num_polling_items, $rrds_processed);

                                break;
                        }else {
                                if (read_config_option("log_verbosity") >= POLLER_VERBOSITY_MEDIUM) {
                                        print "Waiting on " . ($started_processes - $finished_processes) . " of " . $started_processes . " pollers.\n";
                                }

                                $rrds_processed = $rrds_processed + process_poller_output($rrdtool_pipe);

                                /* end the process if the runtime exceeds MAX_POLLER_RUNTIME */
                                if (($poller_start + MAX_POLLER_RUNTIME) < time()) {
                                        cacti_log("Maximum runtime of " . MAX_POLLER_RUNTIME . " seconds exceeded. Exiting.", true, "POLLER");

                                        log_cacti_stats($loop_start, $method, $concurrent_processes, $max_threads,
                                                sizeof($polling_hosts), $hosts_per_process, $num_polling_items, $rrds_processed);

                                        break;
                                }else{
                                        sleep(1);
                                }
                        }
                }

As you can see, I'm now creating my own MySQLi connection to the database and running the exact same query. This is the result:

Finished:  - Started: 1
array(1) {
  ["COUNT(*)"]=>
  string(1) "2"
}
Waiting on 1 of 1 pollers.

So, the values are in the database. Looking further...

Post by oliversalzburg »

So, I enabled logging in mysqld as it seemed logical to see what queries are actually being performed. And, to my surprise, there aren't any queries being performed when the error happens.
This is what I see at the end of my log:

                  485 Query     select value from `cacti`.`settings` where name='log_verbosity'
                  485 Query     SELECT * FROM host     WHERE (disabled = ''     AND id >= 0     AND id <= 0)     ORDER by id
                  485 Query     SELECT *     FROM poller_item     WHERE (host_id >= 0     AND host_id <= 0     AND rrd_next_step <= 0)     ORDER by host_id
                  485 Query     select value from `cacti`.`settings` where name='log_destination'
                  485 Query     select value from `cacti`.`settings` where name='path_cactilog'
                  485 Query     SELECT count(*)     FROM poller_item     WHERE (action=2     AND host_id >= 0     AND host_id <= 0     AND rrd_next_step <= 0)
                  485 Query     UPDATE poller_item     SET rrd_next_step = rrd_next_step - 60     WHERE (host_id >= 0     AND host_id <= 0)
                  485 Query     UPDATE poller_item     SET rrd_next_step = rrd_step - 60     WHERE (rrd_next_step < 0     AND host_id >= 0     AND host_id <= 0)
                  485 Query     INSERT INTO poller_time (poller_id, pid, start_time, end_time) VALUES (0, 3041, NOW(), '0000-00-00 00:00:00')
                  485 Query     UPDATE poller_time SET end_time=NOW() WHERE pid=3041
                  485 Quit
120629 20:04:52   482 Quit

While my own debugging code was still in there, I would see one connection being made and one query being sent, both from that debugging code.
Without it, Cacti is simply not querying the database at all at that point.

Post by oliversalzburg »

I have also opened a question about this issue on ServerFault: http://serverfault.com/questions/403566 ... ses/403736
The "analysis" might be better documented over there.

When I find the source of the issue, I'll post it.

Post by oliversalzburg »

I worked around the issue by disabling the database disconnect/reconnect in poller.php.
This is fine for my installation and fully resolves the issue. I still don't understand the underlying cause, although initial debugging pointed to an issue with db_connect_real: when it tried to reconnect after the first poller run, the connection ID was not valid, even though the connection attempt succeeded.

if ($poller_runs_completed < $poller_runs) {
    //db_close();
    // Debug message by myself
    echo "RECONNECTING IN " . $sleep_time . "\n";
    usleep($sleep_time * 1000000);
    //db_connect_real($database_hostname, $database_username, $database_password, $database_default, $database_type, $database_port);
}
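
An alternative to commenting out the reconnect would be to verify the new handle before using it. The following is only a sketch of that idea and has not been tried against Cacti: reconnect_with_check, $connect and $isAlive are hypothetical names, and the closures below simulate a first connection attempt that "succeeds" but returns an unusable handle.

```php
<?php
// Hypothetical helper, not part of Cacti. $connect and $isAlive are placeholders
// for whatever connect routine and liveness probe an installation would use.
function reconnect_with_check(callable $connect, callable $isAlive, int $maxAttempts = 3)
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $conn = $connect();
        // Do not trust a "successful" connect; probe the handle before returning it.
        if ($conn !== false && $conn !== null && $isAlive($conn)) {
            return $conn;
        }
    }
    return false; // every attempt produced an unusable handle
}

// Simulated usage: the first attempt returns a handle that fails the probe.
$calls = 0;
$conn = reconnect_with_check(
    function () use (&$calls) {
        $calls++;
        return ['id' => $calls, 'alive' => $calls > 1];
    },
    function ($c) {
        return $c['alive'];
    }
);
var_dump($conn['id']); // int(2)
```

In a real setup the probe could be something as small as a trivial query on the new handle, which would at least make a bad reconnect visible instead of silently returning empty results.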