custom indexed snmp data query not updating rrd


thebofh
Posts: 19
Joined: Mon Jan 24, 2005 7:16 pm

Post by thebofh »

The following is a quick hack to test my ideas... I applied it to my lib/poller.php and it fixed my problem. All it does is check whether $rrd_update_array already has an entry keyed on a unix_time within one second of the current unix_time. If one exists, that key is used instead.

Code: Select all

$ diff tmp/cacti-0.8.6c/lib/poller.php htdocs/lib/poller.php
198a199,205
>                       /* Prefer an existing key if we already have entries key'd off a unix_time within one sec */
>                       if ( is_array($rrd_update_array{$item["rrd_path"]}) && array_key_exists("times",$rrd_update_array{$item["rrd_path"]}) && array_key_exists($unix_time-1,$rrd_update_array{$item["rrd_path"]}["times"]) ) {
>                               $unix_time--;
>                       } elseif (is_array($rrd_update_array{$item["rrd_path"]}) && array_key_exists("times",$rrd_update_array{$item["rrd_path"]}) && array_key_exists($unix_time+1,$rrd_update_array{$item["rrd_path"]}["times"])) {
>                               $unix_time++;
>                       }
>
237a245,250
>                       /* Prefer an existing key if we already have entries key'd off a unix_time within one sec */
>                       if ( is_array($rrd_update_array{$item["rrd_path"]}) && array_key_exists("times",$rrd_update_array{$item["rrd_path"]}) && array_key_exists($unix_time-1,$rrd_update_array{$item["rrd_path"]}["times"])) {
>                               $unix_time--;
>                       } elseif ( is_array($rrd_update_array{$item["rrd_path"]}) && array_key_exists("times",$rrd_update_array{$item["rrd_path"]}) && array_key_exists($unix_time+1,$rrd_update_array{$item["rrd_path"]}["times"])) {
>                               $unix_time++;
>                       }
I should also note that this might be the cause of gaps in some people's graphs (depending on the load of the cacti/db server). I also looked at v0.8.6d and poller.php didn't change all that much. I think the reason I'm seeing this more commonly is because my system is unusually slow (ancient Sun hardware) and I'm getting those timing problems more often (in fact almost every poll in this case).

I tried changing the mysql NOW() calls in cmd.php to a single php time() call but that didn't seem to fix things.
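
For anyone who wants the idea without the PHP array plumbing, here is a rough standalone sketch in C of the same "prefer an existing key within one second" check. The function names and the flat array of timestamps are made up for illustration; they are not part of Cacti or cactid.

```c
#include <stddef.h>

/* Return 1 if `t` is among the `n` timestamps already recorded for one RRD file. */
static int has_time(const long *times, size_t n, long t)
{
    for (size_t i = 0; i < n; i++) {
        if (times[i] == t) {
            return 1;
        }
    }
    return 0;
}

/*
 * Hypothetical helper mirroring the order of checks in the patch:
 * prefer t-1 if it already exists, otherwise t+1, otherwise keep t.
 */
long snap_to_existing_time(const long *times, size_t n, long t)
{
    if (has_time(times, n, t - 1)) {
        return t - 1;
    }
    if (has_time(times, n, t + 1)) {
        return t + 1;
    }
    return t;
}
```

With existing keys 1000 and 1005, a new value stamped 1001 would be filed under 1000, one stamped 1004 under 1005, and one stamped 1002 would keep its own key.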
raX
Lead Developer
Posts: 2243
Joined: Sat Oct 13, 2001 7:00 pm
Location: Carlisle, PA

Post by raX »

This is an interesting problem... it brings back memories ;-). Soon after the release of 0.8.6 we started experiencing the exact problem that you have uncovered here. The poller would always use NOW() when inserting into the 'poller_output' table, causing possible time discrepancies in process_poller_output().

The fix for us was to stop using NOW() for each insert and instead use a set timestamp that is retrieved when the poller starts. Since any given device is guaranteed to be localized to a single poller instance, all of the times within each host in the 'poller_output' table should be the same. If they are not, then there is a problem.
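
As a rough illustration of that approach (not the actual cactid source), the sketch below formats the start time once and reuses the same string in every insert. The function names are hypothetical, the 'poller_output' column list is an assumption, and a real poller would also have to escape the string values.

```c
#include <stdio.h>
#include <time.h>

/* Format the poller start time once as a MySQL-style datetime string (UTC here). */
int format_start_datetime(time_t start, char *buf, size_t len)
{
    struct tm *tm_utc = gmtime(&start);
    if (tm_utc == NULL) {
        return -1;
    }
    return strftime(buf, len, "%Y-%m-%d %H:%M:%S", tm_utc) > 0 ? 0 : -1;
}

/*
 * Build one insert that reuses the shared start time instead of NOW().
 * Column names are assumed for illustration; values are not escaped.
 * Returns 0 on success, -1 if the query did not fit in the buffer.
 */
int build_output_insert(char *query, size_t len, const char *start_datetime,
                        int local_data_id, const char *rrd_name, const char *output)
{
    int written = snprintf(query, len,
        "insert into poller_output (local_data_id,rrd_name,time,output) "
        "values (%d,'%s','%s','%s')",
        local_data_id, rrd_name, start_datetime, output);
    return (written < 0 || (size_t)written >= len) ? -1 : 0;
}
```

Calling format_start_datetime() once at startup and passing the result to every build_output_insert() call means all rows written by one poller instance carry the same time, no matter how long the run takes.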

It looks from your posts that you are currently using cactid. The first thing that I would try is using cmd.php instead to see if it changes anything. I confirmed that the code mentioned above exists in both cmd.php and cactid, but there is something in cactid that looks odd to me. If switching to cmd.php fixes it, then we know where to look. If not, we need to figure out why there are data source items in the same host with different timestamps in the 'poller_output' table.

Thanks for all of your research into this matter!

-Ian
thebofh
Posts: 19
Joined: Mon Jan 24, 2005 7:16 pm

Post by thebofh »

raX wrote:It looks from your posts that you are currently using cactid.
Yup, I'm using cactid...
raX wrote:The fix for us was to stop using NOW() for each insert and instead use a set timestamp that is retrieved when the poller starts.
Any idea why that fix wasn't made official? Maybe there is a reason I'm overlooking. I think the only NOW() call that is necessary is the last one (where the poller's end time is recorded).
raX wrote:The first thing that I would try is using cmd.php instead to see if it changes anything.
Ahh... I forgot that making changes to cmd.php doesn't affect my cronjob which kicks off cactid. I'll remove the NOW() calls and try switching to cmd.php. Although I think things were much slower with it (thus the reason for switching to cactid in the first place). FYI, I'll be offline (woohoo!) for a few days but I'll look at this again ASAP.
raX wrote:Thanks for all of your research into this matter!
It's the least I could do for such an awesome OS project! It makes being a BOFH that much easier ;). Cacti for president!
thebofh
Posts: 19
Joined: Mon Jan 24, 2005 7:16 pm

Post by thebofh »

Here is a diff for cmd.php that seems to work even better, although cmd.php is far too slow for my purposes (cmd.php takes 220 seconds vs. 12 seconds for cactid). Basically I just replaced all the MySQL NOW() calls with a static PHP $now initialized at the beginning of the script.

Code: Select all

$ diff cmd-orig.php cmd.php
46a47,48
> $now = time();
>
207c209
<                                                       db_execute("insert into poller_command (poller_id,time,action,command) values (0,NOW()," . POLLER_COMMAND_REINDEX . ",'" . $item["host_id"] . ":" . $index_item["data_query_id"] . "')");
---
>                                                       db_execute("insert into poller_command (poller_id,time,action,command) values (0,$now," . POLLER_COMMAND_REINDEX . ",'" . $item["host_id"] . ":" . $index_item["data_query_id"] . "')");
211c213
<                                                       db_execute("insert into poller_command (poller_id,time,action,command) values (0,NOW()," . POLLER_COMMAND_REINDEX . ",'" . $item["host_id"] . ":" . $index_item["data_query_id"] . "')");
---
>                                                       db_execute("insert into poller_command (poller_id,time,action,command) values (0,$now," . POLLER_COMMAND_REINDEX . ",'" . $item["host_id"] . ":" . $index_item["data_query_id"] . "')");
215c217
<                                                       db_execute("insert into poller_command (poller_id,time,action,command) values (0,NOW()," . POLLER_COMMAND_REINDEX . ",'" . $item["host_id"] . ":" . $index_item["data_query_id"] . "')");
---
>                                                       db_execute("insert into poller_command (poller_id,time,action,command) values (0,$now," . POLLER_COMMAND_REINDEX . ",'" . $item["host_id"] . ":" . $index_item["data_query_id"] . "')");
345c347
< db_execute("insert into poller_time (poller_id, start_time, end_time) values (0, NOW(), NOW())");
---
> db_execute("insert into poller_time (poller_id, start_time, end_time) values (0, $now, NOW())");
$
Here is the patch I'm currently testing for cactid. I'll report back when I know it works.

Code: Select all

$ diff poller.c poller-patched.c
222c222
<                                       snprintf(query3, 128, "insert into poller_command (poller_id,time,action,command) values (0,NOW(),%i,'%i:%i')", POLLER_COMMAND_REINDEX, host_id, reindex->data_query_id);
---
>                                       snprintf(query3, 128, "insert into poller_command (poller_id,time,action,command) values (0,%s,%i,'%i:%i')", start_datetime, POLLER_COMMAND_REINDEX, host_id, reindex->data_query_id);
232c232
<                                       snprintf(query3, 128, "insert into poller_command (poller_id,time,action,command) values (0,NOW(),%i,'%i:%i')", POLLER_COMMAND_REINDEX, host_id, reindex->data_query_id);
---
>                                       snprintf(query3, 128, "insert into poller_command (poller_id,time,action,command) values (0,%s,%i,'%i:%i')", start_datetime, POLLER_COMMAND_REINDEX, host_id, reindex->data_query_id);
242c242
<                                       snprintf(query3, 128, "insert into poller_command (poller_id,time,action,command) values (0,NOW(),%i,'%i:%i')", POLLER_COMMAND_REINDEX, host_id, reindex->data_query_id);
---
>                                       snprintf(query3, 128, "insert into poller_command (poller_id,time,action,command) values (0,%s,%i,'%i:%i')", start_datetime, POLLER_COMMAND_REINDEX, host_id, reindex->data_query_id);
$
Attachments
cmdphpfix.png: pre 13:15 was poller.php patch (seen in a previous post) using cactid... post 13:15 was the cmd.php patch (seen above) using cmd.php... so far so good...
thebofh
Posts: 19
Joined: Mon Jan 24, 2005 7:16 pm

Post by thebofh »

raX wrote:we need to figure out why there are data source items in the same host with different timestamps in the 'poller_output' table.
I'm pretty sure that my overloaded monitoring host doesn't help. We have old Sun hardware, and the polling, web server, and DB are all on this host (along with the backup software, a web server on another port, a Big Brother server, etc.). I imagine that with so many MySQL inserts (at a time when the host is already bogged down handling snmpgets), some are likely to end up with different NOW() values.
raX
Lead Developer
Posts: 2243
Joined: Sat Oct 13, 2001 7:00 pm
Location: Carlisle, PA

Post by raX »

thebofh wrote:Any idea why that fix wasn't made official? Maybe there is a reason I'm overlooking. I think the only NOW() call that is necessary is the last one (where the poller's end time is recorded).
The fix that I was talking about was included in the 0.8.6b release. This bug occurred because the time used for inserting values into the 'poller_output' table was not being recorded.

In theory the fix that you proposed will not do anything regarding the problem that you are experiencing. The 'poller_command' table is used for things like data query re-caches and is completely separate from actual poller data. The 'poller_time' table is used to indicate to poller.php when a poller is finished. Since only one insert is generated per-poller, using NOW() is fine here.

Even if the bug is not in the poller itself, there is definitely something wrong for this to occur. You could probably rule out the poller by running cmd.php/cactid without poller.php and examining the values in 'poller_output'. This will show you all of the data that poller.php will process when it is run next.
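
One way to picture that check, as a sketch only: once the rows have been pulled out and paired with their host, count how many disagree with the first timestamp seen for the same host. The struct and function below are hypothetical and assume the host id has already been looked up for each row; nothing here is taken from the Cacti source.

```c
#include <stddef.h>

/* One row of polled data, reduced to the two fields that matter here. */
struct output_row {
    int host_id;
    long unix_time;
};

/*
 * Count rows whose timestamp differs from the first timestamp seen for the
 * same host. A result of 0 means every host has a single, consistent time.
 */
size_t count_mismatched_rows(const struct output_row *rows, size_t n)
{
    size_t mismatches = 0;
    for (size_t i = 0; i < n; i++) {
        /* Find the first row for this host; it defines the reference time. */
        for (size_t j = 0; j <= i; j++) {
            if (rows[j].host_id == rows[i].host_id) {
                if (rows[j].unix_time != rows[i].unix_time) {
                    mismatches++;
                }
                break;
            }
        }
    }
    return mismatches;
}
```

Any non-zero count after a cmd.php or cactid run (with poller.php left out) would point at the poller binary rather than at poller.php.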

-Ian
thebofh
Posts: 19
Joined: Mon Jan 24, 2005 7:16 pm

Post by thebofh »

I'm lame...:oops:

Code: Select all

$ strings cactid|grep 0.8.6
0.8.6a
$
I'll upgrade... I see the correct fix in the most recent cactid and cmd.php.