RRD Files Stopped Updating for No Reason


axelilly
Posts: 27
Joined: Thu Aug 04, 2005 12:23 pm

Post by axelilly »

TheWitness wrote:Make sure the PHP maximum memory is greater than 8 MB. Then "truncate" the poller_output table.

TheWitness
I have the PHP memory limit in php.ini set to 20 MB.
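
As a minimal sketch of checking such a setting against the 8 MB default mentioned above: this assumes php.ini sizes use PHP's single-letter shorthand suffixes (K, M, G), and the function names `parse_php_bytes` and `exceeds_default` are hypothetical helpers, not part of Cacti or PHP.

```python
import re

# Multipliers for the single-letter suffixes of PHP's shorthand byte notation.
_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_php_bytes(value: str) -> int:
    """Convert a php.ini size such as '8M' or '20M' to a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([KMG]?)\s*", value, flags=re.IGNORECASE)
    if match is None:
        # Anything else (for example a two-letter suffix) is rejected here
        # rather than guessed at.
        raise ValueError(f"not a K/M/G shorthand size: {value!r}")
    number, suffix = match.groups()
    return int(number) * _UNITS[suffix.upper()]


def exceeds_default(value: str, default: str = "8M") -> bool:
    """Return True when the configured limit is strictly above the default."""
    return parse_php_bytes(value) > parse_php_bytes(default)
```

Rejecting unknown suffixes is a deliberate choice in this sketch: it is worth confirming that the value written in php.ini is in a form PHP actually understands before assuming the limit was raised.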

Also, I checked the poller_output table in the DB...it is already empty. Is this supposed to be empty?

mysql> select * from poller_output;
Empty set (0.00 sec)

mysql> truncate poller_output;
Query OK, 0 rows affected (0.00 sec)

I am also getting a lot of these entries in the log:

11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[371] WARNING: Result from SNMP not valid. Partial Result:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[372] WARNING: Result from SNMP not valid. Partial Result:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[372] WARNING: Result from SNMP not valid. Partial Result:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[373] WARNING: Result from SNMP not valid. Partial Result:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[373] WARNING: Result from SNMP not valid. Partial Result:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[374] WARNING: Result from SNMP not valid. Partial Result:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[374] WARNING: Result from SNMP not valid. Partial Result:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[375] WARNING: Result from SNMP not valid. Partial Result:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[375] WARNING: Result from SNMP not valid. Partial Result:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[376] WARNING: Result from SNMP not valid. Partial Result:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[376] WARNING: Result from SNMP not valid. Partial Result:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[377] WARNING: Result from SNMP not valid. Partial Result:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[377] WARNING: Result from SNMP not valid. Partial Result:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[378] WARNING: Result from SNMP not valid. Partial Result:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[378] WARNING: Result from SNMP not valid. Partial Result:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[379] WARNING: Result from SNMP not valid. Partial Result:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[379] WARNING: Result from SNMP not valid. Partial Result:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[380] WARNING: Result from SNMP not valid. Partial Result:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[380] WARNING: Result from SNMP not valid. Partial Result:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[381] WARNING: Result from SNMP not valid. Partial Result:


They are also showing up for many additional hosts that were all working fine before.
Last edited by axelilly on Tue Nov 01, 2005 10:01 am, edited 1 time in total.
axelilly
Posts: 27
Joined: Thu Aug 04, 2005 12:23 pm

Wrong version of RRDTool?

Post by axelilly »

Am I using the wrong version of RRDTool on Linux?

Currently using: RRDtool 1.0.50
jrichardson
Cacti User
Posts: 66
Joined: Tue Mar 22, 2005 10:11 am

Post by jrichardson »

TheWitness wrote:This is likely caused by a memory problem for most users.
What is the "recommended" amount of memory to give to PHP as far as Cacti is concerned? I've not read anything saying I should increase it from the default 8M. The polling cycle does finish; it takes about 2 minutes using the PHP poller.
TheWitness wrote:When you say every X days, how do you fix it? Please be more specific.
By "fixing it", I mean I log into the cacti mysql database and run:

Code: Select all

truncate table poller_output;
TheWitness
Developer
Posts: 17047
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

Version 1.0.50 was causing problems. Please go back a version to 1.0.49.

Post results.

Larry
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages


For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?
axelilly
Posts: 27
Joined: Thu Aug 04, 2005 12:23 pm

Post by axelilly »

Does the latest version of RRDTool work, 1.2?

Also, what version is Cacti currently bundled with?

Also, as a side note...does anyone know how to downgrade to a lower version using yum or apt?
axelilly
Posts: 27
Joined: Thu Aug 04, 2005 12:23 pm

Post by axelilly »

This is completely broken. All repositories have 1.0.50 for RRDtool. That means that all systems that use auto updates will have their Cacti broken. This is simply bad software management. Shouldn't Cacti be fixed so that it works with the current version of the main tool it uses (rrdtool)? It is amazing that the Cacti group thinks this is just fine and not a big deal. Also, there is not one mention of this major bug on the Cacti website. It really looks as though Cacti has fallen from being production ready and is now entering the realm of unreliability.
TheWitness
Developer
Posts: 17047
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

axelilly,

Just because some users report problems with RRDtool 1.0.50 does not mean that they are the fault of Cacti. Besides, that is not your problem. Looking at your post a few notches back, you were getting response errors from SNMP as follows:
11/01/2005 09:34:37 AM - CMDPHP: Poller[0] Host[7] DS[371] WARNING: Result from SNMP not valid. Partial Result:
These have nothing to do with RRDtool. Sorry about the tangent. You need to run poller.php at the MEDIUM logging level for 1 polling cycle and then post the output. Also, you should recache your data queries.

I have uploaded a new file that rebuilds all of your data queries when you run it from the command line. But before you run it, please make sure that is your problem by posting your MEDIUM output to the forum.

Place the file "poller_reindex_hosts.php" into your cacti directory and then run using "php poller_reindex_hosts.php".

TheWitness
Attachments
poller_reindex_hosts.zip
(1.11 KiB) Downloaded 436 times
eshine
Posts: 30
Joined: Fri Feb 25, 2005 11:38 am
Location: Sao Paulo - Brazil

Post by eshine »

Hi,

Just for your info...
It took some time, but I've finally created other XML scripts to collect interface-specific stats, linked them **correctly**, and all devices are being polled correctly.

I still have some problems in the "view poller cache". However, I feel it makes sense to open a new thread for that.

Thanks for your support.
rgds,
Edgar Shine
tmoore
Posts: 9
Joined: Tue Nov 08, 2005 10:32 am
Location: Raleigh, NC

Post by tmoore »

OMG.. wait a sec.. we are supposed to be using rrdtool 1.0 and not 1.2 with the latest cacti?
TheWitness
Developer
Posts: 17047
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

Both work fine.

Larry
Paul Thexton
Posts: 49
Joined: Tue Jan 18, 2005 7:50 am

Post by Paul Thexton »

axelilly wrote: and is now entering the realm of unreliability.
I wouldn't say that - we've been running cacti since around February of this year and apart from a few problems, most of which we've found ways to recover from, we've had basically no problems at all (apart from when I've had a blonde moment and been totally stupid).

The main problem I've had is the "graphs stop updating" problem, which needs a truncation of the poller_output table. The thing is, we don't like losing data in our graphs, so I wrote a little perl script to grab all of that data and the associated ds names and update the rrd file myself. It's not perfect, but it's helped us retrieve a lot of data (for example when it has stopped updating at midnight and nobody comes in to the office until 9am). I updated the memory limits in php this morning, now that I've finally found a post that explains why this happens.

The Witness: Would it be possible, or is it being considered, for cacti to supply a php script that can be run to correctly process all of the data currently stored in the poller_output table (and then truncate it) in cases such as this? I'm happy to use my own perl method but it would probably help a lot of people out should they come a cropper of this particular annoyance.
ijg0
Posts: 8
Joined: Mon Oct 31, 2005 11:27 am

Post by ijg0 »

Paul Thexton wrote:so I wrote a little perl script to grab all of that data, the associated ds names and update the rrd file myself - it's not perfect but it's helped us retrieve a lot of data (for example when it has stopped updating at midnight and nobody comes in to the office until 9am).
Would it be possible for you to share your perl script?

We had the exact same problem over the weekend. I have a backup of the data before I truncated the poller_output table and would like to put this data back into the rrd graphs.
jrichardson
Cacti User
Posts: 66
Joined: Tue Mar 22, 2005 10:11 am

Problem cropped up again

Post by jrichardson »

Hi All, recently we saw another issue where the cacti graphs were not being updated. This was a serious issue for my organization last fall. In late November, however, we upgraded to 0.8.6g and thought the problem was fixed. Last week we suffered the same type of cacti outage.

Before 0.8.6g the "fix" for our issue was to run "truncate table poller_output;" in the database. I had not had to do that again until last week, when there were about 3.7 million entries in poller_output (determined by doing a SELECT COUNT(*) FROM poller_output). I used the "truncate" method and cacti started working again.

I just checked again (one week later) and the poller_output table contains nothing.
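
A rough sketch of turning that row count into an early warning. It uses Python's built-in sqlite3 module purely as a stand-in for the MySQL connection, and both the function name `poller_output_backlog` and the 100,000-row threshold are assumptions chosen for illustration, not values from Cacti.

```python
import sqlite3


def poller_output_backlog(conn, warn_rows=100_000):
    """Return (row_count, is_warning) for the poller_output table.

    A healthy poller empties the table at the end of each cycle, so a
    count that keeps growing between checks suggests it is not finishing.
    """
    (rows,) = conn.execute("SELECT COUNT(*) FROM poller_output").fetchone()
    return rows, rows >= warn_rows


if __name__ == "__main__":
    # Stand-in table using the column names that appear in this thread.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE poller_output "
        "(local_data_id INTEGER, rrd_name TEXT, time TEXT, output TEXT)"
    )
    print(poller_output_backlog(conn))
```

Run from cron a little less often than the poller itself, a check like this would flag a growing backlog well before it reaches millions of rows.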

I'm curious:

1.) What is the poller_output table used for?
2.) Is something failing and not cleaning up the table when it is done?
3.) Was this something fixed between 0.8.6c and 0.8.6g that still might need a little work?
4.) What impact will having 3.7 million records in the poller_output table have?
5.) Are there any recommended ways of predicting this failure?


Thanks,


James T. Richardson, Jr.
raX
Lead Developer
Posts: 2243
Joined: Sat Oct 13, 2001 7:00 pm
Location: Carlisle, PA

Re: Problem cropped up again

Post by raX »

jrichardson wrote:1.) What is the poller_output table used for?
This is where the poller (cmd.php or cactid) stores its results before they are processed by poller.php, which reads from poller_output and writes to the .rrd files. This enables you to have Cacti and the poller on two separate machines. It also enables the use of multiple pollers, which is coming in the near future.
jrichardson wrote:2.) Is something failing and not cleaning up the table when it is done?
It seems that way. The last thing that poller.php does before it exits is truncate the poller_output table. I have a feeling that something is causing the poller not to finish and as a result the poller_output table is not getting truncated. After a few runs of this happening, poller.php quickly gets behind and is not able to properly update graphs.
jrichardson wrote:3.) Was this something fixed between 0.8.6c and 0.8.6g that still might need a little work?
Not directly. The behavior that I explained above has not changed since 0.8.6 to my knowledge. Perhaps something else changed that reduced the likelihood of the poller not finishing properly.
jrichardson wrote:4.) What impact will having 3.7 million records in the poller_output table have?
Well poller.php attempts to select all of the rows out of this table in one shot. Even if the MySQL server manages to return the result set, it would probably cause PHP to run out of memory when it attempts to save the result set to an array.
jrichardson wrote:5.) Are there any recommended ways of predicting this failure?
Answering these questions provided me with two things that might help. Depending on the environment, it might make sense to force poller.php to truncate the poller_output table if it exceeds the 300 second time limit. Also for extreme situations where the poller_output table does get too big, placing a LIMIT on the select that pulls data from this table would make sense.
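
The LIMIT idea can be sketched roughly as follows. This is not how poller.php is written; it is an illustration using Python's sqlite3 module as a stand-in database, and `drain_in_batches`, `handle_row` and the batch size are hypothetical names and values. The delete by `rowid` is specific to SQLite; a MySQL version would need its own way of identifying the rows just processed.

```python
import sqlite3


def drain_in_batches(conn, handle_row, batch_size=1000):
    """Process poller_output in bounded chunks instead of one huge SELECT.

    Each pass reads at most batch_size rows (oldest first), hands them to
    handle_row, then deletes exactly those rows, so memory use stays bounded
    no matter how large the backlog has grown.
    """
    processed = 0
    while True:
        batch = conn.execute(
            "SELECT rowid, local_data_id, rrd_name, time, output "
            "FROM poller_output ORDER BY time LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not batch:
            return processed
        for _rowid, *row in batch:
            handle_row(tuple(row))
        conn.executemany(
            "DELETE FROM poller_output WHERE rowid = ?",
            [(row[0],) for row in batch],
        )
        conn.commit()
        processed += len(batch)
```

Deleting only the rows that were actually handled, rather than truncating at the end, also means an interrupted run loses nothing: the next run simply picks up the remaining rows.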

If you want something to test, try throwing a "db_execute("truncate table poller_output");" statement inside of the "if" block starting on line 208 in poller.php.

-Ian
Paul Thexton
Posts: 49
Joined: Tue Jan 18, 2005 7:50 am

Post by Paul Thexton »

ijg0 wrote:
Paul Thexton wrote:so I wrote a little perl script to grab all of that data, the associated ds names and update the rrd file myself - it's not perfect but it's helped us retrieve a lot of data (for example when it has stopped updating at midnight and nobody comes in to the office until 9am).
Would it be possible for you to share your perl script?

We had the exact same problem over the weekend. I have a backup of the data before I truncated the poller_output table and would like to put this data back into the rrd graphs.
Apologies, I have not checked this forum for quite some time due to working on another project and in the main having little to no trouble with cacti.

The script is as follows - it is admittedly pretty flaky, which is why I asked TheWitness if he was willing to write something perhaps a little more intelligent... but if you still have a requirement for it, try this

Code: Select all

#!/usr/bin/perl

use DBI;

my $dbh = DBI->connect('DBI:mysql:cacti','cactiuser','cactiuser');

# Ordered list of "id#path#timestamp" records (indexed as an array below).
my @dsources = ();

if($ARGV[0] eq "")
{
  print "you must enter a mysql datestamp argument: YYYY-MM-DD HH:MM:SS\n";
  exit(0);
}

# Bind the cutoff timestamp through a placeholder instead of nesting quotes in the SQL string.
my $sql = "select distinct(po.local_data_id),dtd.data_source_path,unix_timestamp(po.time) as outime from poller_output po left join data_template_data dtd on dtd.local_data_id=po.local_data_id where po.time <= ? order by outime asc";

my $query = $dbh->prepare($sql);

$query->execute($ARGV[0]);

$count=0;

while(@record = $query->fetchrow)
{
  $r0 = $record[0];
  $r1 = $record[1];
  $r2 = $record[2];
  $r1 =~ s/<path_rra>/\/var\/www\/html\/cacti\/rra/g;
  my $record = "";
  $record = "$r0#$r1#$r2";
  $dsources[$count] = "$record";
  $count++;
}

#foreach my $source (keys %sources) {
#  print "$source - $sources[$source]\n";
#}

$count2 = 0;

while($count2 < $count)
{
  $line = $dsources[$count2];
  ($d_id,$d_path,$d_tstamp) = split ( '#', $line);
  #print "$d_id, $d_path, $d_tstamp - $line\n";

  $sql = "select rrd_name,output from poller_output where local_data_id=$d_id and unix_timestamp(time) = $d_tstamp";
  
  $template_command = "--template ";
  $update_command = "$d_tstamp";
  $rows = 0;

  my $query = $dbh->prepare($sql);
  $query->execute;
  while(@record = $query->fetchrow)
  {
    if($rows==0)
    {
      $template_command .= "$record[0]";
    }
    else
    {
      $template_command .= ":$record[0]";
    }
    $update_command .= ":$record[1]";
    $rows++;
  }

  $full_command = "/usr/local/rrdtool-1.2.11/bin/rrdtool update $d_path $template_command $update_command";

  system($full_command);
  #print "$full_command\n";

  $count2++;
}

# Delete only the rows that were replayed, again binding the cutoff timestamp.
$sql = "delete from poller_output where time <= ?";
$query = $dbh->prepare($sql);
$query->execute($ARGV[0]);
I wrote this to take a mysql timestamp as an argument (enclosed in quotes to accommodate the space it contains!) so it would only process data up to and including the specified time ... I cannot actually remember why I did this now, so you could probably get away with editing that part out and just have it select all data from poller_output.

Just one quick note - it is best to execute this before truncating poller_output, otherwise the normal cacti poller may recover and update the rrd files with a more recent timestamp - at that point rrdtool will refuse to accept the data you are feeding in as it is "in the past".
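
For anyone adapting the idea, here is a minimal sketch of the command-building step only, written in Python rather than Perl. It assumes the rows have already been fetched and joined with their rrd paths; `build_update_commands` is a hypothetical helper name. It groups values by file and timestamp, and emits the commands oldest first so that each update is newer than the one before it.

```python
from collections import defaultdict


def build_update_commands(rows, rrdtool="rrdtool"):
    """Build 'rrdtool update' argument lists from backed-up poller rows.

    rows is an iterable of (rrd_path, unix_time, ds_name, value) tuples.
    One command is produced per (timestamp, file) pair, sorted by timestamp
    so older samples are always replayed before newer ones.
    """
    grouped = defaultdict(list)
    for path, ts, ds_name, value in rows:
        grouped[(ts, path)].append((ds_name, value))

    commands = []
    for ts, path in sorted(grouped):
        names, values = zip(*grouped[(ts, path)])
        commands.append([
            rrdtool, "update", path,
            "--template", ":".join(names),
            f"{ts}:" + ":".join(str(v) for v in values),
        ])
    return commands


if __name__ == "__main__":
    sample = [
        ("/tmp/a.rrd", 200, "traffic_in", 5),
        ("/tmp/a.rrd", 100, "traffic_in", 1),
        ("/tmp/a.rrd", 100, "traffic_out", 2),
    ]
    for command in build_update_commands(sample):
        print(" ".join(command))
```

Each entry is an argument list rather than a single string, so it could be handed to something like `subprocess.run(command)` without going through a shell; that avoids quoting problems with unusual paths.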