rras not updating following cacti 0.8.6g update

snotblower
Posts: 10
Joined: Fri Sep 02, 2005 5:49 pm

Post by snotblower »

OS: FC3
cacti ver: 0.8.6g

Symptoms: since the update to 0.8.6g, some RRAs are not being updated, even though the logs clearly show an update being issued for them. This does not affect *all* RRAs, just some of them. To make it even more confusing, the ones in question sometimes *do* update.

I have already added the official patches.

For example, let me pick one RRA.

You can see that it has not been updated in a while: it is now 13:45, and the file was last modified at 09:36.

sh-3.00$ date
Tue Sep 27 13:45:41 CDT 2005
sh-3.00$ ls -al sprint_pleasanton_avg_1116.rrd
-rw-r--r-- 1 cacti cacti 188308 Sep 27 09:36 sprint_pleasanton_avg_1116.rrd

However, if you look in the logs it shows it was updated at 13:45:

09/27/2005 01:45:44 PM - POLLER: Poller[0] CACTI2RRD: /usr/bin/rrdtool update /data1/www/cacti/rra/sprint_pleasanton_avg_1116.rrd --template min:max:avg:loss 1127846703:52.253:66.340:59.383:0


...but if you look at the actual RRA, it was last updated at 09:35 (localtime of 1127831711 is 2005/09/27 09:35:11.000).

sh-3.00$ rrdtool info /data1/www/cacti/rra/sprint_pleasanton_avg_1116.rrd | more
filename = "/data1/www/cacti/rra/sprint_pleasanton_avg_1116.rrd"
rrd_version = "0001"
step = 300
last_update = 1127831711
ds[avg].type = "GAUGE"
ds[avg].minimal_heartbeat = 600
ds[avg].min = 0.0000000000e+00
ds[avg].max = 2.0000000000e+03
ds[avg].last_ds = "UNKN"
ds[avg].value = 5.7258300000e+02
ds[avg].unknown_sec = 0
<truncated output for brevity>

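To put a number on the discrepancy, here is a minimal Python sketch that compares the timestamp in the CACTI2RRD log line with the `last_update` field from `rrdtool info`. The log line and field values are copied from this post; the helper names `parse_logged_update` and `parse_info` are made up for illustration and are not part of Cacti or RRDtool.

```python
import re

# Log line and `rrdtool info` fields copied from the post above.
LOG_LINE = (
    "09/27/2005 01:45:44 PM - POLLER: Poller[0] CACTI2RRD: /usr/bin/rrdtool update "
    "/data1/www/cacti/rra/sprint_pleasanton_avg_1116.rrd "
    "--template min:max:avg:loss 1127846703:52.253:66.340:59.383:0"
)
INFO_OUTPUT = """\
rrd_version = "0001"
step = 300
last_update = 1127831711
ds[avg].minimal_heartbeat = 600
"""


def parse_logged_update(line):
    """Return (rrd_path, {ds_name: value}, timestamp) from a CACTI2RRD log line."""
    m = re.search(r"rrdtool update (\S+) --template (\S+) (\d+):(\S+)", line)
    if m is None:
        raise ValueError("not a CACTI2RRD update line")
    path, template, ts, values = m.groups()
    return path, dict(zip(template.split(":"), values.split(":"))), int(ts)


def parse_info(text):
    """Collect the integer-valued `key = value` fields from `rrdtool info` output."""
    return {k: int(v) for k, v in re.findall(r"^(\S+) = (\d+)$", text, re.M)}


path, values, logged_ts = parse_logged_update(LOG_LINE)
info = parse_info(INFO_OUTPUT)

lag = logged_ts - info["last_update"]
print("file:", path)
print("values the poller tried to store:", values)
print("seconds between last_update and the logged update:", lag)
print("whole steps missed:", lag // info["step"])
print("lag exceeds heartbeat:", lag > info["ds[avg].minimal_heartbeat"])
```

A lag far beyond the 600-second heartbeat means the logged `rrdtool update` never reached the file, which matches what `ls` shows.
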
I have noticed that if I run the poller by hand, I get the following on stderr:
snmp error Bad file descriptor
snmp error Bad file descriptor
snmp error Bad file descriptor

This makes me wonder if I am running out of file descriptors. Is it possible that something in 0.8.6g is not closing its files?
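
A rough way to test the descriptor theory is to compare the number of open descriptors with the per-process limit. The sketch below does this for the current process. The helper `count_open_fds` is a made-up name, and reading `/proc/<pid>/fd` for another process is an assumption that only holds on Linux with sufficient permissions. Note that descriptor exhaustion normally surfaces as EMFILE ("Too many open files"), whereas "Bad file descriptor" is EBADF, so this check may well come back clean.

```python
import errno
import os
import resource


def count_open_fds(pid=None):
    """Count open descriptors: this process via /dev/fd, another via /proc/<pid>/fd."""
    path = "/dev/fd" if pid is None else f"/proc/{pid}/fd"
    return len(os.listdir(path))


soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open descriptors:", count_open_fds())
print("soft limit:", soft_limit, "hard limit:", hard_limit)

# The two errno values that are easy to confuse here:
print("EBADF :", os.strerror(errno.EBADF))   # use of a closed or invalid descriptor
print("EMFILE:", os.strerror(errno.EMFILE))  # the per-process limit has been reached
```

To inspect a running poller instead, pass its PID, for example `count_open_fds(1234)` with a hypothetical PID.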

The poller doesn't seem to notice that the RRAs are not getting updated. The logs show normal completion:

09/27/2005 01:48:30 PM - SYSTEM STATS: Time:208.3005 Method:cmd.php Processes:255 Threads:N/A Hosts:169 HostsPerProcess:7 DataSources:2522 RRDsProcessed:1491
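
For tracking these numbers over time, the SYSTEM STATS line can be split into fields with a few lines of Python. This is only a sketch: `parse_system_stats` is a made-up helper, and the 300-second cycle is an assumption based on the `step = 300` shown by `rrdtool info`. The gap between DataSources and RRDsProcessed is not necessarily an error, since one RRD file can hold several data sources (the file above has min, max, avg and loss).

```python
import re

# Stats line copied from the post above, joined into a single line.
STATS_LINE = (
    "09/27/2005 01:48:30 PM - SYSTEM STATS: Time:208.3005 Method:cmd.php "
    "Processes:255 Threads:N/A Hosts:169 HostsPerProcess:7 "
    "DataSources:2522 RRDsProcessed:1491"
)

POLL_CYCLE_S = 300  # assumption: the poller is started once per 300-second step


def parse_system_stats(line):
    """Split the part after 'SYSTEM STATS:' into a {name: value} dict of strings."""
    _, _, fields = line.partition("SYSTEM STATS:")
    return dict(re.findall(r"(\w+):(\S+)", fields))


stats = parse_system_stats(STATS_LINE)
runtime_fraction = float(stats["Time"]) / POLL_CYCLE_S

print(stats)
print(f"poller run used {runtime_fraction:.0%} of the polling cycle")
```

A run that uses most of the cycle leaves little headroom before the next poller start, which is worth watching when individual scripts take a minute or more.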
rony
Developer/Forum Admin
Posts: 6022
Joined: Mon Nov 17, 2003 6:35 pm
Location: Michigan, USA

Post by rony »

Have you applied all patches from the website?

Yes, there are patches already... :(
Tony Roman
Experience is what causes a person to make new mistakes instead of old ones.
There are only 3 ways to complete a project: Good, Fast or Cheap, pick two.
With age comes wisdom, what you choose to do with it determines whether or not you are wise.

Post by snotblower »

I have already added the official patches.

Post by rony »

Which ones typically fail to update: disk, interface, etc.? What type of information are they gathering?

Are the hosts in question on slower connections? Would increasing the timeout help resolve these issues?

Post by snotblower »

I think it was the timeout issue (I found it on another thread before I saw this reply).

Is it me, or does it seem like the timeout in this release applies not to a single datasource/script but to the entire concurrent poller process?

The problem didn't seem to be limited to one particular type of DS (SNMP, script, etc.).

I do have a handful of my own hacks that take upwards of a minute to run. It seems that if other DSs get 'behind' them and the long-running poller process gets squashed, then the things behind it get squashed too.
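
One way to keep a slow script from dragging down whatever is queued behind it is to give each script its own time budget. The following Python sketch shows the idea with `subprocess.run` and a per-command timeout. It is not how Cacti's poller works internally, and `run_with_timeout` is a made-up helper name.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s):
    """Run a command; return its stripped stdout, or None if it exceeds timeout_s."""
    try:
        # On timeout, subprocess.run kills the child and waits for it,
        # so no defunct process is left behind.
        done = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None
    return done.stdout.strip()


# Stand-ins for data-input scripts: one answers at once, one hangs.
fast = run_with_timeout([sys.executable, "-c", "print(59.383)"], timeout_s=10)
slow = run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], timeout_s=0.5)

print("fast script returned:", fast)
print("slow script returned:", slow)
```

With a budget per script, only the slow script's value is lost for that cycle, and the scripts after it still get their turn.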

It is also odd that the logs show it being updated when it wasn't...

Now if I can just figure out which one of my scripts is turning into a zombie, I think I will be good to go.
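
For tracking down the zombie, one option is to filter `ps -eo pid,ppid,stat,comm` output for processes whose state starts with `Z`. The PPID column then points at the parent that is not reaping them. Below is a minimal Python sketch; `find_zombies` is a made-up helper, and the sample output (PIDs and the script name `my_script.pl`) is hypothetical, not taken from the system in this thread.

```python
# Hypothetical sample of `ps -eo pid,ppid,stat,comm` output, for illustration only.
SAMPLE_PS_OUTPUT = """\
  PID  PPID STAT COMMAND
 4101     1 Ss   crond
 4210  4101 S    php
 4215  4210 Z    my_script.pl <defunct>
"""


def find_zombies(ps_output):
    """Return (pid, ppid, command) for every process in state Z (zombie)."""
    zombies = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        pid, ppid, stat, comm = line.split(None, 3)
        if stat.startswith("Z"):
            zombies.append((int(pid), int(ppid), comm))
    return zombies


for pid, ppid, comm in find_zombies(SAMPLE_PS_OUTPUT):
    print(f"zombie pid={pid} parent={ppid} command={comm}")
```

On a live system, the same function can be fed the real output, for example from `subprocess.run(["ps", "-eo", "pid,ppid,stat,comm"], capture_output=True, text=True).stdout`.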