Cacti stops collecting at random hosts - SOLVED
-
- Posts: 14
- Joined: Fri Jun 23, 2006 10:46 am
First of all, I'm sorry for my poor English.
I have Cacti up and running, monitoring about 950 hosts, all of them routers and switches. The server is a dual Xeon 3.20GHz with 2GB RAM, running:
Debian Sarge
Cacti 0.8.6g
Cactid 0.8.6f-1
MySQL: Ver 12.22 Distrib 4.0.24, for pc-linux-gnu (i386)
PHP 4.3.10-16
RRDtool 1.0.49
The problem is that sometimes the polling process finishes in good time, about 55 seconds, but at other times it simply stops collecting data and times out, with no errors logged.
This is an excerpt from my log file when it happens:
06/23/2006 12:20:42 PM - CACTID: Poller[0] Host[895] DS[3917] SNMP: v1: 10.66.255.6, dsname: discards_in, oid: .1.3.6.1.2.1.2.2.1.13.2, value: 0
06/23/2006 12:20:42 PM - CACTID: Poller[0] Host[895] DS[3917] SNMP: v1: 10.66.255.6, dsname: errors_in, oid: .1.3.6.1.2.1.2.2.1.14.2, value: 1195300
06/23/2006 12:20:42 PM - CACTID: Poller[0] Host[895] DS[3916] SNMP: v1: 10.66.255.6, dsname: traffic_out, oid: .1.3.6.1.2.1.2.2.1.16.2, value: 444330804
06/23/2006 12:20:42 PM - CACTID: Poller[0] Host[895] DS[3917] SNMP: v1: 10.66.255.6, dsname: discards_out, oid: .1.3.6.1.2.1.2.2.1.19.2, value: 0
06/23/2006 12:20:42 PM - CACTID: Poller[0] Host[895] DS[3917] SNMP: v1: 10.66.255.6, dsname: errors_out, oid: .1.3.6.1.2.1.2.2.1.20.2, value: 0
06/23/2006 12:24:58 PM - POLLER: Poller[0] Maximum runtime of 296 seconds exceeded. Exiting.
Every host that would have been polled after 10.66.255.6 has a gap in its graph.
And the last host polled (in this case 10.66.255.6) is a random host, not always the same one.
Does anybody have an idea what is happening here?
Last edited by santiagosoares on Wed Jul 12, 2006 8:26 am, edited 1 time in total.
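To see where a polling cycle stalls, it can help to measure the silence between consecutive log entries rather than reading the log by eye. The following is only a minimal sketch: it assumes the timestamp layout seen in the excerpt above (`MM/DD/YYYY HH:MM:SS AM/PM` followed by ` - `), and the helper names `parse_stamp` and `largest_gap` are made up for illustration, not part of Cacti.

```python
from datetime import datetime

# Timestamp layout assumed from the quoted log lines, e.g. "06/23/2006 12:20:42 PM".
STAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def parse_stamp(line):
    """Return the datetime at the start of a log line, or None if it has none."""
    stamp = line.split(" - ", 1)[0].strip()
    try:
        return datetime.strptime(stamp, STAMP_FORMAT)
    except ValueError:
        return None


def largest_gap(lines):
    """Return (seconds, line_before, line_after) for the longest silent period."""
    best = (0.0, None, None)
    prev_time, prev_line = None, None
    for line in lines:
        when = parse_stamp(line)
        if when is None:
            continue  # skip lines without a leading timestamp
        if prev_time is not None:
            gap = (when - prev_time).total_seconds()
            if gap > best[0]:
                best = (gap, prev_line, line)
        prev_time, prev_line = when, line
    return best


# Two lines copied from the log excerpt in the post.
sample = [
    "06/23/2006 12:20:42 PM - CACTID: Poller[0] Host[895] DS[3917] SNMP: v1: 10.66.255.6, "
    "dsname: errors_out, oid: .1.3.6.1.2.1.2.2.1.20.2, value: 0",
    "06/23/2006 12:24:58 PM - POLLER: Poller[0] Maximum runtime of 296 seconds exceeded. Exiting.",
]
seconds, before, after = largest_gap(sample)
print(f"{seconds:.0f} s of silence before: {after}")
```

Run against the full log, the line returned as `before` is the last data source collected before the stall, which makes it easier to check whether the stall really lands on a different host each time.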
- rony
- Developer/Forum Admin
- Posts: 6022
- Joined: Mon Nov 17, 2003 6:35 pm
- Location: Michigan, USA
- Contact:
I have been seeing this.
What version of net-snmp do you have installed? Are you using Cactid or cmd.php?
Tony Roman
Experience is what causes a person to make new mistakes instead of old ones.
There are only 3 ways to complete a project: Good, Fast or Cheap, pick two.
With age comes wisdom, what you choose to do with it determines whether or not you are wise.
-
- Posts: 14
- Joined: Fri Jun 23, 2006 10:46 am
- rony
- Developer/Forum Admin
- Posts: 6022
- Joined: Mon Nov 17, 2003 6:35 pm
- Location: Michigan, USA
- Contact:
No clue yet... During the dropouts, does your poller get the 296 second timeout error?
-
- Posts: 14
- Joined: Fri Jun 23, 2006 10:46 am
Sorry, I didn't understand your question.
But I do have gaps in the graphs when the poller times out.
The strange thing is that the poller runs almost to the end, then appears to freeze for almost 5 minutes, and then times out.
Look:
06/23/2006 12:20:42 PM - CACTID: Poller[0] Host[895] DS[3917] SNMP: v1: 10.66.255.6, dsname: errors_out, oid: .1.3.6.1.2.1.2.2.1.20.2, value: 0
06/23/2006 12:24:58 PM - POLLER: Poller[0] Maximum runtime of 296 seconds exceeded. Exiting.
And sometimes it finishes the poll in about 55 seconds.
-
- Posts: 14
- Joined: Fri Jun 23, 2006 10:46 am
Running the poller manually, I got this output:
OK u:0.12 s:0.20 r:41.46
OK u:0.12 s:0.20 r:41.46
OK u:0.12 s:0.20 r:41.46
OK u:0.12 s:0.20 r:41.46
OK u:0.12 s:0.20 r:41.46
Waiting on 1/1 pollers.
Waiting on 1/1 pollers.
Waiting on 1/1 pollers.
.
.
.
Waiting on 1/1 pollers.
Waiting on 1/1 pollers.
Waiting on 1/1 pollers.
06/26/2006 02:10:42 PM - POLLER: Poller[0] Maximum runtime of 296 seconds exceeded. Exiting.
Any idea???
- rony
- Developer/Forum Admin
- Posts: 6022
- Joined: Mon Nov 17, 2003 6:35 pm
- Location: Michigan, USA
- Contact:
What is the glib version on the box?
I am currently working on a box with similar issues, but it is running CentOS 4.3. Everything at this point is telling me it's a MySQL issue.
-
- Posts: 14
- Joined: Fri Jun 23, 2006 10:46 am
- rony
- Developer/Forum Admin
- Posts: 6022
- Joined: Mon Nov 17, 2003 6:35 pm
- Location: Michigan, USA
- Contact:
What version of GCC?
Hi all, I have the same problem: Cacti stops collecting data and graphing for some random hosts.
I've installed it on this Gentoo Linux box:
Portage 2.1-r1 (default-linux/x86/2006.0, gcc-3.4.5, glibc-2.3.6-r3, 2.6.16-gentoo-r7 i686)
=================================================================
System uname: 2.6.16-gentoo-r7 i686 Intel(R) Celeron(R) CPU 2.80GHz
Gentoo Base System version 1.6.14
dev-lang/python: 2.4.3-r1
dev-python/pycrypto: 2.0.1-r5
dev-util/ccache: [Not Present]
dev-util/confcache: [Not Present]
sys-apps/sandbox: 1.2.17
sys-devel/autoconf: 2.13, 2.59-r7
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r1
sys-devel/binutils: 2.16.1
sys-devel/gcc-config: 1.3.13-r2
sys-devel/libtool: 1.5.22
virtual/os-headers: 2.6.11-r2
and Cacti's version is:
net-analyzer/cacti
Latest version available: 0.8.6h_p20060108-r2
Latest version installed: 0.8.6h_p20060108-r2
Size of files: 1,080 kB
Homepage: http://www.cacti.net/
Description: Cacti is a complete frontend to rrdtool
License: GPL-2
How can I fix this?
Thanks all
-
- Posts: 14
- Joined: Fri Jun 23, 2006 10:46 am
I've found something that I think may be interesting.
When I execute
# ps aux
during the polling, I get these lines about Cacti:
cacti 23846 0.0 0.0 2768 1216 ? Ss 17:40 0:00 /bin/sh -c php /usr/share/cacti/poller.php > /dev/null 2>&1
cacti 23847 4.9 0.4 17740 9220 ? S 17:40 0:00 php /usr/share/cacti/poller.php
cacti 23850 1.3 0.3 53404 7844 ? S 17:40 0:00 /usr/share/cactid/cactid 0 1346
cacti 23851 0.9 0.0 4208 1096 ? S 17:40 0:00 /usr/bin/rrdtool -
At the exact moment the poller stops collecting, I get this:
cacti 23846 0.0 0.0 2768 1216 ? Ss 17:40 0:00 /bin/sh -c php /usr/share/cacti/poller.php > /dev/null 2>&1
cacti 23847 1.5 0.4 17740 9220 ? S 17:40 0:01 php /usr/share/cacti/poller.php
cacti 23851 0.2 0.0 4208 1096 ? S 17:40 0:00 /usr/bin/rrdtool -
No cactid process!!!
Is this normal, or is it a clue that could help solve my problem?
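The check done by hand here (poller.php still running, cactid gone) could be automated so the moment of failure gets recorded. This is a rough sketch only: it assumes the standard 11-column `ps aux` layout shown in the post, and the function name `cactid_missing` is invented for the example.

```python
import os


def cactid_missing(ps_output):
    """True when poller.php is listed but no cactid process is."""
    poller_running = False
    cactid_running = False
    for line in ps_output.splitlines():
        # ps aux prints 10 fixed columns; everything after them is the command.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        command = fields[10]
        if "poller.php" in command:
            poller_running = True
        # Compare the executable's basename so /usr/share/cacti/... does not match.
        if os.path.basename(command.split()[0]) == "cactid":
            cactid_running = True
    return poller_running and not cactid_running


# Both samples are copied from the ps listings in the post.
healthy = """\
cacti 23847 4.9 0.4 17740 9220 ? S 17:40 0:00 php /usr/share/cacti/poller.php
cacti 23850 1.3 0.3 53404 7844 ? S 17:40 0:00 /usr/share/cactid/cactid 0 1346
cacti 23851 0.9 0.0 4208 1096 ? S 17:40 0:00 /usr/bin/rrdtool -
"""
stalled = """\
cacti 23847 1.5 0.4 17740 9220 ? S 17:40 0:01 php /usr/share/cacti/poller.php
cacti 23851 0.2 0.0 4208 1096 ? S 17:40 0:00 /usr/bin/rrdtool -
"""
print("healthy run, cactid missing:", cactid_missing(healthy))
print("stalled run, cactid missing:", cactid_missing(stalled))
```

On a live system the input could come from `subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout`, sampled every few seconds during a polling cycle. Note that cactid also exits normally once it has finished its hosts, so a hit only suggests a crash when the log shows hosts still left to poll.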
- rony
- Developer/Forum Admin
- Posts: 6022
- Joined: Mon Nov 17, 2003 6:35 pm
- Location: Michigan, USA
- Contact:
Cactid is dying... Do you have any "core" files lying around in the polling user's home directory? Or in the cactid directory?
Also, how many hosts do you have? If you have more than 30, I would suggest at least 2 processes in the poller settings.
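Looking for core files in both places can be scripted. The sketch below makes several assumptions that are not confirmed anywhere in this thread: that the polling user is named `cacti` (as the `ps` listing suggests), that cactid lives in `/usr/share/cactid`, and that core files are named `core` or `core.<pid>`. The helper names are made up for illustration.

```python
import os
import resource


def find_core_files(directories):
    """Return files named "core" or "core.<something>" directly inside each directory."""
    found = []
    for directory in directories:
        try:
            names = os.listdir(directory)
        except OSError:
            continue  # directory missing or unreadable: nothing to report
        for name in sorted(names):
            path = os.path.join(directory, name)
            if (name == "core" or name.startswith("core.")) and os.path.isfile(path):
                found.append(path)
    return found


def core_dumps_enabled():
    """True when the current process may write core files (soft limit is not 0)."""
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_CORE)
    return soft_limit != 0


# Assumed locations: the home directory of a user named "cacti" and the
# cactid directory seen in the ps listing. Adjust both for the actual system.
search_dirs = [os.path.expanduser("~cacti"), "/usr/share/cactid"]
for path in find_core_files(search_dirs):
    print("possible core file:", path)
if not core_dumps_enabled():
    print("core dumps are disabled for this process (core size limit is 0)")
```

If nothing turns up, that does not prove cactid exited cleanly: with a core size limit of 0 no core file is written at all, and the limit that matters is the one in effect for the cron job that starts the poller, which may differ from the shell this script runs in.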