Cacti Consuming All Memory

jkukis
Posts: 29
Joined: Mon Sep 24, 2007 1:41 pm

Post by jkukis »

I just installed Cacti as a way to move away from Cricket but am having some issues. I added around 50 devices for it to graph interface traffic, CPU usage, and errors/discards. It seems to run fine at first and graphs start to get drawn, but after 10 minutes or so it consumes memory on the system to the point where we have to reboot the server. When I started to see it slow down, I commented out the poller crontab entry and checked the 'disable poller' box in the web GUI, but the cacti.log file still had these entries showing up when I did a tail -f:

Code: Select all

09/21/2007 04:01:53 PM - CMDPHP: Poller[0] Host[31] DS[2435] WARNING: Result from SNMP not valid.  Partial Result: No Such Instance cur
09/21/2007 04:02:00 PM - CMDPHP: Poller[0] Host[49] NOTICE: HOST EVENT: Host Returned from DOWN State:
09/21/2007 04:02:05 PM - CMDPHP: Poller[0] Host[49] DS[4719] WARNING: Result from SNMP not valid.  Partial Result: No Such Instance cur
09/21/2007 04:02:07 PM - CMDPHP: Poller[0] Host[49] DS[4719] WARNING: Result from SNMP not valid.  Partial Result: No Such Instance cur
09/21/2007 04:02:07 PM - CMDPHP: Poller[0] Host[49] DS[4720] WARNING: Result from SNMP not valid.  Partial Result: No Such Instance cur
09/21/2007 04:02:08 PM - CMDPHP: Poller[0] Host[49] DS[4720] WARNING: Result from SNMP not valid.  Partial Result: No Such Instance cur
09/21/2007 04:02:38 PM - CMDPHP: Poller[0] Host[23] DS[1264] WARNING: Result from SNMP not valid.  Partial Result: No Such Instance cur
Also messages like this in the cacti.log:

Code: Select all

09/20/2007 10:41:35 PM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed "select   snmp_query_graph_rrd.snmp_field_name,   data_template_rrd.id as data_template_rr
d_id   from (snmp_query_graph_rrd,data_template_rrd)   where snmp_query_graph_rrd.data_template_rrd_id=data_template_rrd.local_data_template_rrd_id   and snm
p_query_graph_rrd.snmp_query_graph_id=   and snmp_query_graph_rrd.data_template_id=38   and data_template_rrd.local_data_id=609"
09/20/2007 10:41:35 PM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed "select   snmp_query_graph_rrd.snmp_field_name,   data_template_rrd.id as data_template_rr
d_id   from (snmp_query_graph_rrd,data_template_rrd)   where snmp_query_graph_rrd.data_template_rrd_id=data_template_rrd.local_data_template_rrd_id   and snm
p_query_graph_rrd.snmp_query_graph_id=   and snmp_query_graph_rrd.data_template_id=41   and data_template_rrd.local_data_id=610"
09/20/2007 10:41:35 PM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed "select   snmp_query_graph_rrd.snmp_field_name,   data_template_rrd.id as data_template_rr
d_id   from (snmp_query_graph_rrd,data_template_rrd)   where snmp_query_graph_rrd.data_template_rrd_id=data_template_rrd.local_data_template_rrd_id   and snm
p_query_graph_rrd.snmp_query_graph_id=   and snmp_query_graph_rrd.data_template_id=41   and data_template_rrd.local_data_id=611"
It has made the system hang 3 times, requiring a reboot each time. The last time it did it, I was able to kill all the processes owned by cactiuser, which saved the system from a reboot.

Any ideas on where to start troubleshooting this? If I could get this working it would be excellent; it is much easier to use than Cricket ever was.

The server is running Fedora Core 5, and Cacti was installed using the Fedora 5 rpm. The MySQL DB is running on another server, and data is being sent to it successfully (I can see all the host data, etc.).

Thanks,
Joe
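
As a starting point for troubleshooting, the log excerpts above can be summarized to see which hosts produce the most SNMP warnings and whether any logged query compares a column against nothing (the failing queries contain `snmp_query_graph_id=` with no value after it). This is only a minimal sketch: the function name and regular expressions are hypothetical and assume the log line format shown in this post, not any documented Cacti format.

```python
import re
import sys
from collections import Counter

# Assumed format, taken from the excerpts in this post:
# "... Host[49] DS[4719] WARNING: Result from SNMP not valid. ..."
WARNING_RE = re.compile(r"Host\[(\d+)\] DS\[(\d+)\] WARNING: Result from SNMP not valid")
# A comparison with nothing on the right-hand side, e.g. "snmp_query_graph_id=   and"
EMPTY_VALUE_RE = re.compile(r"(\w+)=\s+and\b")


def summarize_log(lines):
    """Return (SNMP warnings per host id, set of columns compared against nothing)."""
    warnings_per_host = Counter()
    empty_columns = set()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            warnings_per_host[int(match.group(1))] += 1
        # Collect every column name that is followed by "=" and no value.
        empty_columns.update(EMPTY_VALUE_RE.findall(line))
    return warnings_per_host, empty_columns


if __name__ == "__main__":
    # Usage: python summarize_log.py /path/to/cacti.log
    with open(sys.argv[1], errors="replace") as handle:
        per_host, empty = summarize_log(handle)
    for host_id, count in per_host.most_common(10):
        print(f"Host[{host_id}]: {count} invalid SNMP results")
    if empty:
        print("Columns compared against an empty value:", ", ".join(sorted(empty)))
```

Hosts near the top of the list are reasonable candidates to check first, and a column reported with an empty value points at a data source whose associated graph query id is missing.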
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

Please make sure that cacti is fully patched
Reinhard
jkukis
Posts: 29
Joined: Mon Sep 24, 2007 1:41 pm

Post by jkukis »

I currently have 0.8.6j. Is there a newer version, or where can I find any patches?
Thanks,
Joe
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

jkukis
Posts: 29
Joined: Mon Sep 24, 2007 1:41 pm

Post by jkukis »

OK,
I've installed all the patches listed and decided to move to cactid as well.

Last time this happened we noticed a lot of snmpget processes. In trying to search for an answer I came across this article that describes a similar type of thing. Does this make sense?

http://www.networkjack.info/blog/2007/0 ... snmp-bugs/

I have yet to try it and am trying to come up with a game plan to debug it. I was running the discover plugin to find hosts and it was working well: it came up with around 340 hosts, and I created associated graphs for them. A few hours later the whole box went AWOL and became unresponsive.

This was the last entry in the cacti log:

Code: Select all

10/02/2007 08:14:32 PM - SYSTEM STATS: Time:561.9951 Method:cactid Processes:1 Threads:1 Hosts:381 HostsPerProcess:381 DataSources:22337 RRDsProcessed:11645
I guess another question is: should I try to increase the number of processes to handle the number of RRDs/data sources listed above? I realize that 561 seconds is above the 300-second limit, but I'm not sure yet whether that is due to the box going out of control or just cactid taking a long time.
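
To keep an eye on whether polling overruns the interval, the SYSTEM STATS line above can be parsed into its fields and the Time value compared with the 300-second limit. The sketch below is an illustration only: the function names are hypothetical, and the parsing assumes the `Key:Value` layout seen in the log line in this post.

```python
import re

POLL_INTERVAL_SECONDS = 300  # the 300-second limit discussed in this thread


def parse_system_stats(line):
    """Parse the Key:Value pairs after 'SYSTEM STATS:'; return None for other lines."""
    marker = "SYSTEM STATS:"
    if marker not in line:
        return None
    tail = line.partition(marker)[2]
    stats = {}
    for key, value in re.findall(r"(\w+):(\S+)", tail):
        try:
            # Values with a decimal point become floats, the rest integers.
            stats[key] = float(value) if "." in value else int(value)
        except ValueError:
            stats[key] = value  # non-numeric fields such as Method:cactid
    return stats


def poller_overran(stats, interval=POLL_INTERVAL_SECONDS):
    """True when a polling run took longer than the polling interval."""
    return stats["Time"] > interval


if __name__ == "__main__":
    line = ("10/02/2007 08:14:32 PM - SYSTEM STATS: Time:561.9951 Method:cactid "
            "Processes:1 Threads:1 Hosts:381 HostsPerProcess:381 "
            "DataSources:22337 RRDsProcessed:11645")
    stats = parse_system_stats(line)
    print(stats["Time"], "seconds, overran:", poller_overran(stats))
```

Running a check like this over every SYSTEM STATS entry in the log would show whether the overrun is a one-off from the box locking up or a steady pattern with the current Processes and Threads values.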
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

jkukis wrote:This was the last entry in the cacti log:

Code: Select all

10/02/2007 08:14:32 PM - SYSTEM STATS: Time:561.9951 Method:cactid Processes:1 Threads:1 Hosts:381 HostsPerProcess:381 DataSources:22337 RRDsProcessed:11645
I guess another question is: should I try to increase the number of processes to handle the number of RRDs/data sources listed above? I realize that 561 seconds is above the 300-second limit, but I'm not sure yet whether that is due to the box going out of control or just cactid taking a long time.
That's bad. Polling time should not exceed 300 seconds under any circumstances. Please use:
- 1-2 times the number of CPU cores as a starting value for "processes"
- 10-15 as a starting value for threads
Make sure to provide at least 64 MB of memory for PHP. Depending on the PHP version, 128 MB or more may be required.
Polling time largely depends on script usage. If more than 90% of your data sources are SNMP, your quantities should not create any polling problem on an average 2-CPU Xeon server with more than 2 GB of RAM.
Reinhard
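
The starting values suggested above can be worked out for a given machine as follows. This is a rough sketch of the rule of thumb from this reply, not an official sizing tool: the function and dictionary key names are hypothetical, and each result is a (low, high) range to tune from.

```python
import os


def suggested_poller_settings(cpu_cores=None):
    """Return (low, high) starting ranges based on the rule of thumb in this thread."""
    # Fall back to the detected core count, or 1 if it cannot be determined.
    cores = cpu_cores or os.cpu_count() or 1
    return {
        "processes": (cores, 2 * cores),   # 1-2 times the number of CPU cores
        "threads": (10, 15),               # starting range for threads
        "php_memory_mb": (64, 128),        # at least 64 MB; 128 MB or more on some PHP versions
    }


if __name__ == "__main__":
    for name, (low, high) in suggested_poller_settings().items():
        print(f"{name}: start between {low} and {high}")
```

For the 2-CPU server mentioned above, this gives 2 to 4 processes as a starting range, compared with the single process and single thread shown in the SYSTEM STATS line.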