Cacti seems to have lost contact with nodes....

darthravenell
Posts: 7
Joined: Wed Mar 14, 2012 12:57 pm

Cacti seems to have lost contact with nodes....

Post by darthravenell »

Hello,

We have been running Cacti for over a year now with only small issues. We have several hundred nodes configured for monitoring, and we have not finished adding all the nodes we want Cacti to monitor. Seemingly all of a sudden, Cacti lost contact with our nodes. It now thinks they are all hard down and is flooding our mailboxes with literally thousands of emails. We had to turn off the machine Cacti runs on before it caused more issues on our network. We then tried to get into Cacti with that machine disconnected from the network, to see if we could turn off the emailing function, but we have been unsuccessful from the command line. We are running Cacti on a virtual Unix box, if that helps.

Is there a way to turn this functionality off from the command line, or a way to reach the Cacti GUI while the machine is offline?

Thanks
Linegod
Developer
Posts: 1626
Joined: Thu Feb 20, 2003 10:16 am
Location: Canada

Re: Cacti seems to have lost contact with nodes....

Post by Linegod »

Just disable the cronjob that is running the poller, then try to figure out what is going on.
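
For anyone following along, here is a rough sketch of what "disable the cronjob" can amount to. It assumes the poller is started by a crontab line that mentions poller.php, which is the usual Cacti setup but may differ on your box. The sample entry, the cactiuser account, and the function name are made up for illustration; only the /var/www/html path comes from this thread.

```python
def disable_poller_lines(crontab_text, needle="poller.php"):
    """Return crontab text with every active line containing `needle` commented out."""
    out = []
    for line in crontab_text.splitlines():
        stripped = line.lstrip()
        # Only touch non-empty lines that are not already comments.
        if needle in line and stripped and not stripped.startswith("#"):
            out.append("#" + line)
        else:
            out.append(line)
    # Preserve the trailing newline if the input had one.
    return "\n".join(out) + ("\n" if crontab_text.endswith("\n") else "")


# Hypothetical crontab entry; the real path and user depend on the install.
sample = (
    "MAILTO=root\n"
    "*/5 * * * * cactiuser php /var/www/html/poller.php > /dev/null 2>&1\n"
)
print(disable_poller_lines(sample), end="")
```

The function only rewrites text. Where the entry actually lives (a user crontab, /etc/crontab, or a file under /etc/cron.d) depends on how Cacti was installed, so check there before editing anything.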
--
Live fast, die young
You're sucking up my bandwidth.

J.P. Pasnak,CD
CCNA, LPIC-1
http://www.warpedsystems.sk.ca
darthravenell
Posts: 7
Joined: Wed Mar 14, 2012 12:57 pm

Re: Cacti seems to have lost contact with nodes....

Post by darthravenell »

We figured out that Cacti was using the email client on the machine it was installed on. We disabled the mail client and reconnected the machine to the network. This allowed me to get into the GUI and disable the notifications.

Now I have discovered a weird issue. I can go to Devices and verify that Cacti is polling the routers via Ping and SNMP, and (at least there) it reports a good connection.

However, once I click over to the Monitor tab at the top, all my routers show Down. Also, none of the graphs are displaying any data. Finally, when I go to New Graphs and try to create a new graph, where it would normally list all of the device's interfaces, all that appears is:

Warning: Invalid argument supplied for foreach() in /var/www/html/graphs_new.php on line 809

(the same warning is repeated nine times)

Again, this happened all of a sudden. It had been working fine for some time. If troubleshooting this issue requires more information (e.g. a debug capture), please let me know. I know how frustrating it is working with n00bs, but the company I work for is pretty large, we have been pretty satisfied with Cacti so far, and we are big fans of the tool. Any assistance is greatly appreciated!
darthravenell
Posts: 7
Joined: Wed Mar 14, 2012 12:57 pm

Re: Cacti seems to have lost contact with nodes....

Post by darthravenell »

Here is a snippet of the log file, showing the entries displayed in red:

08/20/2012 09:22:29 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
08/20/2012 09:22:28 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
08/20/2012 09:22:27 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
08/20/2012 09:22:26 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
08/20/2012 09:22:25 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
08/20/2012 09:22:24 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
08/20/2012 09:22:23 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
08/20/2012 09:22:22 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
08/20/2012 09:22:21 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"

08/20/2012 09:22:19 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
08/20/2012 09:22:19 AM - SPINE: Poller[0] FATAL: MySQL Error:'145', Message:'Table './cacti/poller_item' is marked as crashed and should be repaired' (Spine parent)
08/20/2012 09:22:18 AM - SPINE: Poller[0] FATAL: MySQL Error:'145', Message:'Table './cacti/poller_item' is marked as crashed and should be repaired' (Spine parent)
08/20/2012 09:22:18 AM - SPINE: Poller[0] FATAL: MySQL Error:'145', Message:'Table './cacti/poller_item' is marked as crashed and should be repaired' (Spine parent)
08/20/2012 09:22:18 AM - SPINE: Poller[0] FATAL: MySQL Error:'145', Message:'Table './cacti/poller_item' is marked as crashed and should be repaired' (Spine parent)
08/20/2012 09:22:18 AM - SPINE: Poller[0] FATAL: MySQL Error:'145', Message:'Table './cacti/poller_item' is marked as crashed and should be repaired' (Spine parent)
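
With this many repeated lines it is easy to miss whether more than one table is affected. Below is a small sketch that pulls the crashed table names out of log lines like the ones above and prints a matching REPAIR TABLE statement for each. It assumes the tables are MyISAM (MySQL error 145 is the MyISAM "marked as crashed" error) and that the log text has exactly the format shown here. The function names are made up for illustration.

```python
import re

# Matches the MySQL message seen in the log above, e.g.
#   Table './cacti/poller_item' is marked as crashed and should be repaired
CRASHED = re.compile(
    r"Table '\./(?P<db>[^/']+)/(?P<table>[^']+)' is marked as crashed"
)


def crashed_tables(log_lines):
    """Return sorted, unique (database, table) pairs reported as crashed."""
    found = set()
    for line in log_lines:
        match = CRASHED.search(line)
        if match:
            found.add((match.group("db"), match.group("table")))
    return sorted(found)


def repair_statements(pairs):
    """Build one REPAIR TABLE statement per crashed table (MyISAM assumed)."""
    return ["REPAIR TABLE `{}`.`{}`;".format(db, table) for db, table in pairs]


# Two of the SPINE lines from the log above.
log = [
    "08/20/2012 09:22:19 AM - SPINE: Poller[0] FATAL: MySQL Error:'145', "
    "Message:'Table './cacti/poller_item' is marked as crashed and should "
    "be repaired' (Spine parent)",
    "08/20/2012 09:22:18 AM - SPINE: Poller[0] FATAL: MySQL Error:'145', "
    "Message:'Table './cacti/poller_item' is marked as crashed and should "
    "be repaired' (Spine parent)",
]
for statement in repair_statements(crashed_tables(log)):
    print(statement)
```

For the lines above this prints a single statement for cacti.poller_item. The script only generates the SQL text and does not connect to the database, so you can review the statements before running them in the mysql client yourself.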
darthravenell
Posts: 7
Joined: Wed Mar 14, 2012 12:57 pm

Re: Cacti seems to have lost contact with nodes....

Post by darthravenell »

One more thing: here is the debug output from Graph Management. The same thing is displayed for every device:

RRDTool Command:

/usr/bin/rrdtool graph - \
--imgformat=PNG \
--start=-86400 \
--end=-60 \
--title="rtblmnaar01::Bloomington, Illinois - Traffic - Mu1" \
--rigid \
--base=1000 \
--height=120 \
--width=500 \
--alt-autoscale-max \
--lower-limit=0 \
--vertical-label="bits per second" \
--slope-mode \
--font TITLE:10: \
--font AXIS:8: \
--font LEGEND:8: \
--font UNIT:8: \
DEF:a="/var/www/html/rra/rtblmnaar01bloomington_illinois__traffic_in_1365.rrd":traffic_in:AVERAGE \
DEF:b="/var/www/html/rra/rtblmnaar01bloomington_illinois__traffic_in_1365.rrd":traffic_out:AVERAGE \
CDEF:cdefa=a,8,* \
CDEF:cdefe=b,8,* \
AREA:cdefa#00CF00FF:"Inbound" \
GPRINT:cdefa:LAST:" Current\:%8.2lf %s" \
GPRINT:cdefa:AVERAGE:"Average\:%8.2lf %s" \
GPRINT:cdefa:MAX:"Maximum\:%8.2lf %s\n" \
LINE1:cdefe#002A97FF:"Outbound" \
GPRINT:cdefe:LAST:"Current\:%8.2lf %s" \
GPRINT:cdefe:AVERAGE:"Average\:%8.2lf %s" \
GPRINT:cdefe:MAX:"Maximum\:%8.2lf %s"

RRDTool Says:

ERROR: opening '/var/www/html/rra/rtblmnaar01bloomington_illinois__traffic_in_1365.rrd': No such file or directory
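
Since the same error shows up for every device, it may help to check in bulk which .rrd files are really missing. Here is a minimal sketch that extracts the file paths from the DEF arguments of a command like the one above and reports which of them do not exist on disk. The function names are made up for illustration, and it has to run on the Cacti machine itself to mean anything.

```python
import os
import re

# DEF:<vname>="<path>":<ds-name>:<CF>, as in the rrdtool command above.
DEF_PATH = re.compile(r'DEF:\w+="([^"]+)"')


def rrd_paths(command):
    """Return the unique .rrd paths referenced by DEF arguments, in order."""
    seen = []
    for path in DEF_PATH.findall(command):
        if path not in seen:
            seen.append(path)
    return seen


def missing_files(paths):
    """Return the paths that do not exist as regular files on this machine."""
    return [p for p in paths if not os.path.isfile(p)]


# The two DEF arguments from the command above.
cmd = (
    'DEF:a="/var/www/html/rra/rtblmnaar01bloomington_illinois__traffic_in_1365.rrd"'
    ':traffic_in:AVERAGE '
    'DEF:b="/var/www/html/rra/rtblmnaar01bloomington_illinois__traffic_in_1365.rrd"'
    ':traffic_out:AVERAGE'
)
for path in missing_files(rrd_paths(cmd)):
    print("missing:", path)
```

Both DEF lines point at the same file, so only one path gets checked. The sketch cannot tell you why a file is missing, only that it is.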
Linegod
Developer
Posts: 1626
Joined: Thu Feb 20, 2003 10:16 am
Location: Canada

Re: Cacti seems to have lost contact with nodes....

Post by Linegod »

Fix this first:

SPINE: Poller[0] FATAL: MySQL Error:'145', Message:'Table './cacti/poller_item' is marked as crashed and should be repaired'

Then check this:

ERROR: opening '/var/www/html/rra/rtblmnaar01bloomington_illinois__traffic_in_1365.rrd': No such file or directory
darthravenell
Posts: 7
Joined: Wed Mar 14, 2012 12:57 pm

Re: Cacti seems to have lost contact with nodes....

Post by darthravenell »

Thanks Linegod. One question about this error:

SPINE: Poller[0] FATAL: MySQL Error:'145', Message:'Table './cacti/poller_item' is marked as crashed and should be repaired'

Is this fixed by running the repair_database script?
Linegod
Developer
Posts: 1626
Joined: Thu Feb 20, 2003 10:16 am
Location: Canada

Re: Cacti seems to have lost contact with nodes....

Post by Linegod »
