Cacti seems to have lost contact with nodes....
-
- Posts: 7
- Joined: Wed Mar 14, 2012 12:57 pm
Cacti seems to have lost contact with nodes....
Hello,
We have been running Cacti for over a year now with only small issues. We have several hundred nodes configured for monitoring and are not yet finished adding all the nodes we want Cacti to monitor. Seemingly all of a sudden, Cacti lost contact with our nodes, so it thinks they are all hard down and is flooding our mailboxes with literally thousands of emails. We had to shut down the machine Cacti runs on before it caused more issues on our network. We have tried getting into Cacti with its host disconnected from the network to see if we could turn off the emailing function, but we have been unsuccessful from the command line. We are running Cacti on a virtual Unix box, if that helps.
Is there a way to get to this functionality and turn it off from the command line, or is there a way to reach the Cacti GUI while the host is offline?
Thanks
Re: Cacti seems to have lost contact with nodes....
Just disable the cronjob that is running the poller, then try to figure out what is going on.
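For reference, disabling the cronjob usually means commenting out the poller entry in the crontab. A minimal sketch in Python, assuming the poller is started from a crontab line that mentions `poller.php` (the usual Cacti setup, but check your own). The function name, the `#DISABLED#` prefix, and the sample entry are illustrative and not taken from this thread:

```python
def disable_poller_lines(crontab_text, marker="poller.php"):
    """Return crontab_text with every active line mentioning marker commented out."""
    out = []
    for line in crontab_text.splitlines():
        stripped = line.lstrip()
        # Leave blank lines and lines that are already comments alone.
        if stripped and not stripped.startswith("#") and marker in stripped:
            line = "#DISABLED# " + line
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Hypothetical entry; the schedule, user, and path on your box may differ.
    sample = "*/5 * * * * cactiuser php /var/www/html/poller.php > /dev/null 2>&1\n"
    print(disable_poller_lines(sample), end="")
```

Feed it the text of whichever crontab holds the poller entry (for a user crontab, the output of `crontab -l`), review the result, and install it back. Removing the prefix later re-enables polling.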
--
Live fast, die young
You're sucking up my bandwidth.
J.P. Pasnak,CD
CCNA, LPIC-1
http://www.warpedsystems.sk.ca
-
- Posts: 7
- Joined: Wed Mar 14, 2012 12:57 pm
Re: Cacti seems to have lost contact with nodes....
We figured out that Cacti was using the email client on the machine it is installed on. We disabled the mail client and reconnected the machine to the network. This allowed me to get into the GUI and disable the notifications.
Now I have discovered a weird issue. I can go to Devices and verify that Cacti is polling the routers via Ping and SNMP, and (at least there) it says it sees a good connection.
However, once I go to the Monitor tab at the top, all my routers show Down. Also, none of the graphs are displaying any info. Finally, when I go to New Graphs and try to create a new graph, where it would normally list all of the device's interfaces, all that appears is:
Warning: Invalid argument supplied for foreach() in /var/www/html/graphs_new.php on line 809
(this warning appears nine times in a row)
Again, this happened all of a sudden. It had been working fine for some time. If troubleshooting this issue requires more information (e.g. a debug capture), please let me know. I know how frustrating it is working with n00bs, but the company I work for is pretty large, we have been pretty satisfied with Cacti so far, and we are big fans of the tool. Any assistance is greatly appreciated!
-
- Posts: 7
- Joined: Wed Mar 14, 2012 12:57 pm
Re: Cacti seems to have lost contact with nodes....
Here is a snippet from the log file, showing the entries displayed in red:
08/20/2012 09:22:29 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
08/20/2012 09:22:28 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
08/20/2012 09:22:27 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
08/20/2012 09:22:26 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
08/20/2012 09:22:25 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
08/20/2012 09:22:24 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
08/20/2012 09:22:23 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
08/20/2012 09:22:22 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
08/20/2012 09:22:21 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
08/20/2012 09:22:19 AM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed!, Error:'145', SQL:"select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
08/20/2012 09:22:19 AM - SPINE: Poller[0] FATAL: MySQL Error:'145', Message:'Table './cacti/poller_item' is marked as crashed and should be repaired' (Spine parent)
08/20/2012 09:22:18 AM - SPINE: Poller[0] FATAL: MySQL Error:'145', Message:'Table './cacti/poller_item' is marked as crashed and should be repaired' (Spine parent)
08/20/2012 09:22:18 AM - SPINE: Poller[0] FATAL: MySQL Error:'145', Message:'Table './cacti/poller_item' is marked as crashed and should be repaired' (Spine parent)
08/20/2012 09:22:18 AM - SPINE: Poller[0] FATAL: MySQL Error:'145', Message:'Table './cacti/poller_item' is marked as crashed and should be repaired' (Spine parent)
08/20/2012 09:22:18 AM - SPINE: Poller[0] FATAL: MySQL Error:'145', Message:'Table './cacti/poller_item' is marked as crashed and should be repaired' (Spine parent)
-
- Posts: 7
- Joined: Wed Mar 14, 2012 12:57 pm
Re: Cacti seems to have lost contact with nodes....
One more thing: here is the debug output from Graph Management. The same thing is displayed for every device:
RRDTool Command:
/usr/bin/rrdtool graph - \
--imgformat=PNG \
--start=-86400 \
--end=-60 \
--title="rtblmnaar01::Bloomington, Illinois - Traffic - Mu1" \
--rigid \
--base=1000 \
--height=120 \
--width=500 \
--alt-autoscale-max \
--lower-limit=0 \
--vertical-label="bits per second" \
--slope-mode \
--font TITLE:10: \
--font AXIS:8: \
--font LEGEND:8: \
--font UNIT:8: \
DEF:a="/var/www/html/rra/rtblmnaar01bloomington_illinois__traffic_in_1365.rrd":traffic_in:AVERAGE \
DEF:b="/var/www/html/rra/rtblmnaar01bloomington_illinois__traffic_in_1365.rrd":traffic_out:AVERAGE \
CDEF:cdefa=a,8,* \
CDEF:cdefe=b,8,* \
AREA:cdefa#00CF00FF:"Inbound" \
GPRINT:cdefa:LAST:" Current\:%8.2lf %s" \
GPRINT:cdefa:AVERAGE:"Average\:%8.2lf %s" \
GPRINT:cdefa:MAX:"Maximum\:%8.2lf %s\n" \
LINE1:cdefe#002A97FF:"Outbound" \
GPRINT:cdefe:LAST:"Current\:%8.2lf %s" \
GPRINT:cdefe:AVERAGE:"Average\:%8.2lf %s" \
GPRINT:cdefe:MAX:"Maximum\:%8.2lf %s"
RRDTool Says:
ERROR: opening '/var/www/html/rra/rtblmnaar01bloomington_illinois__traffic_in_1365.rrd': No such file or directory
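A quick way to confirm which data files are actually missing is to pull the `.rrd` paths out of the `DEF:` arguments in the debug output and test each one on disk. A minimal sketch in Python; the function names are made up for illustration, and it only assumes the `DEF:name="path":...` shape seen in the command above:

```python
import os
import re

# Matches DEF:a="/path/file.rrd"; CDEF lines have no quoted path and are skipped.
DEF_PATH = re.compile(r'DEF:\w+="([^"]+)"')


def rrd_paths(command_text):
    """Return the unique .rrd paths referenced by DEF: arguments, in order."""
    seen = []
    for path in DEF_PATH.findall(command_text):
        if path not in seen:
            seen.append(path)
    return seen


def missing_files(paths):
    """Return the paths that do not exist as regular files."""
    return [p for p in paths if not os.path.isfile(p)]


if __name__ == "__main__":
    rrd = "/var/www/html/rra/rtblmnaar01bloomington_illinois__traffic_in_1365.rrd"
    command = (
        'DEF:a="%s":traffic_in:AVERAGE \\\n'
        'DEF:b="%s":traffic_out:AVERAGE \\\n'
        "CDEF:cdefa=a,8,*" % (rrd, rrd)
    )
    for path in missing_files(rrd_paths(command)):
        print("missing:", path)
```

Run it as the same user the web server uses, since a permissions problem on the rra directory can look the same as a missing file from the GUI's point of view.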
Re: Cacti seems to have lost contact with nodes....
Fix this first:
SPINE: Poller[0] FATAL: MySQL Error:'145', Message:'Table './cacti/poller_item' is marked as crashed and should be repaired'
Then check this:
ERROR: opening '/var/www/html/rra/rtblmnaar01bloomington_illinois__traffic_in_1365.rrd': No such file or directory
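For the first item, the log already names the table. A minimal sketch in Python that scans log lines for the "marked as crashed" message and prints one `REPAIR TABLE` statement per affected table. The function names are illustrative; `REPAIR TABLE` is a standard MySQL statement for MyISAM tables, which is assumed here because error 145 is a MyISAM error. Back up the database and stop the poller before running any repair:

```python
import re

# Matches: Table './cacti/poller_item' is marked as crashed
CRASHED = re.compile(r"Table '\./([^/']+)/([^']+)' is marked as crashed")


def crashed_tables(log_lines):
    """Return sorted, de-duplicated (database, table) pairs named in the log."""
    found = set()
    for line in log_lines:
        match = CRASHED.search(line)
        if match:
            found.add((match.group(1), match.group(2)))
    return sorted(found)


def repair_statements(pairs):
    """Build one REPAIR TABLE statement per (database, table) pair."""
    return ["REPAIR TABLE `%s`.`%s`;" % (db, table) for db, table in pairs]


if __name__ == "__main__":
    # Sample line copied from the log posted earlier in this thread.
    sample = [
        "08/20/2012 09:22:19 AM - SPINE: Poller[0] FATAL: MySQL Error:'145', "
        "Message:'Table './cacti/poller_item' is marked as crashed and should "
        "be repaired' (Spine parent)"
    ]
    for statement in repair_statements(crashed_tables(sample)):
        print(statement)
```

The printed statements can be pasted into a `mysql` session. Scanning the whole log this way also shows whether `poller_item` is the only crashed table or just the first one the poller tripped over.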
--
Live fast, die young
You're sucking up my bandwidth.
J.P. Pasnak,CD
CCNA, LPIC-1
http://www.warpedsystems.sk.ca
-
- Posts: 7
- Joined: Wed Mar 14, 2012 12:57 pm
Re: Cacti seems to have lost contact with nodes....
Thanks Linegod. One question about this error:
SPINE: Poller[0] FATAL: MySQL Error:'145', Message:'Table './cacti/poller_item' is marked as crashed and should be repaired'
Is this fixed by running the repair_database script?
Re: Cacti seems to have lost contact with nodes....
--
Live fast, die young
You're sucking up my bandwidth.
J.P. Pasnak,CD
CCNA, LPIC-1
http://www.warpedsystems.sk.ca