[solved] Graphs went all spiky after minor outage

mpb
Posts: 20
Joined: Mon Nov 06, 2006 5:46 am

Post by mpb »

Hello All,

My Cacti server has been running well for years.

Today I had a small network outage, during which the Cacti machine could not reach the network it monitors for a few minutes. When connectivity was restored the graphs resumed, but they now show a large number of spikes (see below).

Does anyone have an idea what may be causing this and how to fix it?


Cacti is version 0.8.6h running on FreeBSD 5.5.
Attachment: graph_image.php.png (23.3 KiB)
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

I assume these are 32-bit COUNTERs? Please try 64-bit COUNTERs, which don't show anomalous behaviour when traffic reaches 114 Mbps.
R.
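
For readers wondering where the 114 Mbps figure comes from: a 32-bit octet counter wraps after 2^32 bytes, so it can only be trusted while less than that much traffic passes between two polls. A minimal sketch of the arithmetic, assuming a 5-minute (300-second) polling interval and byte counters; the function name is made up for illustration.

```python
# Highest average rate a byte counter can carry before it wraps more than
# once between two polls. The 300 s interval is an assumption (5-minute polling).

def max_rate_mbps(counter_bits: int, poll_interval_s: int = 300) -> float:
    max_bytes = 2 ** counter_bits          # counter range in bytes
    return max_bytes * 8 / poll_interval_s / 1e6  # bits per second -> Mbps

rate_32 = max_rate_mbps(32)
rate_64 = max_rate_mbps(64)

print(f"32-bit counter: {rate_32:.1f} Mbps")
print(f"64-bit counter: {rate_64:.3e} Mbps")
```

The 32-bit result comes out at roughly 114.5 Mbps, which matches the threshold quoted above; the 64-bit limit is far beyond any real link speed.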
mpb
Posts: 20
Joined: Mon Nov 06, 2006 5:46 am

Post by mpb »

Hi Gandalf,

The graphs are all set to 64-bit counters. The graph I attached as an example happened to be right around 100 Mbps, so I see why you asked :) I have other graphs showing over 114 Mbps, and even Gbps, of traffic with the same spiky behaviour for the last few hours.

For some reason the spiky behaviour only affects traffic graphs. Other graphs, such as temperature (also polled via SNMP from the same box), do not show it.

Any ideas?

thanks

Mark
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

Now it's guesswork. 0.8.6h is way old.
Check log/cacti.log for anything unusual. Or run cmd.php/cactid against a single host and follow closely what happens to the data.
Make sure that only a single crontab entry for Cacti exists.
R.
mpb
Posts: 20
Joined: Mon Nov 06, 2006 5:46 am

Post by mpb »

Hi,

I did some further troubleshooting, and what I can see for sure is that the data ending up in the RRD files is incorrect (see below).

Also, for some reason I can't figure out, the poller is taking longer to do its job. The expected time is about 80 seconds; now it's going way over.

Can this tell me where else to look?


thanks

Mark

06/16/2010 05:17:31 PM - SYSTEM STATS: Time:150.2764 Method:cmd.php Processes:1 Threads:N/A Hosts:71 HostsPerProcess:71 DataSources:8743 RRDsProcessed:6641
06/16/2010 05:24:53 PM - SYSTEM STATS: Time:293.0142 Method:cmd.php Processes:1 Threads:N/A Hosts:71 HostsPerProcess:71 DataSources:8743 RRDsProcessed:1
06/16/2010 05:26:52 PM - SYSTEM STATS: Time:110.9398 Method:cmd.php Processes:1 Threads:N/A Hosts:71 HostsPerProcess:71 DataSources:8743 RRDsProcessed:7740
06/16/2010 05:32:20 PM - SYSTEM STATS: Time:139.0097 Method:cmd.php Processes:1 Threads:N/A Hosts:71 HostsPerProcess:71 DataSources:8743 RRDsProcessed:4668
06/16/2010 06:02:25 PM - SYSTEM STATS: Time:143.4685 Method:cmd.php Processes:1 Threads:N/A Hosts:71 HostsPerProcess:71 DataSources:8743 RRDsProcessed:4202
06/16/2010 06:09:29 PM - SYSTEM STATS: Time:268.0389 Method:cmd.php Processes:1 Threads:N/A Hosts:71 HostsPerProcess:71 DataSources:8743 RRDsProcessed:4202
06/16/2010 06:11:59 PM - SYSTEM STATS: Time:117.7972 Method:cmd.php Processes:1 Threads:N/A Hosts:71 HostsPerProcess:71 DataSources:8743 RRDsProcessed:4202
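
A quick way to see how far the poller has drifted is to pull the `Time:` and `RRDsProcessed:` fields out of the SYSTEM STATS lines above. This is only a sketch: the 80-second expectation comes from the post, while the 300-second cycle and the 80% "near limit" cutoff are assumptions chosen for illustration.

```python
import re

POLL_INTERVAL_S = 300  # assumed 5-minute polling cycle
EXPECTED_S = 80        # normal poller runtime reported in the post

STATS_RE = re.compile(
    r"^(?P<stamp>\S+ \S+ \S+) - SYSTEM STATS: Time:(?P<time>[\d.]+) "
    r".*RRDsProcessed:(?P<rrds>\d+)"
)

def parse_stats(lines):
    """Return (timestamp, runtime_seconds, rrds_processed) per stats line."""
    runs = []
    for line in lines:
        m = STATS_RE.match(line)
        if m:
            runs.append((m["stamp"], float(m["time"]), int(m["rrds"])))
    return runs

LOG = """\
06/16/2010 05:17:31 PM - SYSTEM STATS: Time:150.2764 Method:cmd.php Processes:1 Threads:N/A Hosts:71 HostsPerProcess:71 DataSources:8743 RRDsProcessed:6641
06/16/2010 05:24:53 PM - SYSTEM STATS: Time:293.0142 Method:cmd.php Processes:1 Threads:N/A Hosts:71 HostsPerProcess:71 DataSources:8743 RRDsProcessed:1
06/16/2010 05:26:52 PM - SYSTEM STATS: Time:110.9398 Method:cmd.php Processes:1 Threads:N/A Hosts:71 HostsPerProcess:71 DataSources:8743 RRDsProcessed:7740
06/16/2010 05:32:20 PM - SYSTEM STATS: Time:139.0097 Method:cmd.php Processes:1 Threads:N/A Hosts:71 HostsPerProcess:71 DataSources:8743 RRDsProcessed:4668
06/16/2010 06:02:25 PM - SYSTEM STATS: Time:143.4685 Method:cmd.php Processes:1 Threads:N/A Hosts:71 HostsPerProcess:71 DataSources:8743 RRDsProcessed:4202
06/16/2010 06:09:29 PM - SYSTEM STATS: Time:268.0389 Method:cmd.php Processes:1 Threads:N/A Hosts:71 HostsPerProcess:71 DataSources:8743 RRDsProcessed:4202
06/16/2010 06:11:59 PM - SYSTEM STATS: Time:117.7972 Method:cmd.php Processes:1 Threads:N/A Hosts:71 HostsPerProcess:71 DataSources:8743 RRDsProcessed:4202
""".splitlines()

runs = parse_stats(LOG)
over_expected = [r for r in runs if r[1] > EXPECTED_S]
near_limit = [r for r in runs if r[1] > 0.8 * POLL_INTERVAL_S]

for stamp, secs, rrds in runs:
    flag = "NEAR LIMIT" if secs > 0.8 * POLL_INTERVAL_S else "slow"
    print(f"{stamp}  {secs:8.1f}s  rrds={rrds:5d}  {flag}")
```

Every run in this sample exceeds the 80-second expectation, and two of them come close to filling an entire 5-minute cycle, which would leave little room before the next poll starts.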




<!-- 2010-06-15 10:00:00 CEST / 1276588800 --> <row><v> 2.3619965469e+08 </v><v> 4.7810775277e+07 </v></row>
<!-- 2010-06-15 10:05:00 CEST / 1276589100 --> <row><v> 2.3419900479e+08 </v><v> 4.8481514398e+07 </v></row>
<!-- 2010-06-15 10:10:00 CEST / 1276589400 --> <row><v> 2.4011402030e+08 </v><v> 5.1391038110e+07 </v></row>
<!-- 2010-06-15 10:15:00 CEST / 1276589700 --> <row><v> 2.4034531224e+08 </v><v> 5.0327747521e+07 </v></row>
<!-- 2010-06-15 10:20:00 CEST / 1276590000 --> <row><v> 2.4129712090e+08 </v><v> 4.9435658243e+07 </v></row>
<!-- 2010-06-15 10:25:00 CEST / 1276590300 --> <row><v> 2.3753689531e+08 </v><v> 4.8795402957e+07 </v></row>
<!-- 2010-06-15 10:30:00 CEST / 1276590600 --> <row><v> 2.4597816049e+08 </v><v> 5.0726966592e+07 </v></row>
<!-- 2010-06-15 10:35:00 CEST / 1276590900 --> <row><v> 4.9742343268e+08 </v><v> 9.9883570932e+07 </v></row>
<!-- 2010-06-15 10:40:00 CEST / 1276591200 --> <row><v> 2.9592055309e+07 </v><v> 5.9247270981e+06 </v></row>
<!-- 2010-06-15 10:45:00 CEST / 1276591500 --> <row><v> 2.6096192405e+08 </v><v> 5.3264312755e+07 </v></row>
<!-- 2010-06-15 10:50:00 CEST / 1276591800 --> <row><v> 5.0272748993e+08 </v><v> 1.0330178859e+08 </v></row>
<!-- 2010-06-15 10:55:00 CEST / 1276592100 --> <row><v> 1.3889777868e+07 </v><v> 2.8723356523e+06 </v></row>
<!-- 2010-06-15 11:00:00 CEST / 1276592400 --> <row><v> 2.6320644582e+08 </v><v> 5.5010108845e+07 </v></row>
<!-- 2010-06-15 11:05:00 CEST / 1276592700 --> <row><v> 3.4066310307e+08 </v><v> 6.8538768809e+07 </v></row>
<!-- 2010-06-15 11:10:00 CEST / 1276593000 --> <row><v> 1.9804826099e+08 </v><v> 3.9774774675e+07 </v></row>
<!-- 2010-06-15 11:15:00 CEST / 1276593300 --> <row><v> 2.8755628855e+08 </v><v> 5.8919661232e+07 </v></row>
<!-- 2010-06-15 11:20:00 CEST / 1276593600 --> <row><v> 3.7055610605e+08 </v><v> 8.0131198388e+07 </v></row>
<!-- 2010-06-15 11:25:00 CEST / 1276593900 --> <row><v> 1.9321939518e+08 </v><v> 4.1678656569e+07 </v></row>
<!-- 2010-06-15 11:30:00 CEST / 1276594200 --> <row><v> 2.8248551270e+08 </v><v> 6.0221882205e+07 </v></row>
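
One thing worth checking in the dump above is whether each spike is paired with a matching dip. The sketch below averages the two spike/dip pairs in the first column and compares them with the mean of the seven rows before the first spike. The values are copied from the dump; the 15% tolerance and the interpretation are assumptions. If the pairs average out to the baseline, that is consistent with (though not proof of) updates landing late, so that one interval absorbs traffic belonging to its neighbour.

```python
from statistics import mean

# First-column values from the RRD dump, 10:00 through 10:30 (pre-spike baseline).
baseline_rows = [
    2.3619965469e8, 2.3419900479e8, 2.4011402030e8, 2.4034531224e8,
    2.4129712090e8, 2.3753689531e8, 2.4597816049e8,
]

# (spike, dip) pairs from the same column: 10:35/10:40 and 10:50/10:55.
pairs = [
    (4.9742343268e8, 2.9592055309e7),
    (5.0272748993e8, 1.3889777868e7),
]

baseline = mean(baseline_rows)
pair_means = [mean(p) for p in pairs]
ratios = [pm / baseline for pm in pair_means]

for (spike, dip), pm, ratio in zip(pairs, pair_means, ratios):
    print(f"spike={spike:.3e} dip={dip:.3e} pair mean={pm:.3e} ratio={ratio:.2f}")

# Assumed tolerance: treat a pair as "averaging out" if within 15% of baseline.
averages_out = all(abs(r - 1) < 0.15 for r in ratios)
print("pairs average out to baseline:", averages_out)
```

Both pair means land near 2.6e8, close to the pre-spike level of about 2.4e8, so the total traffic recorded looks plausible even though its distribution across intervals does not.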
mpb
Posts: 20
Joined: Mon Nov 06, 2006 5:46 am

Post by mpb »

hi,

Figured it out eventually.

I tuned some MySQL parameters according to the output of mysqltuner.pl.

Also, since the box had been up for something like 950 days, a reboot was in order to clear any possible memory leaks and such.

As much as I am NOT a fan of rebooting boxes, this time it seems it really was necessary. Now it's all back in business and running well.


cheers