discontinuous graphs!

mongaron
Posts: 7
Joined: Mon Jan 15, 2007 5:30 am

Post by mongaron »

Hi,
For a long time now, the graphs for most of my machines have been discontinuous, while those of other hosts aren't.
The machines are in the same farm and subnet.
When I open graph management and turn on debug mode I get

RRDTool Says:
OK

though the graph is discontinuous for a couple of hours/minutes and then comes back to life.
I have enabled DEBUG mode on my logs and found only these strange errors, which appear every once in a while but do not always correlate with the times the graphs stop showing. Here are the errors:
07/01/2008 11:49:56 AM - CMDPHP: Poller[0] DEBUG: SQL Exec: "update host set status = '3', status_event_count = '0', status_fail_date = '2008-06-18 13:10:43', status_rec_date = '2008-06-19 10:15:44', status_last_error = 'Host did not respond to SNMP', min_time = '1.25000', max_time = '990.18000', cur_time = '120.94', avg_time = '106.84190550655', total_polls = '45035', failed_polls = '350', availability = '99.222826690352' where hostname = '10.10.0.11'"


07/01/2008 01:01:02 PM - CMDPHP: Poller[0] DEBUG: SQL Exec: "update host set status = '3', status_event_count = '0', status_fail_date = '2008-06-18 13:10:43', status_rec_date = '2008-06-19 10:15:44', status_last_error = 'Host did not respond to SNMP', min_time = '1.25000', max_time = '990.18000', cur_time = '89.41', avg_time = '106.84751989016', total_polls = '45050', failed_polls = '350', availability = '99.223085460599' where hostname = '10.10.0.11'"
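
A quick way to read these entries is to pull out the key = 'value' pairs and check the counters against each other. The sketch below is only an illustration: the shortened LOG_LINE and the function names are made up for this example, and the availability formula is an assumption inferred from the numbers in the log, not taken from the Cacti source.

```python
import re

# Shortened copy of the first log entry above (only the fields used here).
LOG_LINE = (
    "update host set status = '3', total_polls = '45035', "
    "failed_polls = '350', availability = '99.222826690352' "
    "where hostname = '10.10.0.11'"
)


def parse_fields(sql):
    """Return every key = 'value' pair found in the statement as a dict."""
    return dict(re.findall(r"(\w+) = '([^']*)'", sql))


def availability(total_polls, failed_polls):
    """Assumed formula: percentage of polls that did not fail."""
    return 100.0 * (total_polls - failed_polls) / total_polls


fields = parse_fields(LOG_LINE)
computed = availability(int(fields["total_polls"]), int(fields["failed_polls"]))
logged = float(fields["availability"])
print(f"host={fields['hostname']} computed={computed:.6f} logged={logged:.6f}")
```

In both entries failed_polls stays at 350 while total_polls grows from 45035 to 45050, so these two updates by themselves do not record a new failed poll between 11:49 and 13:01.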


Can someone please tell me where I should start digging into this strange behavior?

Thanks in advance
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

Please see the 2nd link in my signature. You may want to increase the SNMP timeout for failing hosts and/or verify the downed host detection method.
Reinhard
mongaron
Posts: 7
Joined: Mon Jan 15, 2007 5:30 am

Post by mongaron »

gandalf, thanks for your reply.
I have increased the SNMP timeout from 500 to 2000 without any change. My downed host detection method is SNMP; I disabled it, but that made no difference.
I was thinking that maybe some network disruption could be causing this, but talking to my network guy gave no results.
Could I configure a Cacti instance at my server farm to collect all the data and send it to a Cacti server in my office, or have the Cacti in my office pull that data?

Thanks in advance
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

Is there a huge RTT between the polling host and the failing "subnet"? The log entry you've shown does not show the root cause; there should have been other errors related to the failing host before.
Reinhard
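
One rough way to look at the RTT question is to compare the response times from the log entries above with the SNMP timeouts mentioned in this thread (500 and 2000). This is a sketch under assumptions: it treats min_time, cur_time and max_time as milliseconds in the same unit as the timeout setting, and the function name late_samples is hypothetical.

```python
def late_samples(samples_ms, timeout_ms):
    """Return the names of the samples that reach or exceed the timeout."""
    return [name for name, ms in samples_ms.items() if ms >= timeout_ms]


# Response times copied from the first log entry for 10.10.0.11.
samples_ms = {"min_time": 1.25, "cur_time": 120.94, "max_time": 990.18}

for timeout_ms in (500, 2000):
    print(timeout_ms, late_samples(samples_ms, timeout_ms))
```

Under those assumptions, max_time is above the 500 timeout but well below 2000. Since max_time is a historical maximum, this only suggests that at least one poll was slow enough to hit the lower timeout.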
mongaron
Posts: 7
Joined: Mon Jan 15, 2007 5:30 am

Post by mongaron »

As I suspected, when I moved Cacti inside the farm the problem was solved.
Thanks anyway for your help, gandalf.
takobaba
Posts: 7
Joined: Mon Aug 05, 2013 10:22 am

Re: discontinuous graphs!

Post by takobaba »

What is a farm? I am having the same issue.
donglee
Posts: 24
Joined: Thu Dec 05, 2013 7:54 pm

Re: discontinuous graphs!

Post by donglee »

A possible root cause is the "max data source issue" described (among others) at the 2nd link in my sig.
Depending on the root cause, there are situations where a warning is thrown (e.g. if this is caused by a timeout or a non-responsive host). If it is caused by the "max" issue, it is assumed that the max has been set on purpose, hence no warning is needed.
Regards,
Dong

Network Engineer
http://www.3Anetwork.com
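
To illustrate the "max" issue: RRDtool treats a value outside a data source's min/max range as unknown, and unknown values appear as gaps in the graph without any error being logged. The sketch below only imitates that rule in plain Python; store_sample, the sample rates and the 100e6 limit are invented for this example and are not part of RRDtool or Cacti.

```python
import math


def store_sample(value, ds_min=None, ds_max=None):
    """Imitate the min/max rule: out-of-range values become unknown (NaN)."""
    if ds_min is not None and value < ds_min:
        return math.nan
    if ds_max is not None and value > ds_max:
        return math.nan
    return value


# Hypothetical traffic rates, with a maximum of 100e6 on the data source.
rates = [40e6, 95e6, 120e6, 80e6]
stored = [store_sample(r, ds_min=0, ds_max=100e6) for r in rates]
print(stored)
```

The third sample is above the maximum, so it is stored as unknown and would show up as a gap, while the neighbouring samples are kept.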