Hello all,
I applied thold to the devices and was able to receive the alert email telling me which device was down. But I noticed quite a long gap (sometimes up to 11 minutes) between the time Cacti sent the email and the time I received it. May I ask what causes the delay in delivering the alert message?
Thanks a lot!
Saya
long time gap to receive the alert email from cacti
- Howie
- Cacti Guru User
- Posts: 5508
- Joined: Thu Sep 16, 2004 5:53 am
- Location: United Kingdom
- Contact:
Re: long time gap to receive the alert email from cacti
saya wrote:
Hello all,
I applied thold to the devices and was able to receive the alert email telling me which device was down. But I noticed quite a long gap (sometimes up to 11 minutes) between the time Cacti sent the email and the time I received it. May I ask what causes the delay in delivering the alert message?
Thanks a lot!
Saya

Have you checked where the delay occurred? The mail headers will tell you when the first MTA in the chain (the Cacti server's mail server) sent the message.
It could be a delay in your mail system rather than in Cacti/Thold sending mail...
Also, if you have set your poller ping_failure_count to more than 1 (perhaps 2 here?) then it will take that many poller cycles to detect a down host.
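
To check the headers, the timestamps in the Received lines of a delayed alert can be compared hop by hop. Here is a minimal Python sketch, assuming you have saved the raw message source from your mail client. The sample message, the host names, and the `hop_delays` helper are made up for illustration and are not part of Cacti or Thold.

```python
from datetime import timezone
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical message source; paste the real headers of a delayed alert here.
SAMPLE = (
    "Received: from relay.example.net by mx.example.org; Wed, 20 Feb 2008 10:49:30 +0000\n"
    "Received: from cacti.example.net by relay.example.net; Wed, 20 Feb 2008 10:40:30 +0000\n"
    "Subject: Host Notice\n"
    "\n"
    "body\n"
)


def hop_delays(raw_message):
    """Return the seconds elapsed between consecutive Received timestamps."""
    msg = message_from_string(raw_message)
    stamps = []
    # Each MTA prepends its Received header, so reverse for chronological order.
    for header in reversed(msg.get_all("Received", [])):
        # The timestamp is the part after the last ';' in the header.
        _, sep, date_part = header.rpartition(";")
        if not sep:
            continue
        try:
            stamp = parsedate_to_datetime(date_part.strip())
        except (TypeError, ValueError):
            continue  # skip headers whose date cannot be parsed
        if stamp.tzinfo is None:
            # Assumption: treat timestamps without a usable zone as UTC.
            stamp = stamp.replace(tzinfo=timezone.utc)
        stamps.append(stamp)
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]
```

Each value is the time between two consecutive mail servers stamping the message, so one large value shows which hop held the mail. The result is only as reliable as the clocks on those servers.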
Weathermap 0.98a is out! & QuickTree 1.0. Superlinks is over there now (and built-in to Cacti 1.x).
Some Other Cacti tweaks, including strip-graphs, icons and snmp/netflow stuff.
(Let me know if you have UK DevOps or Network Ops opportunities, too!)
- Howie
saya wrote:
Hello Howie,
I'm not sure where the problem is...
The email only says "Host Notice : DEVICE-NAME(device-ip) returned from DOWN state"...
If the delay is caused by my mail system, does that mean I don't have any way to cut down the delay?

Well, try some of the checks I suggested. That way you would at least know whether it's your mail system or Cacti that is causing the delay. Then, if it is Cacti, you can look at the Poller settings to see whether you have Failure Count set higher than 1.
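
As a rough guide to how Failure Count affects detection time, here is a small Python sketch. It assumes a host is only marked down after that many consecutive failed polls and ignores everything else (poller run time, mail delivery). The 300-second interval in the note below is an assumed value, so substitute your own poller interval, and `detection_delay_bounds` is an illustrative helper, not a Cacti function.

```python
def detection_delay_bounds(poll_interval_s, failure_count):
    """Return (best, worst) seconds between a host failing and being marked down.

    Simplified model: the host fails somewhere between two polls and must then
    fail `failure_count` polls in a row before it is flagged.
    """
    if poll_interval_s <= 0 or failure_count < 1:
        raise ValueError("interval must be positive and failure_count at least 1")
    # Best case: the host fails just before a poll, so the first failed poll is immediate.
    best = (failure_count - 1) * poll_interval_s
    # Worst case: the host fails just after a poll, so a full interval passes first.
    worst = failure_count * poll_interval_s
    return best, worst
```

With a 300-second interval, `detection_delay_bounds(300, 3)` gives `(600, 900)`, i.e. 10 to 15 minutes before the host is flagged, while `detection_delay_bounds(300, 1)` gives `(0, 300)`.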
Hello Howie, thank you very much for your reply.
I monitored my device and drew a graph of its up/down state against the times I received emails from Cacti.
For the top chart, the failure count was set to 3. You can see that the gap between sending and receiving the email, which my two posts above were about, is not so noticeable. But there is quite a long gap between the time a switch went down/up and the time Cacti sent the down/up alert to me, especially the one at 10:49am in red (around 9 min). The fastest response can be within 1 min, like the one at 12:11pm (bottom chart), which came less than 1 min after the device returned from the down state.
For the bottom chart, both failure count and recovery count were set to 1. (I wonder why there is no recovering state, only an up state.) And unluckily, two down alerts were missed (marked in red where they should have been).
So my questions are:
1. What might cause the slow response (from the time a device goes from up to down to the time Cacti sends me this event)? Changing the failure count only seems to make Cacti refresh the device status more quickly; it does not actually shorten the gap.
2. What might cause the loss of alerts?
Thank you for your reply!
Regards,
Saya
- Attachments
- time gap1.jpg (178.5 KiB): alert email
Re:
saya wrote: ↑Wed Feb 20, 2008 1:59 am
Hello Howie, thank you very much for your reply.
I monitored my device and drew a graph of its up/down state against the times I received emails from Cacti.
For the top chart, the failure count was set to 3. You can see that the gap between sending and receiving the email, which my two posts above were about, is not so noticeable. But there is quite a long gap between the time a switch went down/up and the time Cacti sent the down/up alert to me, especially the one at 10:49am in red (around 9 min). The fastest response can be within 1 min, like the one at 12:11pm (bottom chart), which came less than 1 min after the device returned from the down state.
For the bottom chart, both failure count and recovery count were set to 1. (I wonder why there is no recovering state, only an up state.) And unluckily, two down alerts were missed (marked in red where they should have been).
So my questions are:
1. What might cause the slow response (from the time a device goes from up to down to the time Cacti sends me this event)? Changing the failure count only seems to make Cacti refresh the device status more quickly; it does not actually shorten the gap.
2. What might cause the loss of alerts?
Thank you for your reply!
Regards,
Saya

Have you fixed that issue? I am having the same issue as in your post. I have posted on the Cacti forum, but it seems no one has this problem except me. If you have fixed your error, please help me.
Re: long time gap to receive the alert email from cacti
Me too, I have the same problem. Please share the solution if this problem has been solved.