Feature Request: Alert Dependencies
This is sort of a problem with both monitor and threshold.
Let's say I am monitoring a router, servers and switches at a remote location and my WAN connection goes down.
What happens now is that I get an alert for every device that has a low threshold, when really all I need is a single alert for the router, since in terms of alerting all these devices depend on it for connectivity to the monitoring system.
The same could be said for the switches and the servers.
The simplest way I can think of to solve this would be to add a new parent attribute to each host or threshold graph.
Threshold would then do two passes before sending alerts. First it would determine the alert state of all devices, then it would go through them to send alerts. If the device has no parent, an alert is sent. If it has a parent, it checks the parent's alert state; if the parent has an alert, no alert is sent for that device on that pass.
For hosts with multiple graphs I could see the host itself being an automatic implied parent, so when a graph goes to send an alert and the host is down, only a single host alert would be sent.
We have a lot of remote sites and equipment, so having our on-call techs hit with 20 alert notices when something like the main MPLS link goes down is a bit of a hassle and sort of clouds the real problem momentarily.
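The two-pass idea described above could be sketched roughly like this in Python. This is not Cacti or thold code; the `Device` class, its `parent` field and the `alerts_to_send` function are hypothetical names used only to illustrate the proposal, and pass 1 (working out each device's alert state) is assumed to have run already.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Device:
    name: str
    in_alert: bool = False          # result of pass 1 (threshold breached or host down)
    parent: Optional[str] = None    # hypothetical "parent" attribute from the proposal


def alerts_to_send(devices: Dict[str, Device]) -> List[str]:
    """Pass 2: alert only on devices whose direct parent is not itself alerting."""
    to_send = []
    for dev in devices.values():
        if not dev.in_alert:
            continue
        parent = devices.get(dev.parent) if dev.parent else None
        if parent is not None and parent.in_alert:
            continue  # parent already alerting: suppress the child on this pass
        to_send.append(dev.name)
    return to_send


# Pass 1 is assumed to have filled in `in_alert` for every device.
site = {
    "router": Device("router", in_alert=True),
    "switch": Device("switch", in_alert=True, parent="router"),
    "server": Device("server", in_alert=True, parent="switch"),
}
print(alerts_to_send(site))  # ['router']
```

With the WAN link down, only the router alert goes out; once the router recovers, a switch that is still in alert would be reported on the next pass.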
- TheWitness
- Developer
- Posts: 17047
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
The formal noun is "Event Correlation". I'm not too sure how this would be done. How does Nagios do it?
TheWitness
True understanding begins only when we realize how little we truly understand...
Life is an adventure, let yours begin with Cacti!
Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages
For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?
TheWitness wrote: The formal noun is "Event Correlation". I'm not too sure how this would be done. How does Nagios do it?
No idea. I am looking at Cacti as a replacement for WhatsUp. WhatsUp basically lets you create a dependency link to a parent device, then ignores the child alert if the parent is down.
You link a server to a switch and the switch to a router, which may be linked to a master router.
If a parent is down, none of its children or children's children send alerts.
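To make the "children's children" part concrete, here is a minimal sketch that walks up the dependency chain. The `parent_of` mapping and the `has_down_ancestor` helper are made-up names for illustration, not anything WhatsUp or Cacti actually exposes.

```python
from typing import Dict, Optional, Set


def has_down_ancestor(host: str,
                      parent_of: Dict[str, Optional[str]],
                      down: Set[str]) -> bool:
    """Return True if any device above `host` in the dependency chain is down."""
    seen = {host}                      # guards against accidental loops in the links
    current = parent_of.get(host)
    while current is not None and current not in seen:
        if current in down:
            return True
        seen.add(current)
        current = parent_of.get(current)
    return False


# Hypothetical links: server -> switch -> router -> master router
parent_of = {
    "server": "switch",
    "switch": "router",
    "router": "master-router",
    "master-router": None,
}
down = {"router", "switch", "server"}

suppressed = sorted(h for h in down if has_down_ancestor(h, parent_of, down))
print(suppressed)  # ['server', 'switch']
```

Walking the whole chain, rather than checking only the direct parent, also covers the case where an intermediate device has not been flagged as down yet.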
- Howie
- Cacti Guru User
- Posts: 5508
- Joined: Thu Sep 16, 2004 5:53 am
- Location: United Kingdom
- Contact:
Event Correlation is more about time series. E.g. if you got a down from device A but an up from device B, then there is no alert; but if you only get one of those two events in a given period, then there is a problem.
Dependencies would even be handy in the poller: there's no point in trying to poll that remote site until the WAN router comes back up, either. The simple WUG style of "don't poll if X is down" and "don't poll if X is up" seems like it wouldn't be too hard to do...
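The time-series flavour of event correlation mentioned above (a "down" from A paired with an "up" from B inside a given period) might look something like this sketch. The event tuple layout, the `is_problem` name and the 60-second window are assumptions for the example only.

```python
from typing import List, Tuple

Event = Tuple[float, str, str]   # (timestamp in seconds, device, "up" or "down")


def is_problem(events: List[Event],
               first: Tuple[str, str],
               second: Tuple[str, str],
               window: float) -> bool:
    """Flag a problem when only one half of an expected event pair is seen,
    or when both halves are seen but never within `window` seconds of each other."""
    t_first = [t for t, dev, state in events if (dev, state) == first]
    t_second = [t for t, dev, state in events if (dev, state) == second]
    if not t_first and not t_second:
        return False            # nothing happened at all
    if not t_first or not t_second:
        return True             # only one of the two events arrived
    return not any(abs(a - b) <= window for a in t_first for b in t_second)


# Failover case: A goes down and B comes up 30 seconds later, so no alert.
failover = [(100.0, "A", "down"), (130.0, "B", "up")]
print(is_problem(failover, ("A", "down"), ("B", "up"), window=60.0))   # False

# A goes down and B never comes up, so this is a real problem.
outage = [(100.0, "A", "down")]
print(is_problem(outage, ("A", "down"), ("B", "up"), window=60.0))     # True
```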
Weathermap 0.98a is out! & QuickTree 1.0. Superlinks is over there now (and built-in to Cacti 1.x).
Some Other Cacti tweaks, including strip-graphs, icons and snmp/netflow stuff.
(Let me know if you have UK DevOps or Network Ops opportunities, too!)
- TheWitness
- Developer
- Posts: 17047
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
That makes sense; however, you might have to go a little bit further. Say, for example: don't poll if either X is down or X.ifIndex is down. Or, in a case where you have dual entrances/routers etc.: don't poll if either X is down, or X.ifIndexA and X.ifIndexB are both down.
This is one class of event correlation. Can you guys think of any others? How would we handle DNS outages, for example?
TheWitness
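The "X is down, or both entrances are down" condition above can be written as a small rule object. This is only a sketch: `PollRule`, `skip_poll` and the way interface names are spelled are hypothetical and not part of the Cacti poller.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class PollRule:
    host: str                                # upstream device X
    uplinks: FrozenSet[str] = frozenset()    # e.g. {"X.ifIndexA", "X.ifIndexB"}

    def skip_poll(self, down_hosts: Set[str], down_ifaces: Set[str]) -> bool:
        """Skip polling if X is down, or if every listed uplink interface is down."""
        if self.host in down_hosts:
            return True
        # With redundant entrances, skip only when all uplinks are down.
        return bool(self.uplinks) and self.uplinks <= down_ifaces


rule = PollRule("X", frozenset({"X.ifIndexA", "X.ifIndexB"}))
print(rule.skip_poll(set(), {"X.ifIndexA"}))                  # False: one entrance still up
print(rule.skip_poll(set(), {"X.ifIndexA", "X.ifIndexB"}))    # True: both entrances down
print(rule.skip_poll({"X"}, set()))                           # True: X itself is down
```

The single-interface case from the post is the same rule with just one entry in `uplinks`.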
If you link everything by thold graph then you leave it flexible for the user.
I.e. you link server:eth0 to switch:port4 to router:eth1, for example.
As for DNS, you could have a polled canary test where the poller tries to resolve a known good local DNS value; if it fails, it alarms on DNS and ignores any hosts that are not IP based.
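A rough sketch of the DNS canary idea above, using only the Python standard library. The function names, the canary hostname and the sample addresses are all assumptions for illustration; the resolver is passed in as a parameter so the logic can be tried without touching a real DNS server.

```python
import ipaddress
import socket
from typing import Callable, Iterable, List


def is_ip_based(host: str) -> bool:
    """True if the host is configured by IP address rather than by name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def dns_canary_ok(name: str, expected_ip: str,
                  resolve: Callable[[str], str] = socket.gethostbyname) -> bool:
    """Resolve a known-good name and compare it with the address we expect."""
    try:
        return resolve(name) == expected_ip
    except OSError:                 # socket.gaierror is a subclass of OSError
        return False


def hosts_to_alert(failed_hosts: Iterable[str], dns_ok: bool) -> List[str]:
    """When the canary fails, keep only failures for hosts addressed by IP."""
    if dns_ok:
        return list(failed_hosts)
    return [h for h in failed_hosts if is_ip_based(h)]


# Example with a fake resolver that always fails, simulating a DNS outage.
def broken_resolver(name: str) -> str:
    raise OSError("simulated DNS failure")


dns_ok = dns_canary_ok("canary.example.com", "192.0.2.10", resolve=broken_resolver)
failed = ["web01.example.com", "10.0.0.1", "db01.example.com"]
print(dns_ok)                          # False
print(hosts_to_alert(failed, dns_ok))  # ['10.0.0.1']
```

In that state a single DNS alarm would be raised, and the name-based hosts stay quiet until the canary resolves again.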
-
- Posts: 24
- Joined: Mon Jun 14, 2010 1:51 pm
I think we can use the simple "critical monitor" approach from WUG. Critical monitors are usually a PING from WUG. This means that if the PING fails on a device, WUG will no longer poll the interfaces of that device, because those polls will surely fail.
Let's start with this since it is the simpler option. I'm pretty sure you guys can do it, since Cacti already has a STATUS check on each device. If the status check results in a DOWN state, then Cacti won't poll the DS of that device.
Thanks!
Re: Feature Request: Alert Dependencies
G'day
Sorry to revive a dead thread but couldn't see any other posts regarding this.
Has this since been implemented into Cacti?
I'm also currently in the same boat as the OP: I have multiple sites, and if the link between said sites is down, flooding of my inbox ensues! The only option I'm currently aware of is to set up multiple Cacti monitors at each site; however, this is a less than graceful solution, as there really should be a way to run it all from one location with the appropriate correlation occurring behind the scenes.
Please advise on whether this issue has been corrected/implemented.
Cheers,
Adrian Apps.
- gandalf
- Developer
- Posts: 22383
- Joined: Thu Dec 02, 2004 2:46 am
- Location: Muenster, Germany
- Contact:
Re: Feature Request: Alert Dependencies
To my knowledge, this hasn't been implemented thus far.
R.
- TheWitness
- Developer
- Posts: 17047
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
Re: Feature Request: Alert Dependencies
No, it has not, but it's a simple option: a 'Disable Threshold Notifications When Host is Down' checkbox. Combined with the most recent Maintenance plugin options, it would be a 'snap' to implement (like <= 20 minutes).
TheWitness
Re: Feature Request: Alert Dependencies
Thresholds should already not alert if the host is down. The only real issue is if you have thresholds that alert when they go below a specific value, and they alert faster than the down host interval you have set in Cacti. For instance, your threshold alerts every minute, but in Cacti you have it set so the host doesn't count as down until 2 minutes.
It won't be easy to add in "criticals" effectively until Cacti has groups.
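The timing gap described above can be put into numbers with a small helper. This is a back-of-the-envelope sketch under stated assumptions, not how thold schedules its checks: the outage is assumed to start at t=0, threshold checks are assumed to run at every multiple of the threshold interval after that, and a check landing exactly when the host is marked down is assumed to be suppressed. The function name is made up.

```python
import math


def threshold_checks_before_down(thold_interval_s: int, down_after_s: int) -> int:
    """Count threshold checks that run strictly before the host is marked down.

    Checks are assumed to run at t = k * thold_interval_s for k = 1, 2, ...
    and the host is assumed to be marked down at t = down_after_s.
    """
    if thold_interval_s <= 0:
        raise ValueError("thold_interval_s must be positive")
    return max(0, math.ceil(down_after_s / thold_interval_s) - 1)


# The example from the post: threshold every minute, host down after 2 minutes.
print(threshold_checks_before_down(60, 120))   # 1
# If both intervals match, no threshold check gets in ahead of the down state.
print(threshold_checks_before_down(60, 60))    # 0
```

Under these assumptions, each of those early checks is a chance for a "below value" threshold to fire before the host is known to be down.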
-
- Cacti User
- Posts: 141
- Joined: Thu Apr 10, 2008 6:52 pm
Re: Feature Request: Alert Dependencies
I think alert dependencies would be a nice feature. Each device could have a parent/child relationship.
Then dependency rules could be used:
- If parent device has an active thold, then do not alert on child
- If child device has an active alert then do not alert
etc....
Re: Feature Request: Alert Dependencies
Hi Guys,
I'm new to Cacti and I'm looking for a way to manage device dependencies and/or Event Correlation.
Is there any update on this?
I've googled it but found nothing relevant for Cacti.
Many thanks in advance.