Cacti Alert Differently for Core and Customer Downed Devices

willieb
Cacti User
Posts: 160
Joined: Thu Jan 22, 2009 10:09 am
Location: South GA

Post by willieb »

I discuss Advanced Ping and Thold here, but neither is the cause of my problem. It comes down to how Cacti was designed, which is why I posted this in the General forum.

I am attempting to use Cacti as a one-stop shop for network monitoring. This includes 1.) historical trending/graphing and 2.) up/down status monitoring and alerting for core network and customer devices.

You know how Cacti has Dead Host Notifications and alerting via thold, right? It works great; however, I want to set up alerting for core devices after 3 minutes and alerting for customer equipment after 8 minutes. For Dead Host Notifications there is only one setting, under Settings => Poller, called "Failure Count", and it applies to every device.

As a solution, I thought it would be best to alert based on a threshold template from Advanced Ping, standard ping, etc. I tried it and unfortunately it doesn't work. I've tried different versions of Advanced Ping (1.3, 2.2, etc.). After a couple of days of research I found out why. When SNMP is enabled and Cacti doesn't get a response via SNMP, it doesn't poll anything for that device. So the graph is blank, no packet loss is reported, and no alerts fire. It's not a problem with Cacti, it's just how it was designed. I tried disabling "Downed Device Detection" to no avail. The only way I could get it to report 100% packet loss for Advanced Ping to alert on was to disable SNMP altogether. Of course that won't work if I want to graph anything that requires SNMP, so that's pretty much out.
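To make the behaviour above concrete, here is a toy model in Python. It is not Cacti's poller code; the function and parameter names are made up for illustration, and it only mirrors what I observed: with SNMP enabled and no SNMP response, nothing gets recorded for the device.

```python
from typing import Callable, Optional


def poll_device(snmp_enabled: bool,
                snmp_responds: bool,
                measure_loss: Callable[[], float]) -> Optional[float]:
    """Toy model of the observed behaviour (hypothetical, not Cacti code).

    Returns the packet loss percentage that would be recorded for this
    polling cycle, or None when the device is skipped entirely.
    """
    # Observed: SNMP enabled + no SNMP response => no data sources polled,
    # so the ping script never runs and the graph stays blank.
    if snmp_enabled and not snmp_responds:
        return None
    # Otherwise the ping data source runs and a loss value is stored.
    return measure_loss()
```

With `snmp_enabled=False` the same dead device yields a recorded 100% loss, which is the only case a packet-loss threshold can alert on.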

I thought about using Dead Host Notifications with a failure count of 3 for core devices, and setting up non-SNMP customer devices with ping-only graphs for alerting via thold after 8 minutes of 100% packet loss. I have two problems with that. First, if I want to graph anything else later using SNMP, I can't. Second, and more important, I am using Discover to find new CPE devices, and it depends on SNMP.

With all that said, what choices do I have for alerting on downed devices and on devices that recover, using 3 minutes down for core network devices and 8 minutes down for CPE devices? Immediate recovery alerts are fine.
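For reference, the 3 and 8 minute targets translate into a number of consecutive failed polls that depends on the polling interval. A quick Python sketch of that arithmetic follows; the polling interval is an assumption here (I have not stated mine above), and the names are just for illustration.

```python
import math

# Alert delays from this post, in minutes, per device class.
ALERT_AFTER_MINUTES = {"core": 3, "cpe": 8}


def polls_needed(down_minutes: float, poll_interval_minutes: float) -> int:
    """Consecutive failed polls needed to cover at least down_minutes."""
    if poll_interval_minutes <= 0:
        raise ValueError("poll interval must be positive")
    return math.ceil(down_minutes / poll_interval_minutes)


def failure_counts(poll_interval_minutes: float) -> dict:
    """Per-class failure counts for an assumed polling interval."""
    return {cls: polls_needed(minutes, poll_interval_minutes)
            for cls, minutes in ALERT_AFTER_MINUTES.items()}
```

With an assumed 1 minute interval, `failure_counts(1)` gives 3 polls for core and 8 for CPE, which is exactly the pair of values a single global "Failure Count" cannot express.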

I really like Advanced Ping, especially with the packet loss lines, but if Cacti doesn't poll downed devices then I don't see how I can use any graph or returned data from any script or template, since polling stops.

I've been playing with Nagios on the side and it would work great, but it would be even better if I could accomplish the same thing with Cacti. Nagios just seems too complex for our needs.

Can you guys think of another way to accomplish different timed alerts for different types of devices with Cacti?

Thanks for any help you can provide.
-willieb
willieb

Re: Cacti Alert Differently for Core and Customer Downed Dev

Post by willieb »

One of my network engineers has pointed out a solution. It's a bit dirty, so to speak, but I have tested it and it works.

Add a duplicate host. For the new host, disable "Downed Device Detection" and change SNMP Version to "Not In Use". Start graphing with Advanced Ping. Set up a threshold to alert on >90% packet loss, or whatever you like. When this device goes down, the graph will show red lines for 100% packet loss and the alert will trigger.
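Here is roughly how I think of the threshold on the ping-only host, sketched in Python. This is not thold's code, just a model under my own assumptions: one packet loss sample per poll, an alert once loss stays above the threshold for a set number of consecutive polls, and an immediate recovery notice. The function name and the `polls_before_alert` parameter are made up for illustration.

```python
from typing import List, Tuple


def alert_events(loss_samples: List[float],
                 threshold_pct: float = 90.0,
                 polls_before_alert: int = 3) -> List[Tuple[int, str]]:
    """Return (sample_index, "down" | "up") events for a series of
    packet loss percentages, one sample per polling cycle."""
    events = []
    breached = 0        # consecutive polls above the threshold
    alerting = False    # whether a "down" alert is currently active
    for i, loss in enumerate(loss_samples):
        if loss > threshold_pct:
            breached += 1
            if not alerting and breached >= polls_before_alert:
                alerting = True
                events.append((i, "down"))
        else:
            breached = 0
            if alerting:
                # Recovery is reported on the first good sample.
                alerting = False
                events.append((i, "up"))
    return events
```

The same outage then alerts for a core device with `polls_before_alert=3` but stays quiet for a CPE device with `polls_before_alert=8` unless it lasts long enough, which is the per-class behaviour I was after.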

If you want to use packet loss exclusively to alert on down/up, you will probably want to disable "Thold Up/Down Email Notification" and remove the Advanced Ping graph from the other (original) host, the one that uses SNMP.

It's a bit dirty since it will show duplicate hosts in Devices. Luckily it should be quite obvious which is the ping-only device, because it will only have one graph. It will probably make sense to append "- Ping" or something of the sort to the end of the new host's name. Also, it shouldn't be more processor intensive, since no initial pings or SNMP checks are done for that host.

If you have any other suggestions/solutions I'm all ears...
-willieb