Does anyone have any documentation on how to implement Cacti so that it interfaces with Nagios (NetSaint)?
I would be very interested in the configuration as I have a large Nagios installation and would like to utilize the performance data. I have found Nagios to be a very good product and believe that Cacti can add major benefits as well.
Jazz
Nagios/Cacti Performance Data
I think there are some considerations in your request (we are talking about two totally different development architectures).
Nowadays:
CACTI is a Statistics tool.
NAGIOS is a Fault Detection tool.
In your previous post (http://www.raxnet.net/board/viewtopic.php?p=2131#2131), I stated that one of several NAGIOS plugins, NSCLIENT/CHECK_NT, is capable of grabbing many aspects of a Windows machine without using SNMP, including various "Performance Counters" that can be graphed in CACTI.
This is the "Statistics" behavior of NSCLIENT/CHECK_NT.
The remaining functionality is targeted entirely at "Fault Detection". Have a look at the documentation and you will see that almost all plugins require two parameters: -w to specify the Warning threshold and -c to specify the Critical threshold, with the aim of generating Alarms.
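To make the -w/-c convention concrete, here is a minimal sketch of a threshold check written in the style of a Nagios plugin. The exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) are the standard plugin convention; the `--value` flag and the function names are hypothetical and only there so the example runs without measuring anything real.

```python
import argparse

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(value, warn, crit):
    """Map a measured value onto a Nagios state (higher values are worse)."""
    if value >= crit:
        return CRITICAL, "CRITICAL"
    if value >= warn:
        return WARNING, "WARNING"
    return OK, "OK"


def main(argv=None):
    """Parse -w/-c, evaluate the value, print one status line, return the exit code."""
    parser = argparse.ArgumentParser(description="Toy threshold check")
    parser.add_argument("-w", type=float, required=True, help="warning threshold")
    parser.add_argument("-c", type=float, required=True, help="critical threshold")
    # Hypothetical flag: a real plugin would measure the value itself.
    parser.add_argument("--value", type=float, required=True, help="measured value")
    args = parser.parse_args(argv)
    code, label = evaluate(args.value, args.w, args.c)
    print(f"{label} - value is {args.value}")
    return code
```

Calling `main(["-w", "80", "-c", "90", "--value", "85"])` prints a WARNING line and returns 1, which a wrapper script would pass to `sys.exit`. Nagios only looks at the exit code and the first line of output, which is why such plugins are alarm-oriented rather than statistics-oriented.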
You can do very limited integration between them, as long as you use their own facilities, for example, to start "external" scripts (a facility that is superb in Cacti).
For now, I don't see a good way to integrate Cacti things inside Nagios (except for clicking a device and showing the corresponding Graph in Cacti).
The opposite (Nagios things inside Cacti) could be possible (?!)
Maybe after an unsuccessful Cacti poll, generate an alarm in Nagios. Or, when a traffic threshold is triggered, fire an alarm in Nagios.
I think Cacti and Nagios are the best open source tools in their areas. Having both combined in a single tool is a dream.
Or a nightmare for commercial competitors (Spectrum, Concord, etc.).
-Gilson
Nagios/Cacti Integration
I agree that both of these products are among the best Open Source products. I believe that together they complement each other in ways that make them very comparable to the commercial products.
What I am hoping to do is be able to utilize Nagios's plugins to return additional performance data and supply it to Cacti. This way you would be able to monitor the services and be able to see data trends through Cacti.
Nagios offers the possibility of this happening by having the "Performance Data Processing" turned on. Nagios would then be supplying the Performance Data to be processed by another application (which I am hoping to have Cacti fulfill.)
The Nagios documentation mentions that you should be able to have rrdtool squash the data, and this is why I am hoping to be able to use Cacti.
I am really new to Cacti and just starting to see the true benefits that it offers. It appears to me that Cacti should be able to take the information supplied by Nagios, dump it into a database and produce the desired graphs.
Is there something that I am not understanding correctly?
If I have a better idea on how to complete this task then I should be able to build the appropriate plugins to make it so this happens.
As an example of what I am seeing: the basic level of implementation would be similar to the Apache web logs. Cacti is able to process the Apache logs and produce effective graphs. Nagios can create log files with the performance data it has retrieved, and Cacti could then be called by crontab to process those log files, add the data to the rrdtool database and build the graphs. I see the Apache logs and the Nagios logs as being the same type of process.
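As a rough illustration of the log-processing idea, the sketch below parses one line of a performance data log into numbers that a cron job could then feed to rrdtool. Two things are assumptions rather than facts from this thread: the tab-separated line layout (timestamp, host, service, perfdata), which in practice depends on how Nagios is configured to write the file, and the `label=value[UOM];warn;crit;min;max` item format from the plugin guidelines. Quoted labels containing spaces are not handled.

```python
import re

# One perfdata item: label=value[UOM]; any ;warn;crit;min;max suffix is ignored.
PERF_ITEM = re.compile(
    r"(?P<label>[^\s=]+)=(?P<value>-?\d+(?:\.\d+)?)(?P<uom>[A-Za-z%]*)"
)


def parse_perfdata(perfdata):
    """Return {label: value} for every item found in a perfdata string."""
    return {
        m.group("label"): float(m.group("value"))
        for m in PERF_ITEM.finditer(perfdata)
    }


def parse_log_line(line):
    """Split an assumed 'timestamp<TAB>host<TAB>service<TAB>perfdata' line."""
    timestamp, host, service, perfdata = line.rstrip("\n").split("\t", 3)
    return int(timestamp), host, service, parse_perfdata(perfdata)
```

Each parsed line gives a timestamp plus one value per label, which maps naturally onto one RRD per host/service with one datasource per label.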
A better process would be to have Nagios call Cacti after it has received the performance data and have it automatically added into the rrdtool. This does have a down side in that it would have Nagios forking a lot of processes on large installations. (At the moment I have about 300 hosts and 1000 services that I am monitoring through Nagios.)
So let me know if I am out to lunch on this, but I do see Nagios and Cacti as a nice fit together.
Jazz
Jazzee,
I agree that these two tools fit together nicely (check out the integration between BMC Patrol and Best/1); however, I think there are better ways than nagios plugins to monitor Unix systems.
However, if you'd like to use those plugins to monitor a system, I would execute the script within a daemon and continually dump its output into an RRD. You would use Cacti to create/view your graphs, and Nagios to alarm on thresholds.
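A minimal sketch of that daemon loop, under some assumptions: `sample` is a hypothetical callable that runs the plugin and returns a number, the RRD file already exists, and the `rrdtool` command-line tool is on the PATH. The `run`, `sleep` and `clock` parameters exist only so the loop can be exercised without touching a real RRD.

```python
import subprocess
import time


def build_update_command(rrd_path, timestamp, value):
    """Build the argument list for `rrdtool update <file> <time>:<value>`."""
    return ["rrdtool", "update", rrd_path, f"{int(timestamp)}:{value}"]


def collect(sample, rrd_path, interval=300, iterations=None,
            run=subprocess.run, sleep=time.sleep, clock=time.time):
    """Call sample() every `interval` seconds and push each value into the RRD.

    iterations=None loops forever, which is the daemon case.
    Returns the number of samples taken.
    """
    done = 0
    while iterations is None or done < iterations:
        value = sample()
        run(build_update_command(rrd_path, clock(), value), check=False)
        done += 1
        if iterations is None or done < iterations:
            sleep(interval)
    return done
```

The interval should match the step the RRD was created with; the default of 300 seconds here is just Cacti's usual five-minute polling cycle, not something the original post specifies.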
To import the RRD into cacti, I wrote a dirty little script (I haven't had time to clean it up) that parses the RRD and imports it into cacti's MySQL database. You'd still need to create your graphs and graph hierarchy.
http://www.birch.net/~spiegela/import_rrd_cacti
(btw if you find bugs let me know so I can check them out)
To alarm on thresholds, the contrib dir of the latest nagios-plugins has a check_rrd_data.pl that allows you to search an RRD for alarmable regular expressions (I think), but it doesn't look very easy to use.
I'm almost finished writing a plugin to alarm on specific thresholds with one or more datasources. If multiple datasources are specified, their datapoints will be added together and then compared against the warning and critical thresholds. I should have it finished some time next week; I just have to figure out how to add up all the values based on what RRDs::fetch gives me.
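The summing step could be sketched like this. It is written in Python rather than Perl and is not the plugin described above; it only assumes the same data shape that RRDs::fetch hands back, namely a list of rows with one value per datasource and an undefined value (here `None`) where the RRD holds an unknown. Function names are hypothetical.

```python
def sum_rows(rows):
    """Add up the datasource values in each row, skipping rows with unknowns."""
    totals = []
    for row in rows:
        if any(value is None for value in row):
            continue  # an unknown datapoint would make the total misleading
        totals.append(sum(row))
    return totals


def check_latest(rows, warn, crit):
    """Compare the most recent complete total against the thresholds.

    Returns (exit_code, message) using the Nagios plugin exit codes.
    """
    totals = sum_rows(rows)
    if not totals:
        return 3, "UNKNOWN - no complete datapoints"
    latest = totals[-1]
    if latest >= crit:
        return 2, f"CRITICAL - total is {latest}"
    if latest >= warn:
        return 1, f"WARNING - total is {latest}"
    return 0, f"OK - total is {latest}"
```

Skipping incomplete rows matters because the newest row returned by a fetch is often still unknown; alarming on the last complete row avoids false UNKNOWN or OK results.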
As a quick side note, I'm also almost finished with my own performance collector for Solaris, using kstat.pm (which gets statistics straight from the kernel, like Best/1 does with C). The only part holding us up is creating the RRDs over the network on the server I have with cacti & nagios.
Anyone else have ideas?
Some serious netsaint_statd/nagios stuff is here!!!
First off - sorry if I seem annoyed. I've posted this description several times and, to an extent, it seems like no one notices. I have submitted the code and it has disappeared (meaning it was not posted by RAX to this site for everyone else to share). I do not know why. I have put about 4 months of effort into this and it works damn well. Anyway, I cannot actually cite Rax with any blame for this, as he's probably plenty busy and has his focus elsewhere. All I wanted to do was post some code so all could share, and this site provides no way to do that. All this because cacti is a totally great tool and this form of data collection works very well.
Another description of this is here - http://www.raxnet.net/board/viewtopic.php?t=585
I recently worked at a US bank where we used netsaint/nagios as a system monitoring tool. This was working well for fault detection. I was asked to find a way to start collecting performance data. We wanted to be able to visualize what's really going on on all the machines.
We elected to write more functions into the netsaint_statd perl script that executes on all the remote machines, and to have a small perl script on the monitoring server that would query netsaint_statd for some item. It would send the command to the machine's netsaint_statd daemon and get back the answer, whether this was cpu usage via vmstat, system load via uptime, disk io via iostat, or the size of a log file. We would get back the reply and cacti would stuff the data into the rrd files. We made no changes to cacti to integrate netsaint/nagios. We just used the same netsaint_statd daemon for netsaint monitoring and for cacti monitoring.
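For readers who have not seen this pattern, the small query script on the monitoring server might look roughly like the sketch below. It is in Python rather than perl and is not the code from the tar archive. The wire protocol is an assumption: one command line sent to the daemon, with the reply read until the daemon closes the connection. The host, port and command names are whatever your netsaint_statd is set up with.

```python
import socket


def exchange(sock, command):
    """Send one command line on a connected socket and read the whole reply."""
    sock.sendall(command.encode("ascii") + b"\n")
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:  # the remote side closed the connection
            break
        chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace").strip()


def query_statd(host, port, command, timeout=10.0):
    """Connect to the remote daemon, ask for one item and return the answer."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return exchange(sock, command)
```

A Cacti script data input only needs the value on standard output, so the wrapper around this can be as small as `print(query_statd(host, port, "load"))`, where `"load"` is a hypothetical command name. Compared with an ssh-per-sample script, this is one short TCP exchange per value.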
This idea works very well because:
With cacti as distributed, you have to write all your own little scripts in the scripts directory. This is a royal pain in the butt. They're all different and awkward to maintain. They do little in the way of efficiently allowing collection of data from a remote source. For example, seeing others write a script to ssh to a remote machine to collect the uptime or system load seems like pretty high overhead to me. And you do not want the collection of data to impact your measurements on the remote system. Scale all this up to many, many systems and it's not practical. In any case, you need all kinds of platform-specific code to collect what you want from the remote systems. Adding these kinds of functions into the netsaint_statd perl script proves to be great!!!! It keeps the stuff in a single place and allows easy distribution to the remote machines. All the various techniques for programming the data collection are in the one script, so it gets continuously easier to add new functions (well, sort of anyway). You are using a single source (daemon) on the remote machine for netsaint/nagios AND Cacti to talk to.
The netsaint/nagios daemon code only has enough functions built in to work for the kinds of stuff that netsaint/nagios queries. We have added many additional functions, primarily for cacti queries, but netsaint/nagios could also query the same functions if deemed suitable.
I am totally happy **REALLY** to share all this code. I have a tar archive that's easily emailed if anyone wants it. The archive has a couple of tiny perl utility scripts, but the main thing is a **drop in** replacement netsaint_statd perl daemon script. ALL extensions are for AIX, HP-UX and SunOS. I'll get to doing the Linux (Redhat flavor) stuff soonish (enthusiastically - for sure), sometime this summer.
You are best to email me directly and I will email you the tar file. It's less than 50 Kbytes and is for Unix systems. This system does nothing to collect data from Windoz machines.
The email address is integr8er AT ameritech DOT net. Allow some time for me to reply to you though. I look forward to others using this method and hearing about how it works, et al.
My apologies for not posting this sooner.
I apparently moved it to an 'In Progress' folder a while back and never looked at it since.
Either way, I posted it to the 'Additional Scripts' section of the cacti page. You can download it via this URL:
http://www.raxnet.net/downloads/scripts ... int.tar.gz
Thanks for your hard work, it looks like you put a lot of time into making this work.
-Ian