Extended system stats graphing- vmstat, iostat, mpstat (sar)
I've created some wrapper scripts and templates for fetching and graphing extended system information data in cacti. I'm looking to release this to the public but it's not quite polished up yet. I thought I'd post up some pictures of the graphs and gauge how well received the project is.
I'm not rushing to post it for one main reason: I've used NRPE rather than SNMP to fetch and return the data. I chose NRPE for two reasons:
1: SNMP sucks (we run a lot of system checks and graph lots of statistics, and this UDP-based, non-threaded solution doesn't scale well, which makes it too much of a risk)
2: It simplifies management overhead if I later choose to monitor any of the graphed values when they misbehave (we run a heavy Nagios shop over here)
Basically, on the host side you have to install and configure NRPE and add the commands. Wrapper scripts on the Cacti host call check_nrpe, which in turn fetches the data from the remote host. The output from NRPE is practically raw *stat output, which is then parsed into Cacti-friendly values for the poller to pick up.
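The flow described above could be sketched roughly as follows. This is a minimal Python illustration and not the actual wrapper scripts. The check_nrpe path is the one quoted elsewhere in this thread and may differ on your system; `build_command`, `fetch` and the `runner` parameter are hypothetical names introduced only for this example.

```python
import subprocess

# Path as quoted by one poster in this thread; adjust for your install.
CHECK_NRPE = "/usr/lib64/nagios/plugins/check_nrpe"


def build_command(host, nrpe_command):
    """Return the argv list asking NRPE on `host` to run `nrpe_command`."""
    return [CHECK_NRPE, "-H", host, "-c", nrpe_command]


def fetch(host, nrpe_command, runner=subprocess.run):
    """Run check_nrpe and return its stdout as stripped text.

    `runner` is injectable (an assumption of this sketch) so the
    function can be exercised without a real NRPE daemon.
    """
    result = runner(
        build_command(host, nrpe_command),
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout.strip()
```

A Cacti data input script built on this would call something like `print(fetch("myhost", "sar"))` and then reshape the returned text into whatever the poller expects.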
So far I am graphing the following stats:
CPU Usage
Memory Usage (I have made some new graph templates for these two which kick the UCD/Net-SNMP graphs' ASSES. so hard.)
Overall Disk Activity
Free Directory Cache Entries
Context Switches
Open file Handles
Open Sockets
Process Creation
Process Count
Run Queue
Memory Cache Hit ratio
and on the disk end I'm working on a script query to return full stats about disks including:
Average Queue Size
Average Request size (in blocks)
Average Wait time
Average Service time
Read/Write kB
Read/Write sectors
Read/Write Operations
Read/Write requests merged
CPU Utilisation for read/write
I'm currently working to fix up the iostat-based query, and then I'm going to do some general code cleanup before releasing a beta. If anyone is interested in this, let me know and perhaps I can hook you up with my alpha code for your hacking pleasure.
Attachments: graphs.png (58.34 KiB), graphs1.png (66.67 KiB), graphs2.png (132.92 KiB)
So, this has been pretty stable for me over the past few weeks. I'm going to do a little how-to writeup and post it here later today. Let me know how it goes for you all. Remember, this requires the following:
Sysstat package (RH RPM: sysstat) -- primarily the sar utility
Nagios NRPE Plugin -- http://sourceforge.net/projects/nrpe
The 'framework' is pretty general and can be adapted to use most of the *stat utilities, as I've done for iostat, though I have yet to get that working to my liking.
Sarparse 0.1 INITIAL RELEASE
sorry all for the delay -- I've had a few successful test cases with little to no issues on installs, and am now comfortable to release... enjoy!
-adam
Well, it looks as if I cannot post a tarball to the site. Private message me with your email address and I will send you the latest until I have a hosting solution.
-adam
sorry for the massive delay in response... tarball is hosted here:
http://frylab.com/~adamb/sarparse/
let me know how your experience goes
Question:
Why does it only return one line of text? For example:
/usr/lib64/nagios/plugins/check_nrpe -H myhost -c sar
Average: proc/s
When I run the actual command on the target host I get:
/usr/bin/sar -Brcquwv -n SOCK 1 1|/bin/grep Average
Average: proc/s
Average: 0.00
Average: cswch/s
Average: 0.00
Average: CPU %user %nice %system %iowait %idle
Average: all 57.21 0.00 11.94 0.00 30.85
Average: pgpgin/s pgpgout/s fault/s majflt/s
Average: 0.00 0.00 14.00 0.00
Average: kbmemfree kbmemused %memused kbbuffers kbcached kbswpfree kbswpused %swpused kbswpcad
Average: 2834784 1316492 31.71 39396 357016 6241024 96608 1.52 20628
Average: dentunusd file-sz inode-sz super-sz %super-sz dquot-sz %dquot-sz rtsig-sz %rtsig-sz
Average: 44639 1860 42772 0 0.00 0 0.00 0 0.00
Average: totsck tcpsck udpsck rawsck ip-frag
Average: 292 49 84 0 0
Average: runq-sz plist-sz ldavg-1 ldavg-5 ldavg-15
Average: 5 176 1.17 1.33 1.50
From the target host in nrpe.cfg:
# Cacti calls
command[sar]=/usr/bin/sar -Brcquwv -n SOCK 1 1|/bin/grep Average
command[iostat]=/usr/bin/iostat -dx|/usr/bin/tail -n +3|/usr/bin/head -n -1
command[sarparse_cpustat]=/usr/bin/sar -P ALL 1 1 |/bin/grep Average |/bin/grep -v all
marcmo wrote:
Question:
Why does it only return one line of text? For example:
/usr/lib64/nagios/plugins/check_nrpe -H myhost -c sar
Average: proc/s
Because NRPE only returns the first line of output.
A script for each command is needed to call the command and convert its multi-line output into a single line for NRPE to return to the Nagios/Cacti host. Sarparse etc. can then reformat the output to Cacti's taste if needed.
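As a rough illustration of such a flattening script, here is a minimal Python sketch. It assumes the input looks like the `sar ... | grep Average` output quoted in this thread, where header lines and value lines strictly alternate; `flatten_sar` is a hypothetical name and this is not how sarparse itself does it.

```python
def flatten_sar(text):
    """Pair each sar header line with the value line that follows it.

    Returns a single line of space-separated name:value pairs.
    Assumes strictly alternating header/value "Average:" lines.
    """
    # Keep only the "Average:" lines and drop that leading label.
    rows = [line.split()[1:] for line in text.splitlines()
            if line.startswith("Average:")]
    pairs = []
    # Even-indexed rows are headers, odd-indexed rows are their values.
    for names, values in zip(rows[0::2], rows[1::2]):
        pairs.extend(f"{name}:{value}" for name, value in zip(names, values))
    return " ".join(pairs)


sample = (
    "Average: proc/s\n"
    "Average: 0.00\n"
    "Average: cswch/s\n"
    "Average: 0.00\n"
)
print(flatten_sar(sample))  # proc/s:0.00 cswch/s:0.00
```

Output with several value rows per header, such as `sar -P ALL`, breaks the alternating assumption and would need different handling. Note also that the CPU row comes out as `CPU:all %user:57.21 ...`, which a real script would probably want to filter or rename.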
In my own situation, I'm getting the following from sarparse.pl:
Use of uninitialized value in concatenation (.) or string at sarparse.pl line 58.
Line 58 being:
$output=$output."$fields[$x]:$vals[$x] ";
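One possible cause of this warning, though nothing in the thread confirms it, is that the list of field names and the list of values end up with different lengths, so that some index has no counterpart. The idea of checking for that before concatenating can be sketched in Python (not the Perl of sarparse.pl); `pair_fields` is a hypothetical name used only here.

```python
def pair_fields(fields, vals):
    """Join names and values as name:value pairs.

    Raises ValueError on mismatched lengths instead of silently
    emitting a pair with a missing name or value.
    """
    if len(fields) != len(vals):
        raise ValueError(
            f"{len(fields)} field names but {len(vals)} values: "
            f"fields={fields!r} vals={vals!r}"
        )
    return " ".join(f"{name}:{val}" for name, val in zip(fields, vals))
```

Printing both lists when they disagree makes it easier to see whether the remote command returned a truncated line or an unexpected header.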
I am having an issue getting data into the graphs.
I've added an 'Open Sockets' graph to one of my hosts; however, no data is being graphed.
I have checked the perl script and I already use nagios/nrpe. It runs fine from the command line.
I also use the realtime plugin for cacti. Strangely, your graphs work within this plugin but not within the normal graphs section.
How can I troubleshoot this?
-Dave
You're right, piccili. The script for IOparse was not included.
Hope ahhdem can update his post.
BTW, I created a guide for installing these sar scripts on a SUSE Enterprise Server 10.
It teaches you how to install Cacti, the Nagios plugins and NRPE from your host to the remote server.
Perhaps some of you might be interested.
Attachment: CactiInstallations.doc (74 KiB)