AIX Monitoring Template via NMON

smaugs
Posts: 10
Joined: Mon Jan 09, 2012 3:07 am

AIX Monitoring Template via NMON

Post by smaugs »

Hi,

I've created a template for monitoring AIX Servers via NMON. It monitors many different aspects of an AIX Server including:
  • CPU (cumulative and per CPU statistics)
  • Memory (distribution and stats)
  • Disk (Busy [%] and Read/Write Rate)
  • IO Adapter Statistics (Read/Write Rate)
  • JFS Statistics (Used Inode and Filespace)
  • Net Statistics (Traffic In/Out)
  • Page Statistics (Page In/Out/Faults)
Features:
  • The existing interfaces/devices are read out dynamically via data queries.
  • The results are cached locally on the Cacti server, so graphing more than one item does not result in multiple SSH connections to the AIX box.
  • The NMON binary is chosen automatically depending on the OS version it runs on.
  • There are 2 ways of getting the data via nmon:
    the plugin connects via ssh and runs nmon every time (resource-consuming)
    nmon runs via crontab once per day and writes its results to a file; the plugin reads the parts of this file it needs for creating the graphs.
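The caching idea from the feature list can be sketched roughly as follows. This is only an illustration in shell, not the plugin's actual Perl code: the function names, the 5-minute lifetime and the paths in the example are assumptions.

```shell
# Hypothetical helper: succeeds if the cache file exists and was modified
# less than max_age_min minutes ago (uses find -mmin, present in GNU and BSD find).
cache_is_fresh() {
    cache_file=$1
    max_age_min=${2:-5}
    [ -f "$cache_file" ] && [ -n "$(find "$cache_file" -mmin "-$max_age_min" 2>/dev/null)" ]
}

# usage: cached_fetch <cache file> <max age in minutes> <command> [args...]
# Prints the cached data; runs the command only when the cache is stale.
cached_fetch() {
    cache_file=$1
    max_age_min=$2
    shift 2
    if ! cache_is_fresh "$cache_file" "$max_age_min"; then
        # Write to a temporary file first so a failed fetch never clobbers the cache.
        if "$@" > "$cache_file.tmp.$$"; then
            mv "$cache_file.tmp.$$" "$cache_file"
        else
            rm -f "$cache_file.tmp.$$"
            return 1
        fi
    fi
    cat "$cache_file"
}

# Example (hypothetical host, user and paths), not executed here:
#   cached_fetch /var/cache/cacti/aix01.nmon 5 ssh nagprobe@aix01 /path/to/collect_nmon
```

With something like this, several graphs polled in the same cycle would all read the same cache file, and only the first one would open an SSH connection.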
ToDos:
  • There are variables for nearly all parameters, but some parameters have to be specified manually (replaced) in the script(s).
  • It would be good to have a central configfile and not to specify the same things in every .pl file.
Example Graphs:
[Attached example graphs: disk write.png, jfs.png, memory distribution.png, net.png]
Install the plugin:
  1. Copy the scripts and resources to their corresponding directories.
  2. Generate a public/private key pair for your Cacti poller user and save it somewhere. There are many tutorials on the net and also in this forum on how to do it. Test whether the Cacti poller user can log in to the AIX boxes without needing a password. Also check that the user on the AIX box can read the nmon file and/or execute the nmon binary.
  3. Adjust the parameters in the .pl files.
    - nmondir: the directory where the nmon binaries are stored on the AIX box. The plugin will take care of executing the right one for your system.
    - cachedir: the directory (on the Cacti server) where the data should be cached
    - ssh_idfile: the SSH identity file for connecting to the AIX servers
    - nmon_runmode: this variable exists in some of the scripts. It specifies how the data is gathered from the AIX server (via crontab or directly executed). If the variable does not exist, simply create it. The default is to run as "daemon" (nmon is started via crontab); the alternative is "user" (nmon is executed directly).
  4. Adjust the SSH user in query_nmon.pm.
    I have not yet created a variable for specifying the user in the .pl files, so you have to modify it manually. Search for "sshcommand" - it should be easy to find and replace the "nagprobe" user. Be careful - there is more than one occurrence of the user in this file.
  5. Adjust the nmon tmpdir in query_nmon.pm.
    There is also no variable for the nmon_tmpdir - you have to replace it manually as well. It's the directory where the nmon files are placed when running via cron.
  6. Adjust "use lib" in the .pl files.
    You have to adjust the line

    use lib "/srv/www/htdocs/cacti/scripts";
    to point to the correct directory
  7. Test the script(s) manually on the command line:

    sudo -u [user of cacti poller] /usr/bin/perl /srv/www/htdocs/cacti/scripts/query_nmon_netstats.pl <servername> index
    and it should return something. If not, you can enable debugging by setting the variable $debug in the Perl script to a value greater than 1.
  8. Import Template
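The passwordless login test from step 2 can be scripted, for example like this. It is a minimal sketch under assumptions: the key path, the function name and the SSH_BIN override (only there to allow a dry run) are made up for illustration; the ssh options themselves are standard OpenSSH options.

```shell
# One-time setup, run as the Cacti poller user (key path is a placeholder):
#   ssh-keygen -t rsa -b 4096 -N "" -f "$HOME/.ssh/id_rsa_nmon"
# then append id_rsa_nmon.pub to ~/.ssh/authorized_keys of the remote user
# on each AIX box.

# usage: check_ssh_login <identity file> <remote user> <host> [remote command...]
# Succeeds only if the key-based login works without a password prompt.
# BatchMode=yes makes ssh fail instead of asking for a password.
check_ssh_login() {
    idfile=$1
    remote_user=$2
    host=$3
    shift 3
    # Default remote command: "true", which only proves that the login works.
    [ $# -gt 0 ] || set -- true
    "${SSH_BIN:-ssh}" -i "$idfile" -o BatchMode=yes -o ConnectTimeout=10 \
        "$remote_user@$host" "$@"
}

# Examples (hypothetical host and paths), not executed here:
#   check_ssh_login ~/.ssh/id_rsa_nmon nagprobe aix01 && echo "login ok"
#   check_ssh_login ~/.ssh/id_rsa_nmon nagprobe aix01 test -r /tmp/nmon
```

The second example also covers the last part of step 2, checking that the remote user can read the nmon directory.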
I hope I did not forget anything. Please let me know if you like the plugin or if you have any problems with it. Feature requests are also appreciated.
Attachments
AIX-NMON-0.1.zip
(33.19 KiB) Downloaded 1166 times
ntenzpunishment
Posts: 14
Joined: Tue Jun 30, 2009 4:00 am

Re: AIX Monitoring Template via NMON

Post by ntenzpunishment »

That sure looks good - I will try it out - thanks a lot
zeol
Posts: 10
Joined: Mon Apr 02, 2012 4:42 am

Re: AIX Monitoring Template via NMON

Post by zeol »

Hi

Man, You are great :)
This is the thing I've been looking for for a long, long time. In fact I even tried to create something similar myself, but with no luck.

I had some problems deploying this set of scripts / templates, but it looks like I've got it working now.
One thing - I couldn't manage to run it in 'user' mode: the scripts always expect a file in /tmp/nmon/ containing previously collected data.

If I run nmon from crontab and collect data to $nmon_dir, then it works fine

Once again - thank You VERY much.
smaugs
Posts: 10
Joined: Mon Jan 09, 2012 3:07 am

Re: AIX Monitoring Template via NMON

Post by smaugs »

Hi,

I'm glad that someone is using it :)

If you run it in "user" mode, you have to expect a fairly high CPU usage on the AIX boxes - that's why I added the "daemon" mode. If that's no problem for you, please send me the output in DEBUG mode.

best regards,

Flo
henrikwils
Posts: 13
Joined: Wed May 16, 2012 5:35 am

Re: AIX Monitoring Template via NMON

Post by henrikwils »

Can you please provide details on how to set this up in daemon mode?

You mentioned that you need to run nmon once per day through crontab, but which details should I put into crontab, and do I need to do this on the Cacti-server or each AIX? As far as I can see, nmon isn't executed from the script in that mode.

Also, how do you get historical information (e.g. CPU usage throughout a day) if it's only run once a day? Does it run in the background recording everything for a period of time? (Sorry, I'm not too familiar with nmon yet.)

What's the difference between the query_nmon_* and the nmon_* scripts?

I also noticed the problem when no files exist in /tmp/nmon. The script obviously expects proper information in a file there. What should I put in it for the script to work, at least until it has collected its own information?

Thanks a lot for your hard work on this. It looks really good! :)
erik.costa
Posts: 12
Joined: Mon Feb 07, 2011 9:01 am

Re: AIX Monitoring Template via NMON

Post by erik.costa »

I tried to import the template and got this error:

"Error: XML: Hash version does not exist."

I'm running Cacti version 0.8.7g. Which version of Cacti are you using?

Thanks

PS. If I try to run the command

"sudo -u [user of cacti poller] /usr/bin/perl /srv/www/htdocs/cacti/scripts/query_nmon_netstats.pl <servername> index"

I get the same error as zeol: it always tries to read /tmp/nmon.
smaugs
Posts: 10
Joined: Mon Jan 09, 2012 3:07 am

Re: AIX Monitoring Template via NMON

Post by smaugs »

Hi folks,

I've completely rewritten the whole bunch and will post the new version soon.

There are some performance improvements and I've reduced the whole bunch of scripts to one (and one perl module). I also removed the so called "user" running mode.

Regarding the question about the daemon: when running in daemon mode, nmon runs for a specified amount of time. I can't tell you exactly how it's been implemented on our side, because it was done by someone else. This is how nmon is started on our servers:

/usr/bin/topas_nmon -F /tmp/nmon/nmon-servername-20120530.nmon -T -r servername -s120 -c717 -youtput_dir=/tmp/nmon/nmon-servername-20120530.nmon -ystart_time=00:01:07,May30,2012

I'm sure you can do something similar on your systems. In our case it's started by a Perl script. Unfortunately I can't provide it, since I didn't write it. All you have to do is start nmon for a specified amount of time (or number of measurements). At work we also discussed running it indefinitely (or for a very long period), but then you have to take care of the single nmon file, which keeps growing.
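For anyone who has to build such a start script themselves, here is a rough sketch, presumably to be run on each AIX box. It is not the Perl script mentioned above: the function names, the crontab schedule and the wrapper path are assumptions. It only uses the -F, -T, -r, -s and -c flags from the command line shown above; the -y arguments are left out because it is not clear whether they have to be passed explicitly. Note that -s120 -c717 covers 120 × 717 = 86040 seconds, i.e. six minutes short of a full day.

```shell
# Number of snapshots that fit into one day for a given interval (seconds),
# leaving a safety margin before midnight. The 360 s default margin is an
# assumption, chosen so that a 120 s interval yields the 717 snapshots above.
snapshots_per_day() {
    interval=$1
    margin=${2:-360}
    echo $(( (86400 - margin) / interval ))
}

# Print (not run) the nmon command line for one host and one day.
# usage: nmon_daily_cmd <host> <YYYYMMDD> [output dir] [interval in seconds]
nmon_daily_cmd() {
    host=$1
    day=$2
    outdir=${3:-/tmp/nmon}
    interval=${4:-120}
    count=$(snapshots_per_day "$interval")
    echo "/usr/bin/topas_nmon -F $outdir/nmon-$host-$day.nmon -T -r $host -s$interval -c$count"
}

# A wrapper script started from crontab shortly after midnight could then run:
#   eval "$(nmon_daily_cmd "$(hostname)" "$(date +%Y%m%d)")"
# Hypothetical crontab entry (script path is a placeholder):
#   1 0 * * * /usr/local/bin/start_nmon.sh
```

Because nmon keeps writing snapshots to the file for the whole day, a single start per day is enough to get continuous history in the graphs.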

best regards,

Florian
henrikwils
Posts: 13
Joined: Wed May 16, 2012 5:35 am

Re: AIX Monitoring Template via NMON

Post by henrikwils »

We are waiting eagerly :)
smaugs
Posts: 10
Joined: Mon Jan 09, 2012 3:07 am

Re: AIX Monitoring Template via NMON

Post by smaugs »

Sorry for the delay,

here is the current version. As written in a previous post, I have rewritten the whole thing and reduced it to 3 files.

mod_query_nmon.pm - Perl Module with all needed functions
parse_nmon_cacti.sh - shell script which is executed on the aix boxes for gathering nmon-data
query_nmon.pl - perl script which is called from cacti for gathering values

For performance reasons, the cache files are now stored in Perl Storable format, not plain text anymore. Since it's now written more dynamically, it should be pretty easy to monitor new NMON attributes.

If you are upgrading from the previous version, you have to modify all your data input methods to reflect the changes (or simply delete all previous sources, templates, ... and reimport the attached XML).

In any case you have to replace the files in the script_queries directory; they have also changed a bit.

If you want to avoid any gaps in your graphs, I would strongly recommend testing the changes on a separate Cacti instance and then changing the data input methods manually one by one to see if it works.

query_nmon.pl now comes with a DUMPINFO parameter, which simply prints all gathered data as a Perl Data::Dumper structure for easy reading. It's pretty helpful for debugging. Something you have to know: the nmon format is fairly complex. For that reason I've implemented two nmon line types in the Perl module: simple and complex. Simple types are things like memory usage. Complex types are things like net statistics.

Why: in Cacti there is a difference between normal data input methods and data queries. Data input methods gather the same type of data every time, without any changes - perfect for memory.
Data queries are for dynamic attributes, i.e. devices such as disks and network interfaces that differ from server to server.

So for simple types I only have to use:

e.g. query_nmon.pl <hostname> MEMORY

and get back the typical cacti output for all parameters of memory.
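For readers new to Cacti: a data input method script with several outputs prints them as space-separated name:value pairs on a single line. A minimal sketch of that formatting step, assuming the values are already available as "name value" lines; the function name and the field names in the example are made up and are not the ones query_nmon.pl uses.

```shell
# Reads "name value" lines on stdin and prints them on one line as
# space-separated name:value pairs, the format Cacti expects from a
# data input method with multiple output fields.
to_cacti_fields() {
    awk '{ printf "%s%s:%s", (NR > 1 ? " " : ""), $1, $2 }
         END { if (NR) print "" }'
}

# Example with hypothetical memory fields:
#   printf '%s\n' 'real_free 1024' 'virtual_free 2048' | to_cacti_fields
```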

For disks, for example, Cacti has to query which disks exist, what their names are and so on. Therefore you have to use a different syntax:

e.g. query_nmon.pl <hostname> DISKS index

I've written a little syntax description for query_nmon.pl:
axfme01@sax82142:/srv/www/vhosts/cacti/scripts> ./query_nmon.pl
for displaying all available information: query_nmon.pl <servername or IP> DUMPINFO

Usage for simple types: query_nmon.pl <servername or IP> <category>
Usage for complex types: query_nmon.pl <servername or IP> <category> get <attribute1> <attribute2>
Usage for complex types: query_nmon.pl <servername or IP> <category> query <attribute1>
Usage for complex types: query_nmon.pl <servername or IP> <category> <index|num_indexes>
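To make the four modes for complex types more concrete, here is a toy re-implementation in shell over a hard-coded table. It only illustrates the calling convention listed above and is not how query_nmon.pl works internally; the sample adapters, values and function names are invented.

```shell
# Illustrative sample data, one line per adapter: "index name xfer-tps".
sample_ioadapt() {
    printf '%s\n' '0 fcs0 0.0' '1 fcs1 2.2'
}

# Column number of an attribute in the sample table (hypothetical attributes).
attr_column() {
    case $1 in
        name)     echo 2 ;;
        xfer-tps) echo 3 ;;
        *)        return 1 ;;
    esac
}

# usage: complex_query index | num_indexes | query <attr> | get <attr> <index>
complex_query() {
    case $1 in
        index)       sample_ioadapt | awk '{ print $1 }' ;;
        num_indexes) sample_ioadapt | awk 'END { print NR }' ;;
        query)
            # One "index:value" line per device.
            col=$(attr_column "$2") || return 1
            sample_ioadapt | awk -v c="$col" '{ print $1 ":" $c }' ;;
        get)
            # A single value for one attribute of one device.
            col=$(attr_column "$2") || return 1
            sample_ioadapt | awk -v c="$col" -v i="$3" '$1 == i { print $c }' ;;
        *)
            echo "usage: complex_query index|num_indexes|query <attr>|get <attr> <index>" >&2
            return 1 ;;
    esac
}
```

Note that "get" needs both the attribute name and the index; calling it with only a number, as in `complex_query get 1`, fails in this sketch because "1" is not a known attribute.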
If you have any questions, please ask :D
Attachments
nmon_v2.zip
(31.73 KiB) Downloaded 572 times
henrikwils
Posts: 13
Joined: Wed May 16, 2012 5:35 am

Re: AIX Monitoring Template via NMON

Post by henrikwils »

Thank you VERY MUCH! :D This is great work!

I got it working after some fiddling. Something was working against me, and I think it was a combination of old files still residing on the Cacti server and old settings in Cacti. I removed it all and started all over, and now it's working. I got the simple graphs working right away, but I have some trouble figuring out the advanced ones.

If I type query_nmon.pl myservername IOADAPT get 1 (or another number), it returns nothing. If I type "query" instead, it fails with "Use of uninitialized value". "index" and "num_indexes" work fine, but I cannot retrieve the actual data. Am I doing something wrong? I can see the data with DUMPINFO and assume that the array number is the number I should use?

Also, how do I get them graphed in Cacti? Can I load them in easily like a template, or do I have to specify them manually for each device?
smaugs
Posts: 10
Joined: Mon Jan 09, 2012 3:07 am

Re: AIX Monitoring Template via NMON

Post by smaugs »

Hi,

Can you post the output of DUMPINFO?

The advanced types can be handled manually, as you said, or via so-called Data Queries.

Data Queries are defined as XML files. In the XML file you specify the script to be called and how it retrieves the available items, e.g. device names. Cacti then uses this XML query to find out which devices can be monitored. All you have to do after that is add the appropriate graphs to your host (via "Create Graphs for this Host").

The attached file already contains XML files for

CPU usage (per CPU),
Disk IO,
IO Adapter stats,
Filesystem Stats,
...

You can have a look at the config and create your own.

best regards,

Flo
henrikwils
Posts: 13
Joined: Wed May 16, 2012 5:35 am

Re: AIX Monitoring Template via NMON

Post by henrikwils »

Output of DUMPINFO is attached.
Attachments
nmonoutput2.txt
(65.9 KiB) Downloaded 516 times
Last edited by henrikwils on Fri Jun 22, 2012 3:54 am, edited 1 time in total.
henrikwils
Posts: 13
Joined: Wed May 16, 2012 5:35 am

Re: AIX Monitoring Template via NMON

Post by henrikwils »

On our AIX servers, we are running multiple instances of a service, each one running under its own user. Would it be possible to gather the CPU usage occupied by each user?
smaugs
Posts: 10
Joined: Mon Jan 09, 2012 3:07 am

Re: AIX Monitoring Template via NMON

Post by smaugs »

Hi,

here are 2 examples of using query_nmon with the advanced types:

sax82142:~ # sudo -u cacti /usr/bin/perl /srv/www/vhosts/cacti/scripts/query_nmon.pl 10.72.16.43 IOADAPT query xfer-tps
0:0.0
1:2.2
sax82142:~ # sudo -u cacti /usr/bin/perl /srv/www/vhosts/cacti/scripts/query_nmon.pl 10.72.16.43 IOADAPT get xfer-tps 0
0.0
should work that way ...
smaugs
Posts: 10
Joined: Mon Jan 09, 2012 3:07 am

Re: AIX Monitoring Template via NMON

Post by smaugs »

henrikwils wrote: On our AIX servers, we are running multiple instances of a service, each one running under its own user. Would it be possible to gather the CPU usage occupied by each user?
AFAIK it's not possible with NMON, since this type of data is not gathered by the nmon service. But it's definitely possible with other tools. One example would be to use ps and monitor the CPU time used by the processes in question.
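A rough sketch of the ps approach, under assumptions: it sums the CPU percentage (pcpu) per user instead of the accumulated CPU time, simply because percentages are easier to add up, and the function name is made up. The ps format names user and pcpu are the POSIX ones; it is worth verifying them on your AIX level.

```shell
# Reads "user pcpu" pairs on stdin and prints one "user:total" line per user,
# sorted by user name so the output order is stable.
sum_cpu_by_user() {
    awk '{ total[$1] += $2 }
         END { for (u in total) printf "%s:%.1f\n", u, total[u] }' |
        LC_ALL=C sort
}

# Typical use on the AIX box (the trailing "=" suppresses the header line):
#   ps -e -o user= -o pcpu= | sum_cpu_by_user
```

The user:value lines could then be fed to Cacti in the same way as the other nmon values.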

tell me if you need more info :)

ciao, flo