Monitoring smartctl (smartmontools)

llneoll
Posts: 2
Joined: Fri Oct 26, 2007 6:22 pm

Post by llneoll »

Hello,

I wrote a script to monitor some interesting values from the smartctl (smartmontools) output.

Please let me know if you find this useful.

A detailed description will follow.

This is my first Cacti graph. Feel free to give advice.


INSTALL:
Put this into /etc/sudoers:
CACTI_USER_THE_POLLER_RUNS ALL= NOPASSWD: /usr/sbin/smartctl


1. Copy the script smartcheck.php to /usr/share/cacti/site/scripts
(or wherever your Cacti installation path points to)

2. Import the templates

3. Make a new graph - choose Unix - smartmon XXX

4. Enter the device you want to monitor

5. Let me know if there are problems or if you were successful

BTW: I was at the German "Systems" exhibition and found the Cacti stand. Thanks for the inspiration!
Attachments
graph_image.php.png (24.58 KiB)
cacti_smartmon.tar.gz (4.52 KiB)
graph_image2.php.png (20.13 KiB)
graph_image3.php.png (22.45 KiB)
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

Hey, a new user contributing templates in next to no time!
Thank you
Reinhard
kapa
Posts: 7
Joined: Tue Oct 23, 2007 3:25 am
Location: Moscow

Post by kapa »

Does it work only on localhost?
Chaosratt
Posts: 34
Joined: Sun Oct 28, 2007 3:31 am
Location: St. Pete, FL

Post by Chaosratt »

Yes, only on localhost. There is no SNMP here, since it's just a PHP script parsing the output of smartctl.


Of note, the templates were a bit wacky at first; I had to insert some hard returns into them to make them sort nicely.
Also, for some reason my HDD returns the value "Power_On_Minutes" rather than the "Power_On_Hours" that this script/template expects.
Run smartctl -a <device> yourself to check. If that is the case, just open the script and templates and change every occurrence of "Hours" to "Minutes".
And for whatever reason my drive does not return a "Raw_Read_Error_Rate" value, so I just hacked that out of the graph template.


Woot for my first attempt at modifying graphs/templates/scripts!
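
Since the attribute names differ between drives, as noted in the post above, it can help to check which of the names the script expects are actually reported. The following is a minimal Python sketch, not the attached PHP script. The sample text only imitates the column layout of the attribute table that smartctl prints, and the names EXPECTED, SAMPLE, reported_attributes and missing_attributes are made up for illustration.

```python
# Illustrative only: checks which expected SMART attribute names appear
# in text shaped like the attribute table printed by smartctl.

EXPECTED = [
    "Temperature_Celsius",
    "Reallocated_Sector_Ct",
    "Power_On_Hours",
    "Spin_Retry_Count",
    "Seek_Error_Rate",
    "Raw_Read_Error_Rate",
]

# Made-up sample; real output varies by drive and smartmontools version.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   100   100   030    Pre-fail  Always       -       0
  9 Power_On_Minutes        0x0032   096   096   000    Old_age   Always       -       207900
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
194 Temperature_Celsius     0x0022   047   053   000    Old_age   Always       -       47
"""


def reported_attributes(text):
    """Return {attribute_name: raw_value} for rows that start with a numeric ID."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns and begin with the numeric ID.
        if len(fields) >= 10 and fields[0].isdigit():
            found[fields[1]] = fields[9]
    return found


def missing_attributes(text, expected=EXPECTED):
    """List the expected names that do not appear in the table."""
    found = reported_attributes(text)
    return [name for name in expected if name not in found]


if __name__ == "__main__":
    print("reported:", sorted(reported_attributes(SAMPLE)))
    print("missing: ", missing_attributes(SAMPLE))
```

With the made-up sample, the two names flagged as missing are the same two that had to be changed or removed by hand in the post above.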
llneoll
Posts: 2
Joined: Fri Oct 26, 2007 6:22 pm

Post by llneoll »

Wow, thanks for the quick reply!
That's exactly what I had forgotten.
Chaosratt wrote:Of note, the templates were a bit wacky at first, I had to insert some hard returns into them to make them sort nicely.
What did you sort exactly? The output of the script? Can you send me the updated version?

Chaosratt wrote:Also, for some reason my HDD returns the value "Power_On_Minutes" rather than "Power_On_Hours" that this script/template expects.
I have a disk that returns Power On Minutes too, but most of the others return Power On Hours.
You can also run into trouble because not all hard disk vendors support the same SMART attributes.

BTW: if the script does not detect a SMART value, it defaults to zero.
So you have to check manually with "smartctl --all" whether your hard disk supports all values.

But I think the script can be easily adjusted.
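
The defaulting behaviour described above means that a graph cannot tell an unsupported attribute from a genuine zero. Here is a hedged Python sketch of the idea. It is not the original PHP, and format_for_cacti and its arguments are hypothetical names. It builds the name:value line and also reports which names were filled in by default.

```python
def format_for_cacti(values, expected):
    """Build a space-separated name:value line, using 0 for missing names.

    Returns the line plus the list of names that fell back to the default,
    so a caller can warn about them instead of silently graphing zeros.
    """
    defaulted = [name for name in expected if name not in values]
    parts = ["%s:%s" % (name, values.get(name, 0)) for name in expected]
    return " ".join(parts), defaulted


if __name__ == "__main__":
    expected = ["Temperature_Celsius", "Power_On_Hours", "Raw_Read_Error_Rate"]
    values = {"Temperature_Celsius": 47, "Power_On_Hours": 3465}
    line, defaulted = format_for_cacti(values, expected)
    print(line)
    print("defaulted to zero:", defaulted)
```

Returning the defaulted names separately keeps the output line in the same shape while still making the fallback visible to whoever runs the check by hand.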

BW
Chaosratt
Posts: 34
Joined: Sun Oct 28, 2007 3:31 am
Location: St. Pete, FL

Post by Chaosratt »

llneoll wrote:What did you sort exactly? The output of the Script? Can you send me the updated version?
I just modified the graph template a little after taking out the value that my drive did not report. For some reason all the values were jumbled together on the graph. I added hard returns (<HR>) at the end of each one and they sorted properly.


I attached the modified template I use now.
Attachments
smartmon_errors_modified.xml
(12.27 KiB) Downloaded 473 times
tamias
Posts: 49
Joined: Thu Oct 16, 2008 7:12 am

Post by tamias »

Hi!
When I run the script from the command line, all the SMART parameters are displayed correctly:

Code: Select all

$ /opt/local/share/cacti/scripts/smartcheck.php /dev/disk0
Temperature_Celsius:47 Reallocated_Sector_Ct:0 Power_On_Hours:3465 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0
But on the graph all values are zero, and in the log file all the values are zero as well:

Code: Select all


01/15/2010 10:35:13 AM - CMDPHP: Poller[0] DEVEL: SQL Exec: "insert into poller_output (local_data_id, rrd_name, time, output) values (385, '', '2010-01-15 10:35:01', 'Temperature_Celsius:0 Reallocated_Sector_Ct:0 Power_On_Hours:0 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0')"
01/15/2010 10:35:13 AM - CMDPHP: Poller[0] Host[1] Description[Localhost (iMac24)] DS[385] Graphs['Unix - smartmon Errors '] CMD: php /opt/local/share/cacti/scripts/smartcheck.php /dev/disk0, output: Temperature_Celsius:0 Reallocated_Sector_Ct:0 Power_On_Hours:0 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0
01/15/2010 10:35:13 AM - CMDPHP: Poller[0] DEVEL: SQL Exec: "insert into poller_output (local_data_id, rrd_name, time, output) values (384, '', '2010-01-15 10:35:01', 'Temperature_Celsius:0 Reallocated_Sector_Ct:0 Power_On_Hours:0 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0')"
01/15/2010 10:35:13 AM - CMDPHP: Poller[0] Host[1] Description[Localhost (iMac24)] DS[384] Graphs['Unix - smartmon Errors '] CMD: php /opt/local/share/cacti/scripts/smartcheck.php /dev/disk0, output: Temperature_Celsius:0 Reallocated_Sector_Ct:0 Power_On_Hours:0 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0

What am I doing wrong? Please help.
Regards, Michail A.
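
The output: part of the poller log lines above has the same name:value shape as the command-line output, so the two can be compared mechanically. A small Python sketch follows. parse_fields and all_zero are hypothetical helper names and not part of Cacti, and the two strings are copied from the post above.

```python
def parse_fields(output):
    """Split 'Name:1 Other:2' into {'Name': 1.0, 'Other': 2.0}."""
    fields = {}
    for token in output.split():
        name, sep, value = token.partition(":")
        if not sep:
            continue  # token without a colon
        try:
            fields[name] = float(value)
        except ValueError:
            continue  # non-numeric value, ignored in this sketch
    return fields


def all_zero(output):
    """True when at least one field was parsed and every field is zero."""
    fields = parse_fields(output)
    return bool(fields) and all(v == 0 for v in fields.values())


if __name__ == "__main__":
    cli = ("Temperature_Celsius:47 Reallocated_Sector_Ct:0 Power_On_Hours:3465 "
           "Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0")
    poller = ("Temperature_Celsius:0 Reallocated_Sector_Ct:0 Power_On_Hours:0 "
              "Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0")
    print("command line all zero:", all_zero(cli))
    print("poller all zero:", all_zero(poller))
```

A temperature and a power-on counter of exactly zero are implausible for a running disk, which is why an all-zero line is a useful signal that the script fell back to its defaults.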
BSOD2600
Cacti Moderator
Posts: 12171
Joined: Sat May 08, 2004 12:44 pm
Location: USA

Post by BSOD2600 »

As you see from the logs, the program isn't properly running and retrieving the data. Verify the permissions are correct.
tamias
Posts: 49
Joined: Thu Oct 16, 2008 7:12 am

Post by tamias »

BSOD2600 wrote:As you see from the logs, the program isn't properly running and retrieving the data. Verify the permissions are correct.
Thanks for the reply!
I have a few other PHP scripts and they are working properly. For example this:

Code: Select all

01/15/2010 02:40:01 PM - CMDPHP: Poller[0] Host[1] Description[Localhost (iMac24)] DS[311] Graphs['APC Back-UPS 650 CS Line Statistics'] CMD: /opt/local/bin/php /opt/local/share/cacti/scripts/query_apcupsd.php 127.0.0.1 3551, output: LINEV:224.0 LOADPCT:30.0 BCHARGE:100.0 TIMELEFT:18.8 LOTRANS:196.0 HITRANS:256.0 ITEMP:29.2 BATTV:13.5 LINEFREQ:50.0 TONBATT:0 NOMINV:230 NOMBATTV:12.0
01/15/2010 02:40:01 PM - CMDPHP: Poller[0] Host[1] Description[Localhost (iMac24)] DS[310] Graphs['APC BackUPS 650 CS Battery Statistics'] CMD: /opt/local/bin/php /opt/local/share/cacti/scripts/query_apcupsd.php 127.0.0.1 3551, output: LINEV:224.0 LOADPCT:30.0 BCHARGE:100.0 TIMELEFT:18.8 LOTRANS:196.0 HITRANS:256.0 ITEMP:29.2 BATTV:13.5 LINEFREQ:50.0 TONBATT:0 NOMINV:230 NOMBATTV:12.0

Their permissions are exactly the same as those of smartcheck.php.
And following the recommendation in the script:

Code: Select all

 * Don't forget to include cacti user in sudo!
 * include this into the /etc/sudoers file!!
 * For example: 
 *cactiuser  ALL= NOPASSWD: /usr/local/sbin/smartctl
I added this line to /etc/sudoers:

Code: Select all

mihailartuhov  ALL= NOPASSWD: /usr/local/sbin/smartctl
Here is an excerpt from my global.php:

Code: Select all

/* Default database settings*/
$database_type = "mysql";
$database_default = "cactidb";
$database_hostname = "localhost";
$database_username = "mihailartuhov";
$database_password = "amksoft";
$database_port = "3306";
Regards, Michail A.
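
A sudoers rule of the form shown above only helps if its user is the account the poller actually runs as and its path is the smartctl binary the script actually calls. The first post used /usr/sbin/smartctl and this one uses /usr/local/sbin/smartctl. Whether either point is the cause here is not known from the thread. The Python sketch below uses hypothetical names and handles only this one simple rule shape, not the full sudoers grammar. It just splits such a rule so the two values can be compared with the real ones.

```python
import re

# Matches only the simple shape used in this thread:
#   USER ALL= NOPASSWD: /path/to/command
RULE_RE = re.compile(r"^\s*(\S+)\s+ALL\s*=\s*NOPASSWD:\s*(\S+)\s*$")


def parse_rule(line):
    """Return (user, command_path), or None if the line has another shape."""
    match = RULE_RE.match(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


def rule_matches(line, poller_user, smartctl_path):
    """True when the rule names exactly this user and this command path."""
    return parse_rule(line) == (poller_user, smartctl_path)


if __name__ == "__main__":
    rule = "mihailartuhov  ALL= NOPASSWD: /usr/local/sbin/smartctl"
    print(parse_rule(rule))
    # The user and path passed here are examples to compare against.
    print(rule_matches(rule, "mihailartuhov", "/usr/sbin/smartctl"))
```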
tamias
Posts: 49
Joined: Thu Oct 16, 2008 7:12 am

Post by tamias »

BSOD2600 wrote:As you see from the logs, the program isn't properly
I do not see any errors in the log, even in debug mode, except that all values are 0.
Regards, Michail A.
BSOD2600
Cacti Moderator
Posts: 12171
Joined: Sat May 08, 2004 12:44 pm
Location: USA

Post by BSOD2600 »

With debugging enabled, find the place in the cacti.log file where the smartcheck script actually runs. What is the result / output?
tamias
Posts: 49
Joined: Thu Oct 16, 2008 7:12 am

Post by tamias »

BSOD2600 wrote:With debugging enabled, find the place in the cacti.log file where the smartcheck script actually runs. What is the result / output?
This is what I found (in devel mode):

Code: Select all

01/16/2010 10:05:12 AM - CMDPHP: Poller[0] DEVEL: SQL Assoc: "select data_template_rrd.data_source_name, data_input_fields.data_name from (data_template_rrd,data_input_fields) where data_template_rrd.data_input_field_id=data_input_fields.id and data_template_rrd.local_data_id=384"
01/16/2010 10:05:12 AM - CMDPHP: Poller[0] DEVEL: SQL Assoc: "select data_template_rrd.data_source_name, data_input_fields.data_name from (data_template_rrd,data_input_fields) where data_template_rrd.data_input_field_id=data_input_fields.id and data_template_rrd.local_data_id=385"
01/16/2010 10:05:12 AM - CMDPHP: Poller[0] DEVEL: SQL Assoc: "select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
01/16/2010 10:05:12 AM - CMDPHP: Poller[0] DEVEL: SQL Assoc: "select poller_id,end_time from poller_time where poller_id=0"
01/16/2010 10:05:12 AM - CMDPHP: Poller[0] DEVEL: SQL Exec: "insert into poller_output (local_data_id, rrd_name, time, output) values (385, '', '2010-01-16 10:05:01', 'Temperature_Celsius:0 Reallocated_Sector_Ct:0 Power_On_Hours:0 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0')"
01/16/2010 10:05:12 AM - CMDPHP: Poller[0] Host[1] Description[Localhost (iMac24)] DS[385] Graphs['Unix - smartmon Errors '] CMD: /opt/local/bin/php /opt/local/share/cacti/scripts/smartcheck.php /dev/disk0, output: Temperature_Celsius:0 Reallocated_Sector_Ct:0 Power_On_Hours:0 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0
01/16/2010 10:05:12 AM - CMDPHP: Poller[0] DEVEL: SQL Exec: "insert into poller_output (local_data_id, rrd_name, time, output) values (384, '', '2010-01-16 10:05:01', 'Temperature_Celsius:0 Reallocated_Sector_Ct:0 Power_On_Hours:0 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0')"
01/16/2010 10:05:12 AM - CMDPHP: Poller[0] Host[1] Description[Localhost (iMac24)] DS[384] Graphs['Unix - smartmon Errors '] CMD: /opt/local/bin/php /opt/local/share/cacti/scripts/smartcheck.php /dev/disk0, output: Temperature_Celsius:0 Reallocated_Sector_Ct:0 Power_On_Hours:0 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0

Here is my log for one cycle in debug mode:
http://pastie.org/780394

Thanks!
Regards, Michail A.
BSOD2600
Cacti Moderator
Posts: 12171
Joined: Sat May 08, 2004 12:44 pm
Location: USA

Post by BSOD2600 »

Code: Select all

01/16/2010 10:05:12 AM - CMDPHP: Poller[0] Host[1] Description[Localhost (iMac24)] DS[385] Graphs['Unix - smartmon Errors '] CMD: /opt/local/bin/php /opt/local/share/cacti/scripts/smartcheck.php /dev/disk0, output: Temperature_Celsius:0 Reallocated_Sector_Ct:0 Power_On_Hours:0 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0 
There you have it. The script isn't returning values when run via the Cacti poller.
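
One way to narrow this down is to run the exact command from the log under the account the poller uses and look at the exit status and stderr, which the poller log line does not show. This is a minimal Python sketch under that assumption. run_and_report is a hypothetical helper, and the demo uses a harmless stand-in command so that it runs anywhere.

```python
import subprocess
import sys


def run_and_report(cmd, timeout=30):
    """Run cmd (a list of arguments) and return exit code, stdout and stderr."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text
        timeout=timeout,
    )
    return {
        "returncode": result.returncode,
        "stdout": result.stdout.strip(),
        "stderr": result.stderr.strip(),
    }


if __name__ == "__main__":
    # Stand-in command that writes to stderr and exits with status 1.
    demo = [sys.executable, "-c",
            "import sys; sys.stderr.write('demo error'); sys.exit(1)"]
    print(run_and_report(demo))
```

To apply it to this thread, the stand-in would be replaced by the command from the log line, that is /opt/local/bin/php /opt/local/share/cacti/scripts/smartcheck.php /dev/disk0 passed as a list of three strings, and run as the poller's account. Anything in stderr or a non-zero exit code is a hint about why the values come back as zero. The thread itself does not establish the cause.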