Monitoring smartctl (smartmontools)
Hello,
I wrote a script to monitor some interesting values from the smartctl (smartmontools) output.
Please let me know if you find this useful.
A detailed description will follow.
This is my first Cacti graph. Feel free to give advice.
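For readers who want to see the general idea, here is a minimal sketch in Python (not the attached smartcheck.php) of how the attribute table from smartctl -A can be turned into the space-separated name:value line that Cacti reads. The SAMPLE text is a hypothetical excerpt for illustration, and the function names parse_attributes and cacti_line are my own.

```python
# Sketch only: parse the attribute table printed by `smartctl -A`
# and build the "name:value name:value" line a Cacti script returns.
# SAMPLE is a hypothetical excerpt, not output from a real drive.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   051    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       3465
194 Temperature_Celsius     0x0022   053   045   000    Old_age   Always       -       47 (Min/Max 20/55)
"""


def parse_attributes(text):
    """Return {attribute_name: raw_value} for every attribute row."""
    values = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header line starts with "ID#" and is skipped.
        if len(fields) >= 10 and fields[0].isdigit():
            raw = fields[9]
            if raw.isdigit():
                values[fields[1]] = int(raw)
    return values


def cacti_line(values, wanted):
    """Format the wanted attributes; anything not found is reported as 0."""
    return " ".join(f"{name}:{values.get(name, 0)}" for name in wanted)


if __name__ == "__main__":
    attrs = parse_attributes(SAMPLE)
    print(cacti_line(attrs, ["Temperature_Celsius", "Power_On_Hours"]))
```

In a real poller script the text would come from running smartctl through sudo instead of from SAMPLE.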
INSTALL:
Put this into /etc/sudoers:
CACTI_USER_THE_POLLER_RUNS ALL= NOPASSWD: /usr/sbin/smartctl
1. Copy the script smartcheck.php to /usr/share/cacti/site/scripts
(or wherever your Cacti installation path points to)
2. Import the templates
3. Make a new graph - choose Unix - smartmon XXX
4. Enter the device you want to monitor
5. Let me know if there are problems or if you were successful
BTW: I was at the German "Systems" (an exhibition) and found the Cacti stand. Thanks for the inspiration!
Attachments:
- graph_image.php.png (24.58 KiB)
- cacti_smartmon.tar.gz (4.52 KiB)
- graph_image2.php.png (20.13 KiB)
- graph_image3.php.png (22.45 KiB)
Yeah, only on localhost; no SNMP here, since it's just a PHP script parsing the output of smartctl.
Of note, the templates were a bit wacky at first; I had to insert some hard returns into them to make them sort nicely.
Also, for some reason my HDD returns the value "Power_On_Minutes" rather than the "Power_On_Hours" that this script/template expects.
Run smartctl -a <device> yourself to check. If that is the case, just open the script and templates and change every occurrence of "Hours" to "Minutes".
And for whatever reason my drive does not return a "Raw_Read_Error_Rate" value, so I just hacked that out of the graph template.
Woot for my first attempt at modifying graphs/templates/scripts!
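An alternative to renaming everything by hand would be to let the script accept both spellings. The following is only a sketch in Python of that idea, not how smartcheck.php works; the ALIASES table and the normalize function are hypothetical, and it assumes the raw value of Power_On_Minutes really counts minutes.

```python
# Sketch only: map vendor-specific attribute names onto the names the
# templates expect. ALIASES is a hypothetical table; the divisor of 60
# assumes the drive's raw Power_On_Minutes value is a count of minutes.
ALIASES = {"Power_On_Minutes": ("Power_On_Hours", 60)}


def normalize(values):
    """Return a copy of values with aliased attributes renamed and rescaled."""
    result = {}
    for name, raw in values.items():
        if name in ALIASES:
            target, divisor = ALIASES[name]
            # Do not overwrite a value the drive reported under the real name.
            result.setdefault(target, raw // divisor)
        else:
            result[name] = raw
    return result


if __name__ == "__main__":
    print(normalize({"Power_On_Minutes": 207900, "Temperature_Celsius": 47}))
```

With this approach the graph template can stay on Power_On_Hours for every drive.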
Wow, thanks for the quick reply!
That's exactly what I had forgotten.
Quote: Of note, the templates were a bit wacky at first, I had to insert some hard returns into them to make them sort nicely.
What did you sort exactly? The output of the script? Can you send me the updated version?
Quote: Also, for some reason my HDD returns the value "Power_On_Minutes" rather than "Power_On_Hours" that this script/template expects.
I have a disk that returns Power On Minutes too, but most of the others return Power On Hours.
You can also run into trouble because not all hard disk vendors support the same SMART attributes.
BTW: if a SMART value is not detected by the script, it defaults to zero.
So you have to check manually with "smartctl --all" whether your hard disk supports all values.
But I think the script can be easily adjusted.
BW
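Since a missing value silently becomes zero, it can help to list which attributes a drive does not report before trusting the graph. Here is a rough sketch of that manual check in Python; the EXPECTED list is taken from the names the script prints in this thread, while present_attributes, missing_attributes and the SAMPLE text are made up for illustration.

```python
# Sketch only: report which expected SMART attributes are absent from
# the `smartctl --all` output, instead of letting them default to 0.
EXPECTED = [
    "Temperature_Celsius", "Reallocated_Sector_Ct", "Power_On_Hours",
    "Spin_Retry_Count", "Seek_Error_Rate", "Raw_Read_Error_Rate",
]

# Hypothetical excerpt of an attribute table, not output from a real drive.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Minutes        0x0032   099   099   000    Old_age   Always       -       207900
194 Temperature_Celsius     0x0022   053   045   000    Old_age   Always       -       47
"""


def present_attributes(text):
    """Collect the attribute names found in the table rows."""
    names = set()
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            names.add(fields[1])
    return names


def missing_attributes(text, expected=EXPECTED):
    """Return the expected names the drive did not report, in order."""
    present = present_attributes(text)
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    print("Not reported by this drive:", ", ".join(missing_attributes(SAMPLE)))
```

Anything this prints would show up as a flat zero line on the graph, so it is a candidate for removal from the template.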
Quote: What did you sort exactly? The output of the Script? Can you send me the updated version?
I just modified the graph template a little bit after taking out the value that my drive did not report. For some reason all the values were jumbled together on the graph. I just added hard returns (<HR>) at the end of each one and they sorted properly.
I attached the modified template I use now.
Attachments:
- smartmon_errors_modified.xml (12.27 KiB)
Hi!
I ran the script from the command line, and all the SMART parameters are displayed correctly:
Code:
$ /opt/local/share/cacti/scripts/smartcheck.php /dev/disk0
Temperature_Celsius:47 Reallocated_Sector_Ct:0 Power_On_Hours:3465 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0
But on the graph all values are zero, and in the log file all the values are zero too:
Code:
01/15/2010 10:35:13 AM - CMDPHP: Poller[0] DEVEL: SQL Exec: "insert into poller_output (local_data_id, rrd_name, time, output) values (385, '', '2010-01-15 10:35:01', 'Temperature_Celsius:0 Reallocated_Sector_Ct:0 Power_On_Hours:0 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0')"
01/15/2010 10:35:13 AM - CMDPHP: Poller[0] Host[1] Description[Localhost (iMac24)] DS[385] Graphs['Unix - smartmon Errors '] CMD: php /opt/local/share/cacti/scripts/smartcheck.php /dev/disk0, output: Temperature_Celsius:0 Reallocated_Sector_Ct:0 Power_On_Hours:0 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0
01/15/2010 10:35:13 AM - CMDPHP: Poller[0] DEVEL: SQL Exec: "insert into poller_output (local_data_id, rrd_name, time, output) values (384, '', '2010-01-15 10:35:01', 'Temperature_Celsius:0 Reallocated_Sector_Ct:0 Power_On_Hours:0 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0')"
01/15/2010 10:35:13 AM - CMDPHP: Poller[0] Host[1] Description[Localhost (iMac24)] DS[384] Graphs['Unix - smartmon Errors '] CMD: php /opt/local/share/cacti/scripts/smartcheck.php /dev/disk0, output: Temperature_Celsius:0 Reallocated_Sector_Ct:0 Power_On_Hours:0 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0
What am I doing wrong? Please help.
Regards, Michail A.
As you see from the logs, the program isn't properly running and retrieving the data. Verify the permissions are correct.
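One way to check this is to run the command the same way the poller would and look at the exit code and stderr, which the zeros in the log hide. This is only a sketch in Python under that assumption; run_and_report is a hypothetical helper, and the demo runs a harmless command rather than smartctl.

```python
# Sketch only: run a command and return everything needed to see why it
# failed, instead of silently falling back to zeros.
import subprocess
import sys


def run_and_report(argv, env=None):
    """Run argv; return (exit_code, stdout, stderr) without raising."""
    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, env=env, timeout=30
        )
    except OSError as exc:
        # The binary was not found or could not be executed at all.
        return 127, "", str(exc)
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    # Harmless demo command; swap in the real one when logged in as the
    # user the poller runs as.
    code, out, err = run_and_report([sys.executable, "-c", "print('ok')"])
    print("exit code:", code)
    print("stdout:", out.strip())
    print("stderr:", err.strip())
```

Logged in as the poller's user, one could pass something like ["sudo", "-n", "/usr/local/sbin/smartctl", "-a", "/dev/disk0"]; with -n, sudo fails with a message instead of prompting for a password, so a missing or mismatched sudoers entry becomes visible in stderr.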
| Scripts: Monitor processes | RFC1213 MIB | DOCSIS Stats | Dell PowerEdge | Speedfan | APC UPS | DOCSIS CMTS | 3ware | Motorola Canopy |
| Guides: Windows Install | [HOWTO] Debug Windows NTFS permission problems |
| Tools: Windows All-in-one Installer |
BSOD2600 wrote: As you see from the logs, the program isn't properly running and retrieving the data. Verify the permissions are correct.
Thanks for the reply!
I have a few other PHP scripts and they are working properly. For example this:
Code:
01/15/2010 02:40:01 PM - CMDPHP: Poller[0] Host[1] Description[Localhost (iMac24)] DS[311] Graphs['APC Back-UPS 650 CS Line Statistics'] CMD: /opt/local/bin/php /opt/local/share/cacti/scripts/query_apcupsd.php 127.0.0.1 3551, output: LINEV:224.0 LOADPCT:30.0 BCHARGE:100.0 TIMELEFT:18.8 LOTRANS:196.0 HITRANS:256.0 ITEMP:29.2 BATTV:13.5 LINEFREQ:50.0 TONBATT:0 NOMINV:230 NOMBATTV:12.0
01/15/2010 02:40:01 PM - CMDPHP: Poller[0] Host[1] Description[Localhost (iMac24)] DS[310] Graphs['APC BackUPS 650 CS Battery Statistics'] CMD: /opt/local/bin/php /opt/local/share/cacti/scripts/query_apcupsd.php 127.0.0.1 3551, output: LINEV:224.0 LOADPCT:30.0 BCHARGE:100.0 TIMELEFT:18.8 LOTRANS:196.0 HITRANS:256.0 ITEMP:29.2 BATTV:13.5 LINEFREQ:50.0 TONBATT:0 NOMINV:230 NOMBATTV:12.0
The permissions are exactly the same as those of smartcheck.php.
And following the recommendation that is in the script:
Code:
* Don't forget to include cacti user in sudo!
* include this into the /etc/sudoers file!!
* For example:
*cactiuser ALL= NOPASSWD: /usr/local/sbin/smartctl
Code:
mihailartuhov ALL= NOPASSWD: /usr/local/sbin/smartctl
Code:
/* Default database settings*/
$database_type = "mysql";
$database_default = "cactidb";
$database_hostname = "localhost";
$database_username = "mihailartuhov";
$database_password = "amksoft";
$database_port = "3306";
Regards, Michail A.
With debugging enabled, find the point in the cacti.log file where the smartctl script actually runs. What is the result / output?
BSOD2600 wrote: With debugging enabled, find the point in the cacti.log file where the smartctl script actually runs. What is the result / output?
I found (in devel mode):
Code:
01/16/2010 10:05:12 AM - CMDPHP: Poller[0] DEVEL: SQL Assoc: "select data_template_rrd.data_source_name, data_input_fields.data_name from (data_template_rrd,data_input_fields) where data_template_rrd.data_input_field_id=data_input_fields.id and data_template_rrd.local_data_id=384"
01/16/2010 10:05:12 AM - CMDPHP: Poller[0] DEVEL: SQL Assoc: "select data_template_rrd.data_source_name, data_input_fields.data_name from (data_template_rrd,data_input_fields) where data_template_rrd.data_input_field_id=data_input_fields.id and data_template_rrd.local_data_id=385"
01/16/2010 10:05:12 AM - CMDPHP: Poller[0] DEVEL: SQL Assoc: "select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
01/16/2010 10:05:12 AM - CMDPHP: Poller[0] DEVEL: SQL Assoc: "select poller_id,end_time from poller_time where poller_id=0"
01/16/2010 10:05:12 AM - CMDPHP: Poller[0] DEVEL: SQL Exec: "insert into poller_output (local_data_id, rrd_name, time, output) values (385, '', '2010-01-16 10:05:01', 'Temperature_Celsius:0 Reallocated_Sector_Ct:0 Power_On_Hours:0 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0')"
01/16/2010 10:05:12 AM - CMDPHP: Poller[0] Host[1] Description[Localhost (iMac24)] DS[385] Graphs['Unix - smartmon Errors '] CMD: /opt/local/bin/php /opt/local/share/cacti/scripts/smartcheck.php /dev/disk0, output: Temperature_Celsius:0 Reallocated_Sector_Ct:0 Power_On_Hours:0 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0
01/16/2010 10:05:12 AM - CMDPHP: Poller[0] DEVEL: SQL Exec: "insert into poller_output (local_data_id, rrd_name, time, output) values (384, '', '2010-01-16 10:05:01', 'Temperature_Celsius:0 Reallocated_Sector_Ct:0 Power_On_Hours:0 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0')"
01/16/2010 10:05:12 AM - CMDPHP: Poller[0] Host[1] Description[Localhost (iMac24)] DS[384] Graphs['Unix - smartmon Errors '] CMD: /opt/local/bin/php /opt/local/share/cacti/scripts/smartcheck.php /dev/disk0, output: Temperature_Celsius:0 Reallocated_Sector_Ct:0 Power_On_Hours:0 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0
http://pastie.org/780394
Thanks!
Regards, Michail A.
Code:
01/16/2010 10:05:12 AM - CMDPHP: Poller[0] Host[1] Description[Localhost (iMac24)] DS[385] Graphs['Unix - smartmon Errors '] CMD: /opt/local/bin/php /opt/local/share/cacti/scripts/smartcheck.php /dev/disk0, output: Temperature_Celsius:0 Reallocated_Sector_Ct:0 Power_On_Hours:0 Spin_Retry_Count:0 Seek_Error_Rate:0 Raw_Read_Error_Rate:0