Thold 2.x - Thresholding/Alerting module for cacti 8.6

cigamit · Post by **cigamit** » Fri Feb 10, 2006 9:56 am

Diggit2001 wrote: Hmmm. This looks interesting. I wish I had found this earlier.

Can I install this architecture over the top of my existing (old) thold, or should I attempt to remove thold first?

Thanks!

You definitely have to remove the old one first, or there could be issues with duplicate functions. Also, remove the old cron job, as it's not needed anymore.

torstentfk · Post by **torstentfk** » Tue Feb 21, 2006 11:50 am

Hello,

Last week a switch rebooted and we did not notice it.
So I changed the thold functions (an email function is already included there) so that Cacti checks the uptime, stores it in the host table, and sets a flag if the device has rebooted. This flag is cleared on the next poller run.

Alter the host table:

Code:

mysql
ALTER TABLE `host` ADD `uptime` VARCHAR(20) NULL DEFAULT '0', ADD `reboot` VARCHAR(3) NULL DEFAULT '0';

In plugin/thold/check-thold.php, insert these lines after line 92 and
before "foreach ($queryrows as $q_row) ":

Code:

#-------reboot-monitor
# Select every host that the poller flagged as rebooted on its last run.
$sql = "SELECT description, hostname, id, uptime FROM host WHERE reboot = '1' AND uptime > 1";
## $result = db_fetch_assoc($sql) or die (mysql_error());
## changed to:
$result = db_fetch_assoc($sql);

foreach ($result as $item)
{
    # date() gives the time of detection; "H" is the 24-hour clock, so the time is unambiguous.
    $msg = "Device " . $item["description"] . " has rebooted and came up at " . date("H:i:s d-m-Y");
    $subject = "Reboot of " . $item["hostname"];
    $file_array = array();
    # Mail only when dead host notification is enabled.
    if ($deadnotify)
    {
        thold_mail($global_alert_address, '', $subject, $msg, $file_array);
        print("Reboot detected on " . $item["hostname"] . "\n");
    }
}
#-------reboot-monitor

In cmd.php insert these lines.
At line 106 add:

Code:

         $snmp_uptime = 0;

At line 259 near $new_host = false;
insert this right after it:
$last_host2 = $last_host;

A few lines further down, insert the new block directly after the "else" of this existing block:

Code:

if (($item["snmp_version"] == 0) || (($item["snmp_community"] == "") && ($item["snmp_version"] != 3))) {
    cacti_log("Host[$host_id] DS[$data_source] ERROR: Invalid SNMP Data Source.  Please either delete it from the database, or correct it.", $print_data_to_stdout);
    $output = "U";
}else {

insert this block:

Code:

if ($last_host2 != $current_host)
{
    # First item for this host in the current run: read sysUpTime (.1.3.6.1.2.1.1.3.0) once.
    $last_host2 = $current_host;
    $snmp_uptime = cacti_snmp_get($item["hostname"], $item["snmp_community"], ".1.3.6.1.2.1.1.3.0", $item["snmp_version"], $item["snmp_username"], $item["snmp_password"], $item["snmp_port"], $item["snmp_timeout"], SNMP_WEBUI);
    if ($snmp_uptime > 10)
    {
        $old_uptime = db_fetch_cell("SELECT uptime FROM host WHERE id='" . $item["host_id"] . "'");
        cacti_log("Uptime of " . $item["hostname"] . ",ID:" . $item["host_id"] . " - old=$old_uptime cur:$snmp_uptime", $print_data_to_stdout);
        if ($old_uptime > $snmp_uptime)
        {
            # Uptime went backwards, so the device restarted since the last poll.
            db_execute("update host set reboot='1' where id='" . $item["host_id"] . "'");
            cacti_log("REBOOT detected of device " . $item["hostname"] . ": Old:" . $old_uptime . " Current:" . $snmp_uptime, $print_data_to_stdout);
        }
        else
        {
            db_execute("update host set reboot='0' where id='" . $item["host_id"] . "'");
        }
        # uptime is a VARCHAR column, so the value is quoted.
        db_execute("update host set uptime='$snmp_uptime' where id='" . $item["host_id"] . "'");
    }
    else
    {
        cacti_log("Could not fetch Uptime of " . $item["hostname"] . ",ID:" . $item["host_id"], $print_data_to_stdout);
    }
}

Finally I changed host.php:
at line 784 I inserted host.uptime so that it looks like this:

Code:

host.avg_time,
host.availability,
host.uptime
from host
$sql_where
order by host.description

and at line 828 I inserted at the end of the <td>s:

Code:

<td><?php print (round($host["uptime"]/8640000, 1));?>d</td>

Now Cacti stores the uptime for each host and sets the flag. Thold checks for this flag and sends an email to the admins (only if dead host notification is enabled).
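
The detection logic described above can be sketched outside Cacti as follows. This is a minimal Python illustration, not part of the patch: the `host` dict stands in for one row of the host table, and the names `detect_reboot`, `poll_host`, and `min_valid_ticks` are made up for the example.

```python
def detect_reboot(previous_ticks, current_ticks):
    """Return True when the uptime counter went backwards between two polls."""
    return current_ticks < previous_ticks


def poll_host(host, current_ticks, min_valid_ticks=10):
    """Update one host record the way the patch updates the host table.

    host is a plain dict with 'uptime' and 'reboot' keys (hypothetical
    stand-in for a database row, values stored as strings like VARCHAR).
    """
    if current_ticks <= min_valid_ticks:
        # Mirrors the "$snmp_uptime > 10" guard: a tiny or empty reading
        # counts as a failed fetch and the row is left untouched.
        return host
    previous = int(host.get("uptime", 0))
    host["reboot"] = "1" if detect_reboot(previous, current_ticks) else "0"
    host["uptime"] = str(current_ticks)
    return host


row = {"uptime": "0", "reboot": "0"}
poll_host(row, 500000)   # first poll, uptime stored
poll_host(row, 530000)   # uptime grew, flag stays '0'
poll_host(row, 1200)     # uptime dropped, flag becomes '1'
print(row)               # {'uptime': '1200', 'reboot': '1'}
```

One caveat with this approach: sysUpTime is a 32-bit counter of hundredths of a second, so it wraps to zero after roughly 497 days, and a wrap would look the same as a reboot to this comparison.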

I hope this does not interfere with other scripts or PHP commands, and that this mod can make it into the official releases.

Torsten

torstentfk · Post by **torstentfk** » Wed Feb 22, 2006 5:50 am

Hi,

I have edited my last post:
The table change is now
ALTER TABLE `host` ADD `uptime` VARCHAR(20) NULL DEFAULT '0' , ADD `reboot` VARCHAR( 3 ) NULL DEFAULT '0';

and the divisor for the uptime is now 8640000.
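
For reference, the 8640000 figure follows from the unit of sysUpTime, which counts hundredths of a second: 100 ticks per second times 86400 seconds per day. A short Python sketch of the same conversion that the host.php line performs (the function name `uptime_days` is only illustrative):

```python
TICKS_PER_SECOND = 100          # sysUpTime counts hundredths of a second
SECONDS_PER_DAY = 24 * 60 * 60  # 86400
TICKS_PER_DAY = TICKS_PER_SECOND * SECONDS_PER_DAY


def uptime_days(timeticks):
    """Convert a sysUpTime value (TimeTicks) to days with one decimal."""
    return round(timeticks / TICKS_PER_DAY, 1)


print(TICKS_PER_DAY)          # 8640000
print(uptime_days(8640000))   # 1.0
print(uptime_days(30240000))  # 3.5
```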

Torsten

jay_kumar · Post by **jay_kumar** » Sun Mar 05, 2006 12:48 pm

After I installed the thold plugin on my cacti-0.8.6h-1.2.el4.rf, the web page stopped working and running the poller gave a segmentation fault. What could be the problem?

espinojo · Post by **espinojo** » Mon Mar 06, 2006 1:47 pm

jay_kumar wrote: After I installed the thold plugin on my cacti-0.8.6h-1.2.el4.rf, the web page stopped working and running the poller gave a segmentation fault. What could be the problem?

I had the same problem, but haven't found a complete solution yet. The problem is related to the functions.php file. Just back it up and replace it with the original file (it was saved by install.sh as functions.php.bak).

You will get access to the tool again, but you will receive some error messages.

aggie · Post by **aggie** » Tue Mar 07, 2006 7:15 am

I've been using cacti for about a year, but haven't really had the time to look into expanding it.

I tried out Thold 0.2.1a on two existing Cacti installations (0.86g and 0.86c) and it managed to break importing templates on both of them.

When I upgraded to 0.86h (thinking that maybe templates need to come from the same version of Cacti), it finally broke the Cacti frontend (the poller seemed to be okay). I managed to uninstall it by using the script and removing the thold tables from the MySQL database.

I've now installed the plugin architecture 0.9 and thold 0.2.7 from cactiusers.org and all is now well, and I've only lost about an hour of monitoring data!

Moderators - I know I'm new around here and don't want to cause trouble, but do you think it would be good forum moderation to lock this thread and create a new one with details of thold 0.2.7 and the plugin architecture method of install (and a reference to this thread)? I for one wasn't prepared to read 34 pages of posts before installing.

cigamit · Post by **cigamit** » Tue Mar 07, 2006 10:04 am

aggie wrote: I've been using cacti for about a year, but haven't really had the time to look into expanding it.

I tried out Thold 0.2.1a on two existing Cacti installations (0.86g and 0.86c) and it managed to break importing templates on both of them.

When I upgraded to 0.86h (thinking that maybe templates need to come from the same version of Cacti), it finally broke the Cacti frontend (the poller seemed to be okay). I managed to uninstall it by using the script and removing the thold tables from the MySQL database.

I've now installed the plugin architecture 0.9 and thold 0.2.7 from cactiusers.org and all is now well, and I've only lost about an hour of monitoring data!

Moderators - I know I'm new around here and don't want to cause trouble, but do you think it would be good forum moderation to lock this thread and create a new one with details of thold 0.2.7 and the plugin architecture method of install (and a reference to this thread)? I for one wasn't prepared to read 34 pages of posts before installing.

I have been thinking of doing that for a bit; it would save me a lot of time too. I will see about starting to write it.

anuraganuj · Post by **anuraganuj** » Wed Mar 08, 2006 3:08 am

I can send a test mail successfully from thold, but when a host goes down or breaches a threshold, it shows Trigger as YES and no mail arrives. Please help.

torstentfk · Post by **torstentfk** » Wed Mar 08, 2006 6:57 am

Hello,

I found a bug in my reboot-monitor code in plugin/thold/check-thold.php
(marked in the code).
I changed
$result = db_fetch_assoc($sql) or die (mysql_error());
to
$result = db_fetch_assoc($sql);

Torsten

TomekN · Post by **TomekN** » Wed Mar 08, 2006 8:07 am

Plugin architecture, thold, monitor and other plugins from http://cactiusers.org/
work out of the box with the latest version of Cacti.

I don't know who Jimmy is but he does a great job.

cigamit · Post by **cigamit** » Wed Mar 08, 2006 9:04 am

anuraganuj wrote: I can send a test mail successfully from thold, but when a host goes down or breaches a threshold, it shows Trigger as YES and no mail arrives. Please help.

Usually a problem with your mail settings. Check them and make sure you are able to receive the test emails properly.

anuraganuj · Post by **anuraganuj** » Fri Mar 10, 2006 11:57 pm

Hi cigamit,

I tested sending the test mails again and again, and they arrive, but the trigger mail is still missing.

joe_hznm · Post by **joe_hznm** » Sun Mar 12, 2006 9:16 pm

Could a threshold trigger sound an alarm?

Joe

tgk · Post by **tgk** » Fri Mar 17, 2006 9:26 am

I'm creating a threshold template for my Interface - Traffic in and out. I want to do baseline deviation up only. The instructions for ignoring baseline deviation down say "if not set, lower bound threshold will not be checked at all". I left that box empty and saved, but when I came back to it, it was filled with "0". I cannot remove it.

When I apply the threshold to one of my interfaces, it reports the baseline deviation down.

I can go to Management / Thresholds, edit the individual threshold, and delete the baseline deviation down in that screen.

It seems to be a bug. It's not possible to leave the baseline deviation box empty in the Threshold Template page.

saveus · Post by **saveus** » Fri Mar 17, 2006 3:01 pm

Hi, I have a problem with thold.

When I use "Auto-create thresholds" it adds all graphs to thold,
but the "current" values are always "0".

When I start "check-thold.php" manually,
I never get a value in "Cur. value:":

------------------
RRA: 420 : proc
Ref. values count: 37
Ref. value (min): 36
Ref. value (max): 42
Cur. value:
Low bl thresh: 31
High bl thresh: 48
Check against baseline: FAIL: Below baseline threshold!
------------------
RRA: 360 : cpu_nice
Ref. values count: 26
Ref. value (min): 0
Ref. value (max): 0
Cur. value:
Low bl thresh: 0
High bl thresh: 0
Check against baseline: OK
------------------
RRA: 364 : load_15min
Ref. values count: 26
Ref. value (min): 1
Ref. value (max): 2
Cur. value:
Low bl thresh: 1
High bl thresh: 2
Check against baseline: FAIL: Below baseline threshold!
------------------

The thold.log has no "current" values either:

03-17-06.20:50:02 element: mail6 - Traffic - shaper0[traffic_in] alertstat: 0 graph_id: 329 thold_low: thold_hi: rra: 371 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:02 element: mail6 - Partition - /dev/sda1[hdd_free] alertstat: 0 graph_id: 330 thold_low: thold_hi: rra: 372 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:02 element: mail6 - Partition - /dev/sda1[hdd_used] alertstat: 0 graph_id: 330 thold_low: thold_hi: rra: 372 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:02 element: mail6 - Partition - /dev/sdb1[hdd_used] alertstat: 0 graph_id: 331 thold_low: thold_hi: rra: 373 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:04 element: mail6 - Partition - /dev/sdb1[hdd_free] alertstat: 0 graph_id: 331 thold_low: thold_hi: rra: 373 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:07 element: mail6 - Partition - /dev/sdb2[hdd_free] alertstat: 0 graph_id: 332 thold_low: thold_hi: rra: 374 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:07 element: mail6 - Partition - /dev/sdb2[hdd_used] alertstat: 0 graph_id: 332 thold_low: thold_hi: rra: 374 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:07 element: mail6 - Partition - /dev/sda3[hdd_free] alertstat: 0 graph_id: 333 thold_low: thold_hi: rra: 375 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:07 element: mail6 - Partition - /dev/sda3[hdd_used] alertstat: 0 graph_id: 333 thold_low: thold_hi: rra: 375 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:07 element: mail6 - CPU Usage - System[cpu_system] alertstat: 0 graph_id: 324 thold_low: thold_hi: rra: 361 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:08 element: hdi - Processes[proc] alertstat: 0 graph_id: 365 thold_low: thold_hi: 30 rra: 420 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:08 element: mail6 - CPU Usage - Nice[cpu_nice] alertstat: 0 graph_id: 324 thold_low: thold_hi: rra: 360 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:08 element: mail6 - Load Average - 15 Minute[load_15min] alertstat: 0 graph_id: 325 thold_low: thold_hi: rra: 364 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:08 element: mail6 - Load Average - 1 Minute[load_1min] alertstat: 0 graph_id: 325 thold_low: thold_hi: rra: 363 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:08 element: mail6 - CPU Usage - User[cpu_user] alertstat: 0 graph_id: 324 thold_low: thold_hi: rra: 362 trigger: 1 triggerct: 0 current: logset:

Does anyone have an idea how to resolve this?

thanks
