Thold 2.x - Thresholding/Alerting module for cacti 8.6

Support questions about the Threshold plugin

Moderators: Developers, Moderators

Locked
cigamit
Developer
Posts: 3367
Joined: Thu Apr 07, 2005 3:29 pm
Location: B/CS Texas

Post by cigamit »

Diggit2001 wrote: Hmmm. This looks interesting. I wish I would have found this earlier. :P

Can I install this architecture over the top of my existing (old) thold, or should I attempt to remove thold first?

Thanks!

You definitely have to remove the old one first, or there could be issues with duplicate functions. Also, remove the old cron job, as it's not needed anymore.
torstentfk
Cacti User
Posts: 367
Joined: Tue Apr 05, 2005 9:52 am
Location: Munich, Germany

Post by torstentfk »

Hello,

Last week a switch rebooted and we did not notice it.
So I changed the thold functions (which already include an email function) so that Cacti checks the uptime, stores it in the host table, and sets a flag if the device has rebooted. This flag is cleared on the next poller run.

Alter the host-table:

Code: Select all

mysql> ALTER TABLE `host` ADD `uptime` VARCHAR(20) NULL DEFAULT '0', ADD `reboot` VARCHAR(3) NULL DEFAULT '0';

In plugin/thold/check-thold.php, insert these lines after line 92 and
before "foreach ($queryrows as $q_row) ":

Code: Select all

#-------reboot-monitor
# Find hosts that cmd.php flagged as rebooted during this poller run.
$sql = "SELECT description, hostname, id, uptime FROM host WHERE reboot = '1' AND uptime > 1";
## $result = db_fetch_assoc($sql) or die (mysql_error());
## changed to:
$result = db_fetch_assoc($sql);

foreach ($result as $item) {
    # "H" gives a 24-hour clock; "h" would be ambiguous without am/pm.
    $msg = "Device " . $item["description"] . " has rebooted and came up at " . date("H:i:s d-m-Y");
    $subject = "Reboot of " . $item["hostname"];
    $file_array = array();
    # Only send mail when dead host notification is enabled.
    if ($deadnotify) {
        thold_mail($global_alert_address, '', $subject, $msg, $file_array);
        print("Reboot detected on " . $item["hostname"] . "\n");
    }
}
#-------reboot-monitor

In cmd.php, insert the following lines.
At line 106, add:

Code: Select all

         $snmp_uptime = 0;

At line 259, find $new_host = false;
and insert this right after it:
$last_host2 = $last_host;

A few lines further down, right after the "else" of this block

Code: Select all

if (($item["snmp_version"] == 0) || (($item["snmp_community"] == "") && ($item["snmp_version"] != 3))) {
    cacti_log("Host[$host_id] DS[$data_source] ERROR: Invalid SNMP Data Source.  Please either delete it from the database, or correct it.", $print_data_to_stdout);
    $output = "U";
} else {

insert this block:

Code: Select all

if ($last_host2 != $current_host) {
    # The host changed: poll sysUpTime (.1.3.6.1.2.1.1.3.0) once for this host.
    $last_host2 = $current_host;
    $snmp_uptime = cacti_snmp_get($item["hostname"], $item["snmp_community"], ".1.3.6.1.2.1.1.3.0", $item["snmp_version"], $item["snmp_username"], $item["snmp_password"], $item["snmp_port"], $item["snmp_timeout"], SNMP_WEBUI);
    if ($snmp_uptime > 10) {
        $old_uptime = db_fetch_cell("SELECT uptime FROM host WHERE id='" . $item["host_id"] . "'");
        cacti_log("Uptime of " . $item["hostname"] . ",ID:" . $item["host_id"] . " - old=$old_uptime cur:$snmp_uptime", $print_data_to_stdout);
        if ($old_uptime > $snmp_uptime) {
            # Uptime went backwards since the last poll: flag a reboot.
            db_execute("update host set reboot='1' where id='" . $item["host_id"] . "'");
            cacti_log("REBOOT detected of device " . $item["hostname"] . ": Old:" . $old_uptime . " Current:" . $snmp_uptime, $print_data_to_stdout);
        } else {
            db_execute("update host set reboot='0' where id='" . $item["host_id"] . "'");
        }
        # The uptime column is a VARCHAR, so the value is quoted.
        db_execute("update host set uptime='$snmp_uptime' where id='" . $item["host_id"] . "'");
    } else {
        cacti_log("Could not fetch Uptime of " . $item["hostname"] . ",ID:" . $item["host_id"], $print_data_to_stdout);
    }
}

Finally, I changed host.php.
At line 784 I inserted host.uptime, so that the query looks like this:

Code: Select all

host.avg_time,
host.availability,
host.uptime
from host
$sql_where
order by host.description

And at line 828, I inserted this after the last of the <td>s:

Code: Select all

<td><?php print (round($host["uptime"]/8640000, 1));?>d</td>

Now Cacti stores the uptime for each host and sets the flag. Thold checks for this flag and sends an email to the admins (only if dead host notification is enabled).
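
For readers following along, the comparison at the heart of the cmd.php change can be sketched as a small standalone function. This is only an illustration under stated assumptions: the function name uptime_went_backwards is hypothetical and not part of Cacti or thold, and the values are assumed to be plain sysUpTime tick counts. One caveat worth knowing: sysUpTime is a 32-bit TimeTicks counter in hundredths of a second, so it wraps after roughly 497 days, and a wrap would look like a reboot to this check.

```php
<?php
// Hypothetical helper, not part of Cacti or thold: mirrors the
// "$old_uptime > $snmp_uptime" test from the patch above.
// Both arguments are sysUpTime values in TimeTicks (1/100 s).
function uptime_went_backwards($old_ticks, $new_ticks) {
    // Cast explicitly so values read back from a VARCHAR column
    // are compared as numbers, not as strings.
    return (float) $old_ticks > (float) $new_ticks;
}

// Stored uptime was one day, the device now reports 12 seconds.
var_dump(uptime_went_backwards("8640000", "1200"));  // bool(true)

// Normal case: five minutes have passed since the last poll.
var_dump(uptime_went_backwards("1200", "31200"));    // bool(false)
```

A device that stays up for more than about 497 days would trigger one false alert when the counter wraps; whether that matters depends on your environment.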

I hope this does not interfere with other scripts or PHP commands, and that this mod can make it into the official releases.

Torsten
Last edited by torstentfk on Wed Mar 08, 2006 6:55 am, edited 2 times in total.
torstentfk
Cacti User
Posts: 367
Joined: Tue Apr 05, 2005 9:52 am
Location: Munich, Germany

Post by torstentfk »

Hi,

I edited my last post.
The table definition changed to
ALTER TABLE `host` ADD `uptime` VARCHAR(20) NULL DEFAULT '0' , ADD `reboot` VARCHAR( 3 ) NULL DEFAULT '0';

The divisor for the uptime changed to 8640000.
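
The divisor follows from the unit of sysUpTime: the value is in hundredths of a second, so one day is 100 * 86400 = 8640000 ticks. A minimal sketch of the conversion, where the constant TICKS_PER_DAY and the function ticks_to_days are illustrative names and not part of Cacti:

```php
<?php
// 100 ticks per second * 86400 seconds per day.
define('TICKS_PER_DAY', 100 * 86400);

// Hypothetical helper: convert a sysUpTime value to days, one decimal place,
// the same way the host.php change does.
function ticks_to_days($ticks) {
    return round($ticks / TICKS_PER_DAY, 1);
}

echo ticks_to_days(8640000), "d\n";   // 1d
echo ticks_to_days(12960000), "d\n";  // 1.5d
```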

Torsten
jay_kumar
Posts: 8
Joined: Thu Feb 02, 2006 2:41 pm

segmentation fault

Post by jay_kumar »

After I installed the thold plugin on my cacti-0.8.6h-1.2.el4.rf, the web page stopped working and running the poller gave a segmentation fault. What could be the problem?
espinojo
Posts: 4
Joined: Tue Jan 17, 2006 12:24 am

Re: segmentation fault

Post by espinojo »

jay_kumar wrote: After I installed the thold plugin on my cacti-0.8.6h-1.2.el4.rf, the web page stopped working and running the poller gave a segmentation fault. What could be the problem?

I had the same problem, but haven't found a complete solution yet. The problem is related to the functions.php file. Just back it up and replace it with the original file (install.sh saved it as functions.php.bak).

You will get access to the tool again, but you will receive some error messages.
aggie
Posts: 8
Joined: Tue Mar 07, 2006 6:51 am

Post by aggie »

I've been using cacti for about a year, but haven't really had the time to look into expanding it.

I tried out Thold 0.2.1a on two existing Cacti installations (0.8.6g and 0.8.6c), and it managed to break template importing on both of them.

When I upgraded to 0.8.6h (as I thought maybe templates need to be from the same version of Cacti), it finally broke the Cacti frontend (the poller seemed to be okay). I managed to uninstall it using the script and by removing the thold tables from the MySQL database.

I've now installed the plugin architecture 0.9 and thold 0.2.7 from cactiusers.org and all is now well, and I've only lost about an hour of monitoring data!

Moderators - I know I'm new around here and don't want to cause trouble, but do you think it would be good forum moderation to lock this thread and create a new one with details of thold 0.2.7 and the plugin architecture method of install (with a reference to this thread)? I know I, for one, wasn't prepared to read 34 pages of posts before installing.
cigamit
Developer
Posts: 3367
Joined: Thu Apr 07, 2005 3:29 pm
Location: B/CS Texas

Post by cigamit »

aggie wrote: I've been using Cacti for about a year, but haven't really had the time to look into expanding it.

I tried out Thold 0.2.1a on two existing Cacti installations (0.8.6g and 0.8.6c), and it managed to break template importing on both of them.

When I upgraded to 0.8.6h (as I thought maybe templates need to be from the same version of Cacti), it finally broke the Cacti frontend (the poller seemed to be okay). I managed to uninstall it using the script and by removing the thold tables from the MySQL database.

I've now installed the plugin architecture 0.9 and thold 0.2.7 from cactiusers.org and all is now well, and I've only lost about an hour of monitoring data!

Moderators - I know I'm new around here and don't want to cause trouble, but do you think it would be good forum moderation to lock this thread and create a new one with details of thold 0.2.7 and the plugin architecture method of install (with a reference to this thread)? I know I, for one, wasn't prepared to read 34 pages of posts before installing.

I have been thinking of doing that for a bit; it would save me a lot of time as well. I will see about starting to write it.
anuraganuj
Cacti User
Posts: 70
Joined: Tue Feb 21, 2006 9:50 am

Threshold not sending email

Post by anuraganuj »

I can send a test mail successfully from thold, but when a host goes down or breaches a threshold, it shows Trigger as YES and no mail arrives. Please help.
torstentfk
Cacti User
Posts: 367
Joined: Tue Apr 05, 2005 9:52 am
Location: Munich, Germany

Post by torstentfk »

Hello,

I found a bug in my reboot-monitor code for plugin/thold/check-thold.php
(marked in the code above).
I changed
$result = db_fetch_assoc($sql) or die (mysql_error());
to
$result = db_fetch_assoc($sql);
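
A likely reason the "or die" form misbehaves, offered as a sketch and not a confirmed diagnosis: in PHP, "or" binds more loosely than "=", and an empty array is falsy. If the query helper returns an empty array when no host has rebooted, the right-hand side runs even though nothing failed. The example below uses a hypothetical stand-in function (fetch_rows_stub) in place of the real database call and records a message instead of calling die(), so it can be run safely:

```php
<?php
// Hypothetical stand-in for a query helper; it simply returns the rows it is given.
function fetch_rows_stub(array $rows) {
    return $rows;
}

// Collects messages in place of die(), so the script keeps running.
$events = array();

// "or" has lower precedence than "=", so the assignment happens first and
// the right-hand side runs whenever the assigned value is falsy.
$none = fetch_rows_stub(array()) or ($events[] = "empty result treated as failure");
$some = fetch_rows_stub(array(array("id" => 1))) or ($events[] = "not reached");

print_r($events);
```

Only the first message is recorded: the empty result triggers the right-hand side, while the non-empty result does not. Dropping the "or die" part, as in the fix above, avoids aborting on a poller run where no device rebooted.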


Torsten
TomekN
Posts: 13
Joined: Thu Mar 02, 2006 7:15 am
Location: Warsaw, Poland

Post by TomekN »

The plugin architecture, thold, monitor, and other plugins from http://cactiusers.org/
work out of the box with the latest version of Cacti.

I don't know who Jimmy is but he does a great job.
cigamit
Developer
Posts: 3367
Joined: Thu Apr 07, 2005 3:29 pm
Location: B/CS Texas

Re: Threshold not sending email

Post by cigamit »

anuraganuj wrote: I can send a test mail successfully from thold, but when a host goes down or breaches a threshold, it shows Trigger as YES and no mail arrives. Please help.

Usually a problem with your mail settings. Check them and make sure you are able to receive the test emails properly.
anuraganuj
Cacti User
Posts: 70
Joined: Tue Feb 21, 2006 9:50 am

Tested again

Post by anuraganuj »

Hi cigamit,

I tested sending the test mails again and again, but the trigger mail is still missing :(
joe_hznm
Posts: 29
Joined: Thu May 19, 2005 12:29 am

Post by joe_hznm »

Could the threshold trigger a sound alarm?

Joe
tgk
Posts: 28
Joined: Sat Mar 11, 2006 10:46 pm

Problem creating threshold template

Post by tgk »

I'm creating a threshold template for my Interface - Traffic in and out. I want to do baseline deviation up only. The instructions for ignoring baseline deviation down say "if not set, lower bound threshold will not be checked at all". I left that box empty and saved, but when I came back to it, it was filled with "0". I cannot remove it.

When I apply the threshold to one of my interfaces, it reports the baseline deviation down.

I can go to Management / Thresholds, edit the individual threshold, and delete the baseline deviation down in that screen.

It seems to be a bug. It's not possible to leave the baseline deviation box empty in the Threshold Template page.
saveus
Posts: 25
Joined: Sun Jul 10, 2005 5:04 pm

problem with current value

Post by saveus »

Hi, I have a problem with thold.

When I use "Auto-create thresholds", it adds all graphs to thold,
but the "current" values are always "0".

When I run "check-thold.php" manually,
there is never a value in "Cur. value:":

------------------
RRA: 420 : proc
Ref. values count: 37
Ref. value (min): 36
Ref. value (max): 42
Cur. value:
Low bl thresh: 31
High bl thresh: 48
Check against baseline: FAIL: Below baseline threshold!
------------------
RRA: 360 : cpu_nice
Ref. values count: 26
Ref. value (min): 0
Ref. value (max): 0
Cur. value:
Low bl thresh: 0
High bl thresh: 0
Check against baseline: OK
------------------
RRA: 364 : load_15min
Ref. values count: 26
Ref. value (min): 1
Ref. value (max): 2
Cur. value:
Low bl thresh: 1
High bl thresh: 2
Check against baseline: FAIL: Below baseline threshold!
------------------


There are no values in thold.log either:


03-17-06.20:50:02 element: mail6 - Traffic - shaper0[traffic_in] alertstat: 0 graph_id: 329 thold_low: thold_hi: rra: 371 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:02 element: mail6 - Partition - /dev/sda1[hdd_free] alertstat: 0 graph_id: 330 thold_low: thold_hi: rra: 372 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:02 element: mail6 - Partition - /dev/sda1[hdd_used] alertstat: 0 graph_id: 330 thold_low: thold_hi: rra: 372 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:02 element: mail6 - Partition - /dev/sdb1[hdd_used] alertstat: 0 graph_id: 331 thold_low: thold_hi: rra: 373 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:04 element: mail6 - Partition - /dev/sdb1[hdd_free] alertstat: 0 graph_id: 331 thold_low: thold_hi: rra: 373 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:07 element: mail6 - Partition - /dev/sdb2[hdd_free] alertstat: 0 graph_id: 332 thold_low: thold_hi: rra: 374 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:07 element: mail6 - Partition - /dev/sdb2[hdd_used] alertstat: 0 graph_id: 332 thold_low: thold_hi: rra: 374 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:07 element: mail6 - Partition - /dev/sda3[hdd_free] alertstat: 0 graph_id: 333 thold_low: thold_hi: rra: 375 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:07 element: mail6 - Partition - /dev/sda3[hdd_used] alertstat: 0 graph_id: 333 thold_low: thold_hi: rra: 375 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:07 element: mail6 - CPU Usage - System[cpu_system] alertstat: 0 graph_id: 324 thold_low: thold_hi: rra: 361 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:08 element: hdi - Processes[proc] alertstat: 0 graph_id: 365 thold_low: thold_hi: 30 rra: 420 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:08 element: mail6 - CPU Usage - Nice[cpu_nice] alertstat: 0 graph_id: 324 thold_low: thold_hi: rra: 360 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:08 element: mail6 - Load Average - 15 Minute[load_15min] alertstat: 0 graph_id: 325 thold_low: thold_hi: rra: 364 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:08 element: mail6 - Load Average - 1 Minute[load_1min] alertstat: 0 graph_id: 325 thold_low: thold_hi: rra: 363 trigger: 1 triggerct: 0 current: logset:
03-17-06.20:50:08 element: mail6 - CPU Usage - User[cpu_user] alertstat: 0 graph_id: 324 thold_low: thold_hi: rra: 362 trigger: 1 triggerct: 0 current: logset:


Does anyone have an idea how to resolve this?

thanks