THold 2.0 beta - Threshold monitoring plugin for cacti 8.6c

adesimone
Posts: 32
Joined: Mon Jan 24, 2005 12:46 am

Post by adesimone »

Finally: version 2.0, with cacti 8.6 support. This is still under development, so any help or testing would be appreciated.

This version is almost entirely written in PHP and MySQL.

DESCRIPTION: Simple threshold / alarm / event monitoring tool that sends emails and syslog events when a configured value is reached. Emails are only sent when the threshold is reached and then descended (not each time it is polled). The trigger is configurable (an email is only sent after the threshold has been reached a set number of consecutive times). Emails include a link to the associated graph and an embedded image.
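The consecutive-count trigger described above could be sketched roughly as follows. This is an illustration only, not code from thold: the function name `thold_step`, its parameters, and its return shape are all hypothetical, and it assumes one reading of the description (a single email once the trigger count is reached, with everything reset when the value falls back to or below the threshold).

```php
<?php
// Hypothetical helper, NOT taken from thold; names and return shape are assumptions.
// $current   : latest polled value
// $threshold : configured limit
// $trigger   : number of consecutive breaches required before alerting
// $count     : consecutive breaches seen so far (stored between polls)
// $alerted   : true if an email has already been sent for the current breach
function thold_step($current, $threshold, $trigger, $count, $alerted)
{
    $send = false;
    if ($current > $threshold) {
        $count++;
        // Alert once, when the trigger count is first reached.
        if ($count >= $trigger && !$alerted) {
            $alerted = true;
            $send = true;
        }
    } else {
        // Value is back at or below the threshold: reset everything.
        $count = 0;
        $alerted = false;
    }
    return array('count' => $count, 'alerted' => $alerted, 'send' => $send);
}

// Usage: threshold 10, trigger 3, five consecutive polls.
$state = array('count' => 0, 'alerted' => false, 'send' => false);
foreach (array(12, 11, 13, 14, 7) as $value) {
    $state = thold_step($value, 10, 3, $state['count'], $state['alerted']);
    echo $value . ' => ' . ($state['send'] ? 'send email' : 'no email') . "\n";
}
```

With these inputs only the third poll (value 13) reports "send email". Keeping the counter and the alerted flag in stored state, rather than recomputing them on page load, matches the idea that the state changes during the poll.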

Currently only works with gauge-based RRDs (e.g. cpu usage, mem usage, load average, etc) and not counter-based RRDs (e.g. bandwidth / errors, etc).

This has not been tested with cactid yet; it may or may not work.

Download thold2.0.zip and unzip somewhere. There is a README there (with very little info). Run the install.sh script.

I think it is mostly self-explanatory, but feel free to ask questions. This will only work with *NIX-based systems - Windows support coming soon.

ADesimone
adesimone@ciscoconfigbuilder.com
http://www.ciscoconfigbuilder.com
Attachments
thold2.0.zip (43.29 KiB)
shot1.jpg (202.49 KiB)
shot2.jpg (161.45 KiB)
Matt_B
Posts: 6
Joined: Tue Feb 08, 2005 1:22 pm

Post by Matt_B »

First: Thank you for your hard work :)

I got the following errors once I installed the script.

1. Unknown column 'alertstat' in 'order clause' when I click on the threshld tab.

2. I cannot seem to get the email address to stick. It always goes back to root@localhost.

I just installed this and have not spent much time with it, but I will play with it and try to leave better feedback.

Thank you again
-Matt
www.mattsshack.com
mdw162
Posts: 23
Joined: Fri Feb 04, 2005 7:49 pm

Post by mdw162 »

I actually have a different issue. The script can't seem to find the "$rra" variable for some reason. When I go to the "Data Sources" page and click on the respective link for the threshold settings I get

Notice: Undefined variable: rra in /var/www/cacti/thold.php on line 35

Notice: Undefined variable: rra in /var/www/cacti/thold.php on line 59

Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource in /var/www/cacti/thold.php on line 59

Warning: mysql_fetch_array(): supplied argument is not a valid MySQL result resource in /var/www/cacti/thold.php on line 61



Any thoughts?
mdw162
Posts: 23
Joined: Fri Feb 04, 2005 7:49 pm

Post by mdw162 »

Okay, I don't know much about PHP but I'm guessing they made changes to how variables are passed. I have version 4.3.10 and instead of passing with a simple '$var' you have to use the $_GET array. I added these two lines to the top of thold.php and that fixed the first part of it:

$rra=$_GET['rra'];
$desc=$_GET['desc'];
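A slightly more defensive variant of those two lines might look like the sketch below. The helper name `thold_get_param` is hypothetical (it is not part of thold or Cacti), and the integer cast assumes `rra` is a numeric id, which the "rra: 116" log output later in this thread suggests but does not prove.

```php
<?php
// Hypothetical helper: return a request parameter, or a default when it is missing.
// This avoids "Undefined index" notices when the page is opened without the parameter.
function thold_get_param($source, $name, $default = '')
{
    return isset($source[$name]) ? $source[$name] : $default;
}

// Assumption: rra is a numeric id, so force it to an integer before it gets near SQL.
$rra  = (int) thold_get_param($_GET, 'rra', 0);
$desc = thold_get_param($_GET, 'desc');
```

Reading from `$_GET` explicitly works whether register_globals is on or off.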


I then get a similar error when I submit the info. Let me see what I can find...
mdw162
Posts: 23
Joined: Fri Feb 04, 2005 7:49 pm

Post by mdw162 »

Okay, more info...I think the issue was that I had register_globals turned off. I think that's the default for security reasons.
rony
Developer/Forum Admin
Posts: 6022
Joined: Mon Nov 17, 2003 6:35 pm
Location: Michigan, USA

Post by rony »

That is the default in later releases and a suggested practice.

Cacti no longer depends on register_globals. I would suggest the same for any patches/addons.
Tony Roman
adesimone
Posts: 32
Joined: Mon Jan 24, 2005 12:46 am

Post by adesimone »

you are correct -- I didn't realize I had register_globals turned off - I will fix the sloppy code and post an update tonight.
adesimone
Posts: 32
Joined: Mon Jan 24, 2005 12:46 am

Post by adesimone »

I meant ON..
mdw162
Posts: 23
Joined: Fri Feb 04, 2005 7:49 pm

Post by mdw162 »

Quick clarification...

In the screenshot above in this thread, you show a "red," "triggered" state for the following:

threshold=10
trigger=3
current=7

I thought the trigger value represented the number of times the threshold is exceeded before an alert is sent. In the above scenario, with the current value of only 7 (below the threshold), why is an alert sent?
adesimone
Posts: 32
Joined: Mon Jan 24, 2005 12:46 am

Post by adesimone »

yes, you are correct about the trigger.

the 'currently triggered' is set during the poll, not dynamically during page load - I adjusted the threshold after triggering it just to show the difference - not to confuse...

so in short - the threshold was adjusted but the poll did not happen yet, so that is why you see it triggered.

adesimone
mdw162
Posts: 23
Joined: Fri Feb 04, 2005 7:49 pm

Post by mdw162 »

Thanks for the clarification on the refresh issue.

Okay, I'm trying not to make this sound like a list of complaints -- I really think this project has potential, so please don't take this that way...if we (you) get it working it will rock!

That being said, I've noticed a few more things:

1. check-thold.php is called from within cron and if the cron user's home directory is not the cacti directory the first "include" statement will fail because it uses relative paths. I'm not sure of the best way to handle that.

2. The $cactibasedir variable in check-thold.php is hardcoded as "/var/www/html/cacti". It should be pulled from config.php.

3. In check-thold.php the line "shell_exec("mv $logfile $newlogfile");" moves the original log file "cacti.log", effectively deleting it. I think that's causing problems. First, Cacti uses that log under the "Utilities" page. Second, I think it causes anomalies in the threshold reporting. If a threshold is really reached, an email is sent and the trigger shows red, as it should. But if the value hasn't crossed the threshold, even after allowing the poller to run for an hour, it still shows red but just doesn't send an email. Plus, if you refresh the threshold page during the few seconds where "cacti.log" exists, the "current" value shows up as 0 with the trigger green.

I know it's almost impossible to troubleshoot given the few details I sent so I'll try to narrow it down some more.

Thanks again!
adesimone
Posts: 32
Joined: Mon Jan 24, 2005 12:46 am

Post by adesimone »

OK... version 2.0b - use the 'thold2.0b-update.zip' if you have already installed 2.0 otherwise download 'thold2.0b.zip' for a clean install

This version no longer depends on register_globals being on

as far as the issues -
1. I don't think this is an issue; I run this as root and it works fine - also, poller.php handles this the same way

2. I adjusted this as requested - it should not have been an issue because the install script was adjusting this accordingly - but I agree the config.php is a better approach

3. Currently, check-thold.php gets its 'current values' from the cacti.log - that is why moving the log file / deleting it is essential. If you have noticed, the install script changes your logging level to debug in order to grab this output, so a permanent copy of the logfile would become very large. Yes, I know this is very lame. It is my intention to just grab the last/current values from the rrd file itself using rrdfetch, etc. I was unable to get that to work and I had so many requests for an 8.6c compatible version - so I pushed that to the backburner. If anyone can help with this, that would be great. It would also help alleviate the platform dependency. I'm not sure that moving/deleting the log affects anything; the output on the threshld tab is pulled from the database. I have seen a few instances where 'current' shows up as '0'; try installing the update to see if it clears up that issue.

Keep the bugs coming...

thanks
ADesimone
Attachments
thold2.0b-update.zip - update patch (6.1 KiB)
thold2.0b.zip - thold 2.0b full version (43.86 KiB)
rpingar
Cacti User
Posts: 86
Joined: Mon Jun 07, 2004 8:17 am

Post by rpingar »

I don't have it installed yet, but I'd like to know if it is possible to be alerted when a device goes down using your add-on.

thanks
Matt_B
Posts: 6
Joined: Tue Feb 08, 2005 1:22 pm

Post by Matt_B »

Ok I have this installed now. Sorry about the mysql error I posted before. I had the older version of threshold installed (older version of cacti too). I upgraded and when I tried to install this version of threshold the tables already existed. So they could not be created. I dropped them and reinserted the mysql tables and all is good now.

I believe I have everything working now. The only thing I wish I could do is send emails on temperatures. I am guessing this must not be a gauge graph since it always says current = 0. This is not a bug, more of a feature request :)

<edit>I just reread your last post. I still have the 'current' shows up as '0' issue even after upgrading. The debug is on....

I have this in cacti.log:
02/09/2005 10:53:38 AM - CMDPHP: Poller[0] Host[13] SNMP: v1: 10.200.1.1, dsname: cisco_tempcur, oid: .1.3.6.1.4.1.9.9.13.1.3.1.3.1, output: 26

This in the threshold log:
element: HL_Dialin - 5 Minute Temperature alertstat: 0 elementid: 98 threshld: 15 rra: 116 trigger: 1 triggerct: 0 current:
</edit>

Also, can you send mail to more than one person? I have set up a mailing list but was curious.

Thank you again for all your hard work.

-Matt
www.mattsshack.com
mdw162
Posts: 23
Joined: Fri Feb 04, 2005 7:49 pm

Post by mdw162 »

Yes, you can email multiple people at the same time -- just use a comma-separated list in the input box. It worked for me.

I still think there are a lot of issues with the grepping of the log files, including the 'current value=0' issue. Here are some fixes to all the issues I've been having. It ain't pretty but everything works perfectly for me now. All changes are to "check-thold.php".

1. I figured out how to avoid relative paths in the include statements. This will make the calling path irrelevant. In check-thold.php:

Change
include("../include/config.php");
to
include(dirname(__FILE__) . "/../include/config.php");

2. You can use
$hostname = $_SERVER['SERVER_NAME'];
instead of
$hostname = $_SERVER["HOSTNAME"];
so you don't have to use
$hostname = exec("hostname");

3. Okay, here's the log-file-grep replacement. This queries the rra directly.

First, remove this line:
shell_exec("mv $logfile $newlogfile");

Second, replace this line
$currentval = shell_exec("grep _$rra.rrd $newlogfile | $cactibasedir/thold/parse.sh");
with this:
$rrdtoolpath = read_config_option("path_rrdtool");
$cactiroot = $config["base_path"];
$last_time_entry = exec("$rrdtoolpath last $cactiroot/rra/*_$rra.rrd MAX");

// If we have no real value in the last 900 seconds, treat as threshold-exceeded
// See additional || (OR) condition to the IF statement below
$last_needed = $last_time_entry + 900;

$current_time = time();
$output = `$rrdtoolpath fetch $cactiroot/rra/*_$rra.rrd MAX -s $last_time_entry | \
grep -vi nan | tail -1 | awk '{print $2}' `;
$currentval = round ($output);


Last, in the first "IF" statement change
if ($currentval > $threshld)
to
if ($currentval > $threshld || $last_needed < $current_time)

That's it. Should work great! Let me know if there are any issues.
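As a possible follow-up, the `grep -vi nan | tail -1 | awk` part of the pipeline could be done in PHP itself, which might ease the *NIX dependency mentioned earlier in the thread. The sketch below is only an illustration: `thold_last_value` is a hypothetical name, and it assumes `rrdtool fetch` prints data rows in the usual "timestamp: value" text form. It parses a string, so the sample data here is made up for the example.

```php
<?php
// Hypothetical helper: pick the newest non-NaN value from the first data
// column of `rrdtool fetch` text output. Returns null if nothing usable is found.
function thold_last_value($fetch_output)
{
    $last = null;
    foreach (explode("\n", $fetch_output) as $line) {
        // Data rows look like "1107964500: 2.6000000000e+01 ..."
        if (!preg_match('/^\s*(\d+):\s+(\S+)/', $line, $m)) {
            continue; // header line or blank line
        }
        // "nan" and "-nan" are not numeric, so they are skipped here.
        if (is_numeric($m[2])) {
            $last = (float) $m[2];
        }
    }
    return $last;
}

// Usage with made-up sample output (in check-thold.php the string would come
// from running "$rrdtoolpath fetch ..." via shell_exec or backticks).
$sample = "                     cisco_tempcur\n\n"
        . "1107964500: 2.6000000000e+01\n"
        . "1107964800: nan\n";
$value = thold_last_value($sample);
if ($value !== null) {
    $currentval = round($value);
    echo $currentval . "\n";
}
```

Returning null when no value is found leaves it to the caller to decide whether a missing reading counts as threshold-exceeded, as the 900-second check above does.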