Baseline monitoring information

Support questions about the Threshold plugin

Moderators: Developers, Moderators

gabar
Posts: 27
Joined: Mon Jun 13, 2005 3:53 am
Location: Rome

Baseline monitoring information

Post by gabar »

Hi to all,
I tried to find some information about "thold baseline monitoring" by reading many messages on this forum, but I didn't find anything.
I have some questions about this feature:

1) Where can I find documentation for the thold plugin (if any exists)?
2) Can someone explain the meaning of "baseline monitoring"? I understand that this feature makes it possible to trigger alarm notifications when traffic is "strange" compared to values from the past (the default values are typically the last 3 hours (10800 seconds) of the previous day (86400 seconds)), but I don't understand which values are considered. In particular, is it possible to configure this feature to show (inside the notification, for example) the values used as reference values? Can anyone explain how this feature works?

Thank you very much in advance!
Last edited by gabar on Wed Jul 09, 2008 6:53 am, edited 1 time in total.
gabar
Posts: 27
Joined: Mon Jun 13, 2005 3:53 am
Location: Rome

Any news about baseline monitoring?

Post by gabar »

Hi,
Is anyone using baseline monitoring successfully? Does it work?

Thanks
semakka
Posts: 20
Joined: Mon Oct 23, 2006 6:37 am

Post by semakka »

Obviously not

cheers
bengelly
Cacti User
Posts: 57
Joined: Fri Jan 26, 2007 2:28 am

Post by bengelly »

Hi,

Yes, the baselining function is very strange indeed... I can't seem to understand how it works...

Could someone give us more details or even examples (screenshots, etc.)?

Thanks
wayno
Posts: 14
Joined: Tue Apr 10, 2007 5:59 pm
Location: Darwin, Australia

Post by wayno »

Add me to this list too :)

I have been experimenting a bit with the settings and have gotten some results. I set the baseline deviations in the settings and have had a handful of alerts based on these figures (I have no hard thresholds set), but I would like to be able to see the baseline it has calculated and the dynamic thresholds it uses.

Is it possible to get this into a future release or does anybody have a script that can do it?
Attachments
Note the baseline deviation up / down settings. I've had some alerts from THold based on these settings
thold.jpg (241.36 KiB) Viewed 13135 times
Criggie
Posts: 16
Joined: Sat Jul 21, 2007 4:30 am
Location: Christchurch, New Zealand

Baselining

Post by Criggie »

I've got thresholding working fine, but not baselining. I've tried dropping my windows down to 600 seconds, but nothing ever shows that a baseline has been calculated.

Should there be a line drawn on the graph?
jjhans
Posts: 8
Joined: Thu Jan 03, 2008 10:40 am

Post by jjhans »

After a bit of experimenting, here's my experience with baselining.

The first person to post in this topic was 90% there. Baselining looks back a certain amount of time in the past (by default, 24 hours), and grabs a sample of data from then (by default, 3 hours' worth) to use as a "baseline" value. In particular, I believe that it grabs the minimum and maximum values from that period. Then, if the current value is more than, say, 10% higher than the maximum of the sampled period, the threshold is breached.

That's the basic idea... I haven't worked out all the details, but it's enough to get me started with thresholds.
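To make the description above concrete, here is a minimal Python sketch of the check as described in this post. It is not thold's actual code: the function names, the parameter names, the centring of the sample window on "now minus offset", and the symmetric lower bound are all assumptions made for illustration.

```python
# Sketch of the baseline check as described in this post.
# NOT thold's implementation; every name here is hypothetical.

def reference_range(samples, now, offset=86400, window=10800):
    """Return (min, max) of the values in the reference window, or None.

    samples: list of (epoch_seconds, value) pairs.
    offset:  how far back to look (default 24 hours).
    window:  how much data to sample (default 3 hours).
    Assumption: the window is centred on now - offset.
    """
    center = now - offset
    start, end = center - window / 2, center + window / 2
    values = [v for t, v in samples if start <= t <= end]
    if not values:
        return None
    return min(values), max(values)


def check_baseline(current, ref_min, ref_max, pct_up=10.0, pct_down=10.0):
    """Compare the current value against the reference range.

    The upper bound follows the post: pct_up percent above the maximum.
    The lower bound (pct_down percent below the minimum) is an
    assumption made by symmetry.
    """
    upper = ref_max + abs(ref_max) * pct_up / 100.0
    lower = ref_min - abs(ref_min) * pct_down / 100.0
    if current > upper:
        return "above"
    if current < lower:
        return "below"
    return "ok"


# Firewall example: yesterday's CPU peaked at 1%, today it is near 20%.
print(check_baseline(20.0, ref_min=0.5, ref_max=1.0))  # above
```

With a 10% deviation and a reference maximum of 1, the upper bound in this sketch is only 1.1, so a reading of 20 breaches it by a wide margin.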

For an example, check out the attached graph, showing a firewall's cpu usage over the last 48 hours. You can see that at about this time yesterday, the cpu was showing no more than 1% usage. For the last few hours, it's been at almost 20% usage, which is more than the 10% threshold I had set. This triggered the threshold to fire and send me an email, which inspired me to search the forums and see if anybody had any good tips on how to use thresholds, and I found this topic. :)
Attachments
firewall_cpu.png (24.98 KiB) Viewed 11710 times
jjhans
Posts: 8
Joined: Thu Jan 03, 2008 10:40 am

Post by jjhans »

There's another thing to consider about baseline thresholds... they don't necessarily know anything about the data they're analyzing, which may lead to unexpected results.

Here's what I mean: If you're looking at, say, data on an interface, and you set the threshold to 20% above the baseline, then the threshold will trigger if the current traffic is 20% higher than the largest spike around this time yesterday (assuming you use the default settings). This does what you expect it to.

However, if you're monitoring something like processor usage, it's more complicated. Suppose again that the threshold is set to 20% above the baseline. If the device was running at 10% processor usage yesterday, you might expect that it needs to be at 30% today in order to trigger the threshold, because that's an extra 20%. However, the threshold will actually trigger at only 12%... because the number 12 is 20% higher than the number 10!
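The arithmetic behind that difference can be written out as a small Python sketch. The two function names are hypothetical, and the claim that the threshold uses the relative form rests only on the behaviour described in this post.

```python
# Two ways to read "20% above the baseline"; names are illustrative only.

def relative_trigger(baseline, pct):
    """Trigger level when pct is a percentage OF the baseline value."""
    return baseline + baseline * pct / 100.0


def percentage_point_trigger(baseline, points):
    """Trigger level when the setting is read as added percentage points."""
    return baseline + points


# CPU was at 10% yesterday and the deviation is set to 20.
print(relative_trigger(10, 20))          # 12.0 (what the post describes)
print(percentage_point_trigger(10, 20))  # 30   (what you might expect)
```

For a metric that is itself a percentage, the gap between the two readings grows as the baseline gets smaller, which is why low CPU baselines trigger so easily.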
leonardo_gyn
Cacti User
Posts: 85
Joined: Sat Jan 22, 2005 4:51 pm

Post by leonardo_gyn »

I'm still trying to figure out exactly how baseline works.

But it seems to me that using baseline to monitor things that can vary a lot, like interface traffic and CPU usage, is not a good idea.

I'm studying baseline monitoring because I want to monitor disk usage with it. Disk usage, unlike interface traffic and CPU usage, does not usually vary much over a short period of time. It seems to me that this is the right situation for baseline monitoring.

As discussed above, not all values are suitable for baseline monitoring, as I understand it so far. Baseline monitoring of some values, like CPU as stated, can trigger several false positives.
znapel
Posts: 9
Joined: Fri Nov 09, 2007 3:50 am

Thold Baselining

Post by znapel »

My $.02... I seldom use the baseline feature because of its very nature... If I am watching CPU, disk, or networks, I can usually set a threshold. If any of these things are below a threshold, I don't normally care. So my CPU is running 30% more today than yesterday; unless it's above a certain threshold, who cares? The one thing I have found it useful for is monitoring cable modem users. The reason it is helpful is that the number of people online is constantly changing because of growth or because users are being moved around to different equipment. I do care if the number of people is less than X% of the number of people online an hour ago... It's good for tracking a deviation from a constantly changing number. FWIW, for those who might not understand it: it takes some fiddling around to get a good window of stats sampled and such. And to be honest, I don't understand it 100% myself.

That said (and I already saw a single post on this issue with no reply), has anyone had an issue where an upper threshold is not set (left blank) but it alarms as having been exceeded anyway? It seems like a bug, but the code apparently hasn't been touched in about 10 months, so I don't think we're likely to see anything changed or fixed with it anytime soon.

Znapel