Hard to get Cacti to graph count of rare events accurately

JoelKatz
Posts: 4
Joined: Tue Nov 03, 2009 11:42 pm
Location: Murphys, CA

Hard to get Cacti to graph count of rare events accurately

Post by JoelKatz »

I'm using Cacti to graph 'rare' events. I have a custom script that grabs the event count for the preceding minute and passes it to Cacti. I use one-minute intervals for everything.

The problem is that Cacti graphs this very oddly. For example, suppose Cacti polls at 15:36:55 and gets 0 (because there were no events in the preceding minute), then polls at 15:37:55 and gets 1 (because there was one). Instead of reporting 0 for 15:36 and 1 for 15:37, it will report something like .95 for 15:37 and .05 for 15:38, because it interpolates the time of the event by weighting both data points (the one and the zero).
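To make the effect concrete, here is a minimal Python sketch of time-weighted normalization onto fixed step boundaries. It is a simplified model of what the RRD layer appears to do, not its actual code: it ignores heartbeats and unknown values, and the names `normalize` and `hms` are made up for this example. The exact split, and which slot gets the larger share, depends on the poll offset and on how the slots are labelled.

```python
def normalize(samples, step=60):
    """Spread each reading over the step-sized slots it overlaps.

    samples: sorted list of (timestamp, value); each value is taken to
    apply to the whole interval since the previous sample.
    Returns {slot_end_time: time-weighted average for that slot}.
    """
    slots = {}
    for (t_prev, _), (t_now, value) in zip(samples, samples[1:]):
        t = t_prev
        while t < t_now:
            # End of the step-aligned slot that contains time t.
            slot_end = (t // step + 1) * step
            seg_end = min(slot_end, t_now)
            # Weight the reading by the fraction of the slot it covers.
            slots[slot_end] = slots.get(slot_end, 0.0) + value * (seg_end - t) / step
            t = seg_end
    return slots


def hms(h, m, s):
    """Seconds since midnight, to keep the example readable."""
    return h * 3600 + m * 60 + s


# Polls at :55 past each minute; one event seen by the 15:37:55 poll.
polls = [
    (hms(15, 35, 55), 0),
    (hms(15, 36, 55), 0),
    (hms(15, 37, 55), 1),
    (hms(15, 38, 55), 0),
]
slots = normalize(polls)
for end in sorted(slots):
    print(end, round(slots[end], 3))
```

In this model the single event is split roughly 0.08 / 0.92 across the slots ending at 15:37:00 and 15:38:00. No slot ever shows 1, although the slot values still add up to 1.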

This looks very silly on the graph, and when you zoom in, a single discrete event looks like a stair step, and the 'max' value is useless.

Also irritating, there's no way to get a 'total' on a graph. It would be nice if you could zoom in on a time range of interest and get the total count for that range.

Think about logging, say, errored seconds on a T1. Most of the time, there are none. But if you zoom in on an interval where there are some, the total number for that interval would be useful to see on the graph.
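As a rough illustration of the total I mean, here is a small Python sketch that sums the per-slot values over a zoomed-in range. The function name and the `per_second` switch are assumptions for this example: if the RRD holds counts per step (as in my setup), the total is just the sum; if it holds per-second rates, each slot has to be multiplied by the step length first. As far as I know, rrdtool's graph language has a `TOTAL` VDEF function that works the second way.

```python
def total_events(slot_values, step=60, per_second=False):
    """Total event count over a range of consolidated slots.

    slot_values: per-slot values from the data store; None marks an
    unknown slot and is skipped.
    per_second: True if the values are rates per second, False if they
    are already counts per step.
    """
    total = 0.0
    for value in slot_values:
        if value is None:
            continue
        total += value * step if per_second else value
    return total


# A single event smeared across two slots still totals one event.
print(total_events([0, 5 / 60, 55 / 60, 0]))
```

So even when the individual slots look wrong, a total over the range would still give the right count.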

Is there any way to make discrete event counts graph sanely with Cacti?
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

Please see 1st link of my sig, chapter on rrdtool.
The issue you're asking for is a "feature" of rrdtool. Cacti uses rrdtool as a data store and a graph generator. And there are some good reasons why it works this way.
But there are rumours that Tobi Oetiker will add a feature to cope with such a request. I'm not sure whether this has already been published.
R.
JoelKatz
Posts: 4
Joined: Tue Nov 03, 2009 11:42 pm
Location: Murphys, CA

Post by JoelKatz »

So is my best solution, at least for now, not to use Cacti's poller? I can just insert my own counts into the RRD and use the timestamp of the test interval, not the actual time the query is done.
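Something like the following Python sketch is what I have in mind. It only builds the argument list for `rrdtool update <file> <timestamp>:<value>` and does not run it. The file name, the function name, and the assumption that the data source is a GAUGE holding counts per step are all mine. The idea is that an update stamped exactly on a step boundary should not need to be split across two slots.

```python
def aligned_update_args(rrd_path, interval_end, count, step=60):
    """Build an 'rrdtool update' command stamped on a step boundary.

    interval_end: Unix time at which the counted interval ended.
    The timestamp is floored to a multiple of step, so it names the
    counted interval rather than the moment the query happened to run.
    """
    ts = (interval_end // step) * step
    return ["rrdtool", "update", rrd_path, f"{ts}:{count}"]


# A query that ran 55 seconds into a step is stamped at the step start.
print(aligned_update_args("events.rrd", 56275, 1))
```

The resulting list could be handed to `subprocess.run` from a cron job instead of going through the poller.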

I think the simplest way to get what I want easily would be a Cacti option to take a parameter from the script that would override the timestamp it passes to rrdtool. It would be awesome if Cacti could tell me the last timestamp that I gave it and accept more than one data/timestamp set, but perhaps that's asking for too much.

That would be an awesome feature though. A new 'script' type that is passed the last timestamp Cacti got from that script and can return more than one line, each containing a timestamp and a data set. Cacti would insert each line of data returned into the RRD with the timestamp provided. The poller would cache the timestamp of the last line of data returned to pass back to the script.
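To sketch what such a script might return: the Python below takes the last timestamp it reported and emits one `timestamp:count` line per completed step since then. This interface is purely hypothetical. Cacti does not offer it, and the function name, the line format, and the choice to stamp each line with the end of its step are assumptions for the example.

```python
from collections import Counter


def lines_since(last_ts, now, event_times, step=60):
    """Return one 'timestamp:count' line per completed step after last_ts.

    last_ts: step-aligned timestamp of the last line previously returned.
    now: current time; only steps that have fully ended are reported.
    event_times: timestamps of the individual events (e.g. from a log).
    Each line is stamped with the end of its step; an event at time t
    is counted in the step covering [end - step, end).
    """
    counts = Counter((t // step + 1) * step for t in event_times)
    first = (last_ts // step + 1) * step
    last = (now // step) * step
    return [f"{ts}:{counts.get(ts, 0)}" for ts in range(first, last + 1, step)]


# One event at 56230, three completed steps since the last report.
for line in lines_since(56160, 56340, [56230]):
    print(line)
```

The poller would then store the timestamp of the final line and pass it back as `last_ts` on the next run.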

This would be great for log processing scripts as well as error count scripts.

Thanks for the prompt reply!
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

JoelKatz wrote:I think the simplest way to get what I want easily would be a Cacti option to take a parameter from the script that would override the timestamp it passes to rrdtool. It would be awesome if Cacti could tell me the last timestamp that I gave it and accept more than one data/timestamp set, but perhaps that's asking for too much.
That was indeed already discussed but not yet implemented. I'm not even sure that such a Mantis feature request exists; see http://www.cacti.net/bugs.php
R.
JoelKatz
Posts: 4
Joined: Tue Nov 03, 2009 11:42 pm
Location: Murphys, CA

I found a simpler solution, and another problem case.

Post by JoelKatz »

I noticed substantially the same problem graphing server response times. My server response times are typically around 1ms, but there's the occasional 2000ms response.

The way Cacti works, if that single 2000ms probe response occurs at, say, 10:35:00, I see 2ms, 2000ms, 2ms in the graph. If it occurs at 10:37:30, I see 2ms, 1001ms, 1001ms, 2ms.

So the handling of a single anomalous reading is similarly affected by this "problem". Having thought about it, I think Cacti should have an "exact time" option that causes Cacti to automatically report the event to the RRD with the time rounded to the nearest step, which under normal conditions is exactly one step after the previous update. This will prevent any interpolation or rounding.
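A minimal Python sketch of the rounding rule I mean, with made-up names (no such option exists in Cacti as far as this thread establishes). It rounds the poll time to the nearest step boundary and, if that would not move past the previous update, advances one step instead, since rrdtool rejects updates whose timestamp is not later than the last one.

```python
def exact_time(poll_ts, prev_ts=None, step=60):
    """Pick a step-aligned timestamp for a reading taken at poll_ts.

    poll_ts: actual Unix time of the poll.
    prev_ts: aligned timestamp used for the previous update, if any.
    """
    # Integer round-to-nearest multiple of step (ties round up).
    rounded = ((poll_ts + step // 2) // step) * step
    if prev_ts is not None and rounded <= prev_ts:
        # Never reuse or go behind the previous slot.
        rounded = prev_ts + step
    return rounded


print(exact_time(56275, prev_ts=56220))  # poll a little late in the step
print(exact_time(56215, prev_ts=56220))  # poll jittered early
```

With a steady poller each reading lands exactly one step after the last, so every stored value is a sampled one.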

As a bonus, it guarantees that the RRD contains the actual sampled data rather than interpolated data. IMO, this is sometimes valuable for archiving purposes. (Though, in fairness, it means the RRD has less accurate time information, but you can argue it's not intended to store times more accurate than the step.)
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

I'm not sure that your statement is correct.
But please know that rrdtool will have (or already has; I may have missed it recently) a new option to suppress the time adjustment to polling intervals. From my point of view, both issues will have to be taken into account.
R.