RELEASED: < 1 Minute to 5 Minute Polling Interval Patch

Important information about Cacti developments that all users should be interested in.

Moderators: Developers, Moderators

TheWitness
Developer
Posts: 16997
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

Cacti 0.8.7 has this feature included. We are still working on some bugs in 0.8.7 and will release 0.8.7a shortly. We have a CC this weekend.

TheWitness
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of dozens of Cacti plugins and customizations. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages


For those wondering, I'm still here, but lost in the shadows. Yearning for fewer bugs. Who wants a Cacti 1.3/2.0? Streams anyone?
joesatch66
Posts: 6
Joined: Wed Jun 27, 2007 8:31 am
Location: Paris

Poller & Cron

Post by joesatch66 »

Hi,

I'm using Cacti 0.8.7 with PluginArchitecture on Debian. My question is: what is the link between the polling interval in the Poller tab and the interval set in cron?
This is not really clear to me, and I've read a lot on the Cacti forums.
I've seen that the cron interval should generally be left at 5 min (should it really?), whatever the interval in the Poller tab settings may be.
But I've noticed that when the cron interval is set to 5 min and the poller interval to 30 seconds, graphing stops. (This is not due to data source problems, SNMP or anything else, since my system otherwise works quite well.)
I've also noticed that when the cron interval is set to 1 min and the poller interval is still set to 30 seconds, the graphs keep updating.
As you may have understood, I would like to poll my devices every 30 seconds (my boss will be happy).
To get this working I've modified the data templates so that the step is set to 30 (seconds) and the heartbeat to 60 (2*30, as suggested), and associated the 'Hourly' RRA with this data template.
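For reference, the step and heartbeat described above correspond to rrdtool's own `--step`, `DS:` and `RRA:` arguments. The following is only a minimal sketch of that mapping, not what Cacti itself generates: the file name `example.rrd`, the data source name `traffic_in`, the `GAUGE` type, the 0.5 xfiles factor and the 120-row archive are all assumptions chosen for illustration.

```python
def ds_arg(name: str, ds_type: str, heartbeat: int,
           minimum: str = "U", maximum: str = "U") -> str:
    """Build an rrdtool data source argument: DS:name:type:heartbeat:min:max."""
    return f"DS:{name}:{ds_type}:{heartbeat}:{minimum}:{maximum}"


def rra_arg(cf: str, xff: float, steps: int, rows: int) -> str:
    """Build an rrdtool archive argument: RRA:CF:xff:steps:rows."""
    return f"RRA:{cf}:{xff}:{steps}:{rows}"


step = 30              # seconds between primary data points (the template's step)
heartbeat = 2 * step   # 60 s, the 2 * step rule of thumb from the post

create_args = [
    "rrdtool", "create", "example.rrd",        # hypothetical file name
    "--step", str(step),
    ds_arg("traffic_in", "GAUGE", heartbeat),  # hypothetical data source
    rra_arg("AVERAGE", 0.5, 1, 120),           # steps=1: one row per 30 s sample
]
print(" ".join(create_args))
```

Printing the argument list rather than running it keeps the sketch free of side effects; the resulting line can be compared with the `rrdtool create` command Cacti logs when it creates a new RRD file.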


So, to sum up:

i) poller=30 sec + cron=5 min ==> no graphing
ii) poller=30 sec + cron=1 min ==> graphing OK, but is the graphed data really 30-second interval data? I mean, can I be sure that what is graphed has been retrieved every 30 seconds? I've checked the poller stats in the log and it is running every 30 seconds, so I think that's OK, but I'm not really sure.

Therefore my last question is: how do we set the desired polling interval using the cron interval and the Cacti poller interval? Especially for a 30-second interval, which may interest many people, since it's a good interval for polling many devices in a production environment.

Thank you very much for your help!
joesatch66
Posts: 6
Joined: Wed Jun 27, 2007 8:31 am
Location: Paris

Post by joesatch66 »

Hi,

I have another question concerning RRA definitions: the 'Hourly' RRA Timespan is set to 14400. And 14400 s = 240*60 s = 240 min = 4 hours, so this is not really 'Hourly' but '4-Hourly', or am I wrong?

If I want to have a Yearly RRA with no consolidation (no data loss: Steps=1), and with 30-second PDPs (this is my poller interval), I should define a new RRA like this:
-Steps=1
-Rows=1152000
-Timespan=33053184
Is that all right? I know this makes huge rrd files, but that's OK for me.
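A quick way to sanity-check numbers like these is to work out how much time a given row count actually covers. The sketch below is plain arithmetic, assuming each row holds `steps` primary data points of `step` seconds each; the function names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def rows_needed(retention_seconds: int, step: int, steps: int = 1) -> int:
    """Rows required so that an RRA covers retention_seconds (rounded up)."""
    seconds_per_row = step * steps
    return -(-retention_seconds // seconds_per_row)  # ceiling division


def coverage_days(rows: int, step: int, steps: int = 1) -> float:
    """Days of history held by an RRA with the given row count."""
    return rows * step * steps / SECONDS_PER_DAY


print(coverage_days(1_152_000, step=30))        # 400.0
print(rows_needed(365 * SECONDS_PER_DAY, 30))   # 1051200
print(rows_needed(33_053_184, 30))              # 1101773
```

With Steps=1 and a 30-second step, 1152000 rows hold 400 days of data, slightly more than the roughly 382 days that a Timespan of 33053184 seconds represents.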

Thank you very much.
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

You are aware that this granularity will only be visible when zooming deep into the graphs for timespans long ago?
In other words: at first glance, this will have virtually no effect on your graphs.

Second issue: when defining only a single RRA, you will see only a single timespan when selecting that graph (currently, you will see 4 graphs: daily, weekly, monthly, yearly). This is due to the close coupling between RRA definitions and graph timespans. This will not change until 0.8.8.
Reinhard
joesatch66
Posts: 6
Joined: Wed Jun 27, 2007 8:31 am
Location: Paris

Post by joesatch66 »

And for an Hourly RRA with no data loss (no consolidation) and 30-second PDPs, I should define:

-Steps=1 (no consolidation)
-Rows=120 (1 PDP measured every 30 seconds, so that's 120 PDPs in 1 hour)
-Timespan=3600 (hourly)

Am I heading in the right direction for having different RRAs based on a 30-second interval?

Thank you!
joesatch66
Posts: 6
Joined: Wed Jun 27, 2007 8:31 am
Location: Paris

Post by joesatch66 »

Yes, I am aware that this granularity will only be visible when zooming deep into the graphs for timespans long ago: this can be interesting for capacity planning and for establishing trends.
So I think I am on the right track with my personalized RRAs.

Yes, you are right: when you define only one RRA you see only a single timespan. I've noticed this... but is it OK if I define several RRAs, not only one? Not all the timespans will be displayed, only some of them, but that's not a problem for me.

Thanks
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

joesatch66 wrote: Yes, I am aware that this granularity will only be visible when zooming deep into the graphs for timespans long ago: this can be interesting for capacity planning and for establishing trends.
OK, so you're aware of the drawbacks, fine.
But I do not understand why, if you're aiming at capacity planning, you want to have 30 sec intervals for a whole year.
I do understand that a 30 sec interval will show MAXIMUM values more precisely than a 5 min interval. They will be higher compared to the default RRA settings. That's settled.
But when consolidating both AVERAGE and MAXIMUM, you will be able to deduce volume capacity information (AVERAGE) and peak capacity consumption (MAXIMUM) as well. MAXIMUM consolidation will not lose the exact overall MAXIMUM value of the consolidation interval. It will only lose the exact time when this MAXIMUM occurred! In most cases, for capacity issues this will be fine. Only for debugging purposes, when it is crucial to correlate events on the timeline, will consolidation lose vital information.
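To illustrate the point about consolidation, here is a minimal sketch using made-up sample values: ten 30-second readings are consolidated into one 5-minute value with AVERAGE and with MAXIMUM. The numbers are hypothetical and the code only mimics the arithmetic, it does not call rrdtool.

```python
# Ten 30-second samples (hypothetical values) covering one 5-minute interval.
samples = [12, 15, 11, 90, 14, 13, 16, 12, 15, 12]

average_cdp = sum(samples) / len(samples)   # what an AVERAGE RRA would keep
maximum_cdp = max(samples)                  # what a MAXIMUM RRA would keep

# Offset of the peak inside the interval, in seconds. This is the piece of
# information that is no longer available once only the consolidated value is stored.
peak_offset = samples.index(maximum_cdp) * 30

print(average_cdp)   # 21.0
print(maximum_cdp)   # 90
print(peak_offset)   # 90
```

The peak of 90 survives in the MAXIMUM value, while the AVERAGE alone (21.0) would hide it; neither consolidated value says where in the five minutes the peak fell.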
Surely, you may simply operate your rrd's as described. I only want to understand.
Reinhard
joesatch66
Posts: 6
Joined: Wed Jun 27, 2007 8:31 am
Location: Paris

Post by joesatch66 »

Maybe there is something I don't understand, so let me explain: setting the 'Steps' field to 1 in an RRA definition means you don't use consolidation at all. But you still need to fill in the 'Consolidation Functions' field (average, min, max or last), which means that you do use a consolidation function. So which is it: do you use it or not? If you set 'Steps' to 1 and also select one or several consolidation functions, wouldn't there be a bug, some kind of conflict?

gandalf wrote: But I do not understand why, if you're aiming at capacity planning, you want to have 30 sec intervals for a whole year.
Well, this is about requirements coming from my boss: he wants 30-second granularity for at least the past 3 months (maybe 6 months, not a whole year in fact).

gandalf wrote: But when consolidating both AVERAGE and MAXIMUM, you will be able to deduce volume capacity information (AVERAGE) and peak capacity consumption (MAXIMUM) as well. MAXIMUM consolidation will not lose the exact overall MAXIMUM value of the consolidation interval. It will only lose the exact time when this MAXIMUM occurred! In most cases, for capacity issues this will be fine. Only for debugging purposes, when it is crucial to correlate events on the timeline, will consolidation lose vital information.
Well, in fact I used your tutorial (very well done, by the way!) "Howto define a very BIG rra without data loss" located here:
http://docs.cacti.net/node/54
where it's suggested to use only the AVERAGE consolidation function. But I don't understand why only this one should be used, since in all the default RRAs both AVERAGE and MAXIMUM are selected. The little I understand is that if I use only AVERAGE I will not lose the time at which maximum values occurred, but I will lose the time at which average values occurred. I can't see what the consequence of this is... I've just been thinking: the best for me would be to select no consolidation functions in the RRA definitions, so that only exact values are stored and graphed. Would this be smart or not?

Anyway, thank you very much for your answers, it helps me going fast in my Cacti deployment
Last edited by joesatch66 on Sun Nov 04, 2007 10:19 am, edited 1 time in total.
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

joesatch66 wrote: Maybe there is something I don't understand, so let me explain: setting the 'Steps' field to 1 in an RRA definition means you don't use consolidation at all. But you still need to fill in the 'Consolidation Functions' field (average, min, max or last), which means that you do use a consolidation function. So which is it: do you use it or not? If you set 'Steps' to 1 and also select one or several consolidation functions, wouldn't there be a bug, some kind of conflict?
RRDTool simply requires a consolidation function. Your case is not the primary goal of rrdtool, to be honest. And IMHO cacti requires at least AVERAGE to be chosen. There was an error when selecting something different, if I remember correctly.

gandalf wrote: But I do not understand why, if you're aiming at capacity planning, you want to have 30 sec intervals for a whole year.
joesatch66 wrote: Well, this is about requirements coming from my boss: he wants 30-second granularity for at least the past 3 months (maybe 6 months, not a whole year in fact).
Sigh!

joesatch66 wrote: I've just been thinking: the best for me would be to select no consolidation functions in the RRA definitions, so that only exact values are stored and graphed. Would this be smart or not?
Ad 1) See above, you'll have to choose at least one consolidation function.
Ad 2) At least when updates do not arrive at exact 300 sec boundaries, rrdtool will "smooth" your data while adjusting it to a 300 sec boundary. There is no way to get rid of that.
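The "smoothing" mentioned above can be pictured with a small sketch. This is a simplified model of how a GAUGE value is normalized onto a step boundary (a time-weighted average over the step), and it ignores heartbeat and unknown-value handling; the function name and the sample updates are assumptions made for illustration, not rrdtool code.

```python
def normalize_gauge(updates, step, start, end):
    """Time-weighted average per step interval for GAUGE-style updates.

    updates: list of (timestamp, value) sorted by timestamp; each value is
    treated as holding from the previous update's timestamp up to its own.
    Returns {interval_end: value} for intervals fully covered by updates.
    """
    pdps = {}
    for interval_start in range(start, end, step):
        interval_end = interval_start + step
        weighted = 0.0
        covered = 0
        prev_t = updates[0][0]
        for t, value in updates[1:]:
            lo = max(prev_t, interval_start)
            hi = min(t, interval_end)
            if hi > lo:  # this update overlaps the interval
                weighted += value * (hi - lo)
                covered += hi - lo
            prev_t = t
        if covered == step:
            pdps[interval_end] = weighted / step
    return pdps


# Updates arrive 10 s and 20 s after the 300 s boundaries (hypothetical data).
updates = [(10, 10), (310, 10), (620, 20)]
pdps = normalize_gauge(updates, step=300, start=300, end=600)
print(round(pdps[600], 2))   # 19.67
```

Neither 10 nor 20 is stored for the interval ending at 600: the value becomes (10 * 10 + 20 * 290) / 300, a blend of the two readings. The same reasoning applies to a 30-second step.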

Reinhard
ahhdem
Posts: 18
Joined: Thu Jun 14, 2007 1:41 pm

Post by ahhdem »

I am using cacti 0.8.7 with cactid 0.8.7 to implement a 1 minute polling interval on my graphs. Currently I have one graph running at an interval of */1; the rest have their steps etc. defined at the typical 5 minute range.

This has had adverse effects on the 5 minute graphs, with white-out spots all over every graph (and with great inconsistency across them). My assumption is simply that the poller is not finishing inside of 1 minute, and the 5th minute gets stomped on before it finishes, thus losing some values?

What worries me is that I am only testing this on a small environment with 20 hosts and 90 DS. I was hoping to implement this in our live environment, however there we have more like 90 hosts and 1500 DS.

Here are the stats from 0.8.7, from a 1minute update:

11/02/2007 02:18:04 PM - SYSTEM STATS: Time:3.1218 Method:spine Processes:1 Threads:3 Hosts:20 HostsPerProcess:20 DataSources:90 RRDsProcessed:1
Loop Time is: 3.1227629184723
Sleep Time is: 56.877237081528
Total Time is: 183.12675189972
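The Loop Time and Sleep Time figures above add up to the 60-second interval, which suggests that the poller sleeps for whatever remains of the interval after each loop. That is an inference from these numbers, not something taken from the Cacti source; the small check below just repeats the arithmetic.

```python
poller_interval = 60.0          # seconds, the 1-minute interval being tested
loop_time = 3.1227629184723     # "Loop Time" from the log above
sleep_time = 56.877237081528    # "Sleep Time" from the log above

# If the poller sleeps for the rest of the interval, these should match.
expected_sleep = poller_interval - loop_time
matches = abs(expected_sleep - sleep_time) < 1e-6

print(round(expected_sleep, 6))
print(matches)   # True
```

The Total Time of about 183 seconds does not fit this pattern, which is consistent with something other than a single clean polling cycle being measured.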

The total time is actually -longer- than it takes on my production machine, which is half the machine I'm currently testing on and supports twice the average load (0.2 on the current test box, 0.5/8 on the prod box).

I'm very confused. I am using cactid 0.8.6 on the production machine. The only difference is the poller interval frequency, which leads me to my final real concern/question.

Cacti seems to be collecting data from -all- the hosts and services every time it runs, not just those configured for a 1 minute interval. This seems horribly inefficient. If a host is at a 5 minute interval, and it has not been at -least- 4 minutes since the previous polling, then, in my opinion, it should not be polled.

so, basically
if (time_polled + step) > now(): continue
else: poll
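The check proposed above (skip a host until its own step has elapsed since it was last polled) could be sketched like this. It illustrates the proposal only, it is not how Cacti's poller is implemented, and the host names and the `hosts` structure are hypothetical.

```python
def is_due(last_polled: float, step: float, now: float) -> bool:
    """True once at least `step` seconds have passed since the last poll."""
    return last_polled + step <= now


def hosts_to_poll(hosts: dict, now: float) -> list:
    """hosts maps a name to (last_polled, step); return the names that are due."""
    return [name for name, (last, step) in hosts.items() if is_due(last, step, now)]


# Hypothetical hosts: one on a 1-minute step, one on a 5-minute step.
hosts = {"router": (0, 60), "switch": (0, 300)}

print(hosts_to_poll(hosts, now=60))    # ['router']
print(hosts_to_poll(hosts, now=300))   # ['router', 'switch']
```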

Does this make sense? Is this the way it should function, and have I done something improper? From my experience so far, the only solution for leveraging cacti for both trend monitoring and debugging is to implement a separate 'real-time' cacti where all graphs are at a <2 min frequency.

Any advice/insight is much appreciated!

-Adam
ahhdem
Posts: 18
Joined: Thu Jun 14, 2007 1:41 pm

Post by ahhdem »

Also, here is the output of the same poller (spine-0.8.7) running with the full set of DS being processed into RRDs:

11/02/2007 02:27:31 PM - SYSTEM STATS: Time:30.2963 Method:spine Processes:1 Threads:3 Hosts:20 HostsPerProcess:20 DataSources:71 RRDsProcessed:72
Loop Time is: 30.297376871109
Sleep Time is: 29.702623128891
Total Time is: 270.30430388451


again this machine is largely bored, a dual proc dual core Opteron sun x2200 with 4GB ram.

[root@otis ~]# uptime
14:29:07 up 24 days, 4:06, 27 users, load average: 0.03, 0.03, 0.00
[root@otis ~]# free
             total       used       free     shared    buffers     cached
Mem:       4151468    1076956    3074512          0     509604      99316
-/+ buffers/cache:     468036    3683432
Swap:      2097144        180    2096964
[root@otis ~]#
cigamit
Developer
Posts: 3363
Joined: Thu Apr 07, 2005 3:29 pm
Location: B/CS Texas

Post by cigamit »

Rebuild your poller cache. It should split up all your 5 minutes graphs over the 5 pollings that occur during that 5 minutes.
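The idea of spreading 5-minute items over the five 1-minute runs can be pictured with a toy example. How Cacti actually assigns items when the poller cache is rebuilt is not shown in this thread; the round-robin assignment and the names below are assumptions made purely for illustration.

```python
def spread_over_runs(data_sources, runs):
    """Assign each data source to one of `runs` polling slots, round-robin."""
    slots = [[] for _ in range(runs)]
    for i, ds in enumerate(data_sources):
        slots[i % runs].append(ds)
    return slots


# Twelve hypothetical 5-minute data sources, five 1-minute runs per 5 minutes.
five_minute_sources = [f"ds{i}" for i in range(12)]
slots = spread_over_runs(five_minute_sources, 5)

print([len(s) for s in slots])   # [3, 3, 2, 2, 2]
```

Each data source is still polled once per 5 minutes, but any single run only handles a fraction of them.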

Also, I think you might be having an issue with multiple pollers running at the same time. Not because of your cron, but because of a bug in the way Cacti determines your cron interval. If you manually hardcode the cron_interval to 60 in poller.php right before this line it should help.

/* assume a scheduled task of either 60 or 300 seconds */

Then kill all pollers currently still running, and you should be good to go.
ahhdem
Posts: 18
Joined: Thu Jun 14, 2007 1:41 pm

Post by ahhdem »

my hero! *swoon*

11/02/2007 03:50:20 PM - SYSTEM STATS: Time:19.2399 Method:spine Processes:1 Threads:3 Hosts:20 HostsPerProcess:20 DataSources:54 RRDsProcessed:54
Loop Time is: 19.241475105286
Sleep Time is: 40.7568359375
Total Time is: 19.2431640625


11/02/2007 03:51:21 PM - SYSTEM STATS: Time:20.2419 Method:spine Processes:1 Threads:3 Hosts:20 HostsPerProcess:20 DataSources:55 RRDsProcessed:55
Loop Time is: 20.24334192276
Sleep Time is: 39.754957914352
Total Time is: 20.245042085648



This isn't actually going to be inserting into each rrd every minute, correct? Each run says RRDsProcessed 54 or 55. Just want to be sure.


Thanks for the quick response!!
ashfieldm
Posts: 25
Joined: Tue Oct 11, 2005 8:47 am

Post by ashfieldm »

Is there any HOWTO for this feature? Can I get 30-second polling to work with this now?
ashfieldm
Posts: 25
Joined: Tue Oct 11, 2005 8:47 am

Post by ashfieldm »

Still trying to figure this out.

If I have existing devices that I was graphing before I upgraded to 0.8.7a and that I now want to start graphing at 30 second intervals, how do I go about doing that? Do I have to blow away everything and start over?

What about the 1 minute interval graph? By simply adding that, am I actually getting 1 minute data steps?

Again, if anyone has a howto or something, especially details about how to migrate from an old version of cacti to using this feature, it would be much appreciated.

Thanks