[SOLVED] Boost "On Demand RRD" graph gaps

General discussion about Plugins for Cacti

Moderators: Developers, Moderators

jimj
Posts: 7
Joined: Wed Jun 24, 2009 4:31 pm

[SOLVED] Boost "On Demand RRD" graph gaps

Post by jimj »

When using boost, my graphs only update at the boost update time. For example, my poller interval is set to 1 minute and boost is set to update RRDs every 30 minutes. With this configuration, every 30 minutes I get 1 minute's worth of data graphed. It seems like all the data that boost gathered is thrown away rather than used (except for the last minute's worth).

If I change boost's "Maximum Records" down from 100000 to 100, the graphs update correctly every minute. This obviously negates all the benefits of running boost's "On Demand RRD Update" feature, but maybe this info will help someone smarter than me narrow down the source of the problem?

I'm using:
CactiEZ v0.6
Cacti v0.8.7c
Boost 2.4

I've set "max_heap_table_size=512M" in /etc/my.cnf (but I don't think this had anything to do with my problem).

This is a new Cacti server install, so I only have a few dozen devices defined right now. However I anticipate this growing to 200 soon (once I get boost working).


When I have boost set to update every 100 records my log looks like this:
06/24/2009 05:43:06 PM - SYSTEM STATS: Time:5.2904 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:240 RRDsProcessed:0
06/24/2009 05:43:07 PM - SYSTEM BOOST STATS: Time:0.4406 RRDUpdates:240
06/24/2009 05:44:08 PM - SYSTEM STATS: Time:6.3135 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:237 RRDsProcessed:0
06/24/2009 05:44:08 PM - SYSTEM BOOST STATS: Time:0.4396 RRDUpdates:237
06/24/2009 05:45:07 PM - SYSTEM STATS: Time:5.2940 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:235 RRDsProcessed:0
06/24/2009 05:45:07 PM - SYSTEM BOOST STATS: Time:0.4789 RRDUpdates:235
06/24/2009 05:46:06 PM - SYSTEM STATS: Time:5.2862 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:238 RRDsProcessed:0
06/24/2009 05:46:07 PM - SYSTEM BOOST STATS: Time:0.2646 RRDUpdates:238
06/24/2009 05:47:07 PM - SYSTEM STATS: Time:5.2856 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:239 RRDsProcessed:0
06/24/2009 05:47:08 PM - SYSTEM BOOST STATS: Time:0.5234 RRDUpdates:239
06/24/2009 05:48:07 PM - SYSTEM STATS: Time:5.2839 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:240 RRDsProcessed:0
06/24/2009 05:48:07 PM - SYSTEM BOOST STATS: Time:0.2561 RRDUpdates:240


When boost is set to update every 100,000 records (or every 30 minutes, whichever comes first), the log shows 30 minutes of 0 RRDsProcessed followed by one big update. Isn't this the way it should look? The relevant logs are below:
06/24/2009 02:36:06 PM - SYSTEM STATS: Time:5.2730 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:240 RRDsProcessed:0
06/24/2009 02:37:06 PM - SYSTEM STATS: Time:5.2635 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:237 RRDsProcessed:0
06/24/2009 02:38:07 PM - SYSTEM STATS: Time:5.2876 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:235 RRDsProcessed:0
06/24/2009 02:39:06 PM - SYSTEM STATS: Time:5.2638 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:238 RRDsProcessed:0
06/24/2009 02:40:06 PM - SYSTEM STATS: Time:5.2649 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:239 RRDsProcessed:0
06/24/2009 02:41:07 PM - SYSTEM STATS: Time:5.2715 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:240 RRDsProcessed:0
06/24/2009 02:42:08 PM - SYSTEM STATS: Time:6.2705 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:237 RRDsProcessed:0
06/24/2009 02:43:06 PM - SYSTEM STATS: Time:5.2560 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:235 RRDsProcessed:0
06/24/2009 02:44:06 PM - SYSTEM STATS: Time:5.2597 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:238 RRDsProcessed:0
06/24/2009 02:45:07 PM - SYSTEM STATS: Time:5.2549 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:239 RRDsProcessed:0
06/24/2009 02:46:07 PM - SYSTEM STATS: Time:5.2611 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:240 RRDsProcessed:0
06/24/2009 02:47:07 PM - SYSTEM STATS: Time:6.2512 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:237 RRDsProcessed:0
06/24/2009 02:48:07 PM - SYSTEM STATS: Time:5.2753 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:235 RRDsProcessed:0
06/24/2009 02:49:07 PM - SYSTEM STATS: Time:5.2672 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:238 RRDsProcessed:0
06/24/2009 02:50:06 PM - SYSTEM STATS: Time:5.2973 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:239 RRDsProcessed:0
06/24/2009 02:51:06 PM - SYSTEM STATS: Time:5.2764 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:240 RRDsProcessed:0
06/24/2009 02:52:08 PM - SYSTEM STATS: Time:6.2747 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:237 RRDsProcessed:0
06/24/2009 02:53:07 PM - SYSTEM STATS: Time:5.2945 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:235 RRDsProcessed:0
06/24/2009 02:54:06 PM - SYSTEM STATS: Time:5.2596 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:238 RRDsProcessed:0
06/24/2009 02:55:07 PM - SYSTEM STATS: Time:5.2714 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:239 RRDsProcessed:0
06/24/2009 02:56:07 PM - SYSTEM STATS: Time:5.2655 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:240 RRDsProcessed:0
06/24/2009 02:57:06 PM - SYSTEM STATS: Time:5.2655 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:237 RRDsProcessed:0
06/24/2009 02:58:06 PM - SYSTEM STATS: Time:5.2837 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:235 RRDsProcessed:0
06/24/2009 02:59:07 PM - SYSTEM STATS: Time:5.2629 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:238 RRDsProcessed:0
06/24/2009 03:00:07 PM - SYSTEM STATS: Time:5.4623 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:239 RRDsProcessed:0
06/24/2009 03:01:06 PM - SYSTEM STATS: Time:5.2699 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:240 RRDsProcessed:0
06/24/2009 03:02:06 PM - SYSTEM STATS: Time:5.2642 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:237 RRDsProcessed:0
06/24/2009 03:03:07 PM - SYSTEM STATS: Time:5.2645 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:235 RRDsProcessed:0
06/24/2009 03:04:07 PM - SYSTEM STATS: Time:5.2641 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:238 RRDsProcessed:0
06/24/2009 03:05:06 PM - SYSTEM STATS: Time:5.2660 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:239 RRDsProcessed:0
06/24/2009 03:06:06 PM - SYSTEM STATS: Time:5.2550 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:240 RRDsProcessed:0
06/24/2009 03:07:08 PM - SYSTEM STATS: Time:6.2582 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:237 RRDsProcessed:0
06/24/2009 03:07:11 PM - SYSTEM BOOST STATS: Time:2.6430 RRDUpdates:7144
Attachments:
boostStatus.jpg (Boost Status, 119.32 KiB)
techSupport.jpg (Tech Support, 119.43 KiB)
Last edited by jimj on Mon Jul 06, 2009 10:14 am, edited 1 time in total.
jimj
Posts: 7
Joined: Wed Jun 24, 2009 4:31 pm

My problem is related to rrd file permissions

Post by jimj »

I've narrowed this down to a permissions problem. When I create new devices and graph new interfaces, the new rrd files are created with "root" as the owner and "apache" as the group. If I chown the rrd files so that "apache" is the owner, everything works. So I have a workaround, manually chowning the rrd files every time I add new data sources, but surely there's a better way than this. Any ideas on what I'm doing wrong?

The rra directory permissions are set to:
drwsr-sr-t 2 apache apache 12288 Jun 25 15:21 rra

Here are the permissions on two of my rrd files. The first, owned by "apache apache", works fine; the second, owned by "root apache", does not.
-rw-r--r-- 1 apache apache 1.2M Jun 25 15:41 localhost_users_24.rrd
-rw-r--r-- 1 root apache 1.2M Jun 25 15:21 ssm_dr01_5min_cpu_191.rrd
jimj
Posts: 7
Joined: Wed Jun 24, 2009 4:31 pm

How do I get my rrd files to be owned by apache automatically

Post by jimj »

In case I wasn't clear in my last post, I think my basic question at this point is "how do I get my rrd files to be owned by apache automatically?"

TIA
jimj
Posts: 7
Joined: Wed Jun 24, 2009 4:31 pm

SOLVED: How do I get my rrd files to be owned by apache

Post by jimj »

jimj wrote:In case I wasn't clear in my last post, I think my basic question at this point is "how do I get my rrd files to be owned by apache automatically?"
I've solved my issue. With a default CactiEZ v0.6 install the poller runs from root's crontab as root. This causes all the RRD in "/var/www/html/rra" to be owned by root. This is fine until you enable boost. With boost these files must be owned by user "apache". To fix this I ran 'crontab -e' and made changed this line:
*/1 * * * * php /var/www/html/poller.php > /dev/null 2>&1
to this:
*/1 * * * * sudo -u apache php /var/www/html/poller.php > /dev/null 2>&1

I also ran 'chown apache:apache /var/www/html/rra/*'

I know that running 'sudo -u' from root's crontab isn't the cleanest way to solve this problem, but I wanted to stay as close as possible to a default CactiEZ install; changing the location of the cron entries might confuse my co-workers.
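For anyone hitting the same thing later, the fix above can be written out as a one-time repair plus a cron change. This is a sketch assuming the default CactiEZ layout from this thread (the rra path and the "apache" user come from the posts above; adjust for your install). The alternative shown is to give apache its own crontab entry instead of using 'sudo -u' in root's:

```shell
# One-time repair of rrd files already owned by root (run as root;
# path and user are from the posts above, adjust for your install):
chown -R apache:apache /var/www/html/rra

# Alternative to 'sudo -u' in root's crontab: put the poller entry
# in apache's own crontab instead ('crontab -u apache -e' as root):
*/1 * * * * php /var/www/html/poller.php > /dev/null 2>&1
```

Either way, new rrd files are then created by the apache user, so boost can write to them.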
Chrysalis
Cacti User
Posts: 70
Joined: Fri Sep 19, 2008 10:14 am
Location: UK

Post by Chrysalis »

I ended up using just boost's PNG cache feature; in addition, I'm using the old boost 1.8 CVS code, since 2.4 has the same bug that 1.7 hit me with.

This plugin certainly seems buggy.

On 2.4, no matter what options I set, I get 0 RRDs processed by my poller and the poller gets killed after 298 seconds for exceeding the process limit. In short, the poller hangs and the process sits there until it hits the 5 minute cutoff.

On 1.8 CVS the poller works, but if I enable on-demand updating I get the same graph gaps you mentioned. With it turned off there seems to be no noticeable impact, server-load-wise, for me anyway.
jimj
Posts: 7
Joined: Wed Jun 24, 2009 4:31 pm

No speed gain with boost

Post by jimj »

Chrysalis wrote:With it (boost) turned off there seems to be no noticeable impact for me anyway server load wise.
Your post got me wondering how much of a speed improvement I'm getting from boost. Much to my surprise, when I disabled the boost server my poller runtime stayed the same! I average 28.5 seconds to graph 890 data sources with or without boost's on-demand RRD updating enabled.

Until your post (which got me to check my boost performance gain) I hadn't seen any other negative boost posts. AFAIK boost is set up correctly; my rrd files only get updated every 30 minutes (unless I'm viewing them, of course).

I thought boost was supposed to cut polling time by an order of magnitude by eliminating the disk I/O every polling cycle. Did I completely misunderstand the point of "Enable On Demand RRD Updating"? Is there some other simple thing I could've missed that would cause this?
Chrysalis
Cacti User
Posts: 70
Joined: Fri Sep 19, 2008 10:14 am
Location: UK

Re: [SOLVED] Boost "On Demand RRD" graph gaps

Post by Chrysalis »

I'm back posting since very little feedback about this plugin gets posted.

I updated Cacti, which then requires a new boost plugin. The PIA site still showed 2.4 as the latest version, but their documentation site shows newer versions, e.g. boost 4.2.1. I installed that version; the problem I had in 2.4 is gone, and it has better logging.

Boost now logs how long it takes to make the graphs. If I disable on-demand updating, so the graphs are made every time the data sources run, this is how long it takes:

22 secs to query the data (on average), probably with some timeouts from dormant data sources bumping the time up.
0.5 secs to make the graphs.

On the old boost the second figure doesn't get reported. So if I understand right, enabling on-demand updating would just save that 0.5s on the 5-minute updates, which isn't really worth it in my case. It also creates a downside: every 30 minutes (or more, if configured higher) it writes a large chunk of data at once, adding a bigger load spike and risk of error. This plugin is clearly designed for hosts managing tens of thousands of graphs; I have over 300 and it only takes half a second. Nevertheless I still use it for the PNG caching feature.
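The trade-off described above boils down to on-demand updating being a write-back buffer: poller output accumulates in the database and is flushed to the .rrd files only when a record-count or age threshold is crossed, which is why graphs only advance at flush time. A toy shell sketch of that flush decision (the thresholds and variable names are illustrative, not Cacti's actual code):

```shell
# Toy model of boost's flush decision: buffer poller rows each cycle
# and flush when either a record-count or an age threshold is hit.
# MAX_RECORDS / MAX_AGE_CYCLES are made-up values for illustration.
MAX_RECORDS=5
MAX_AGE_CYCLES=3
records=0
age=0
for cycle in 1 2 3 4 5 6 7 8; do
    records=$((records + 2))   # pretend each poll buffers 2 rows
    age=$((age + 1))
    if [ "$records" -ge "$MAX_RECORDS" ] || [ "$age" -ge "$MAX_AGE_CYCLES" ]; then
        echo "cycle $cycle: flush $records rows to RRDs"
        records=0
        age=0
    else
        echo "cycle $cycle: buffer ($records rows)"
    fi
done
```

With a large MAX_RECORDS, most cycles only buffer, so the .rrd files (and therefore the graphs) sit still between flushes, matching the 30-minute gaps reported at the top of the thread.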
User avatar
TheWitness
Developer
Posts: 17053
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA
Contact:

Re: [SOLVED] Boost "On Demand RRD" graph gaps

Post by TheWitness »

Go to docs.cacti.net for updated versions.
