When using boost my graphs only update at the boost update time. For example my poller time is set to 1 minute and boost is set to update RRDs every 30 minutes. With this configuration every 30 minutes I get 1 minute's worth of data graphed. It seems like all the data that boost gathered is thrown away and not used (except for the last minute's worth).
If I change the boost "Maximum Records" down from 100000 to 100 the graphs get updated correctly every minute. This obviously negates all the benefits of running boost's "On Demand RRD Update" feature, but maybe this info will help someone smarter than me narrow down the source of the problem?
I'm using:
CactiEZ v0.6
Cacti v0.8.7c
Boost 2.4
I've set "max_heap_table_size=512M" in /etc/my.cnf (but I don't think this had anything to do with my problem).
This is a new Cacti server install, so I only have a few dozen devices defined right now. However I anticipate this growing to 200 soon (once I get boost working).
When I have boost set to update every 100 records my log looks like this:
06/24/2009 05:43:06 PM - SYSTEM STATS: Time:5.2904 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:240 RRDsProcessed:0
06/24/2009 05:43:07 PM - SYSTEM BOOST STATS: Time:0.4406 RRDUpdates:240
06/24/2009 05:44:08 PM - SYSTEM STATS: Time:6.3135 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:237 RRDsProcessed:0
06/24/2009 05:44:08 PM - SYSTEM BOOST STATS: Time:0.4396 RRDUpdates:237
06/24/2009 05:45:07 PM - SYSTEM STATS: Time:5.2940 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:235 RRDsProcessed:0
06/24/2009 05:45:07 PM - SYSTEM BOOST STATS: Time:0.4789 RRDUpdates:235
06/24/2009 05:46:06 PM - SYSTEM STATS: Time:5.2862 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:238 RRDsProcessed:0
06/24/2009 05:46:07 PM - SYSTEM BOOST STATS: Time:0.2646 RRDUpdates:238
06/24/2009 05:47:07 PM - SYSTEM STATS: Time:5.2856 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:239 RRDsProcessed:0
06/24/2009 05:47:08 PM - SYSTEM BOOST STATS: Time:0.5234 RRDUpdates:239
06/24/2009 05:48:07 PM - SYSTEM STATS: Time:5.2839 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:240 RRDsProcessed:0
06/24/2009 05:48:07 PM - SYSTEM BOOST STATS: Time:0.2561 RRDUpdates:240
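For what it's worth, the per-minute flushing with "Maximum Records" at 100 makes sense if boost flushes whenever either the record count or the timer threshold is reached first. A minimal sketch of that decision (my own model of the behavior, not the plugin's actual code):

```python
# Hypothetical model of boost's on-demand flush trigger (NOT actual plugin
# code): flush when buffered records reach Maximum Records, OR when the
# configured update interval has elapsed, whichever comes first.
import time


class BoostBuffer:
    def __init__(self, max_records, max_interval_s):
        self.max_records = max_records
        self.max_interval_s = max_interval_s
        self.records = 0
        self.last_flush = time.time()

    def add(self, n):
        """Buffer n new poller records."""
        self.records += n

    def should_flush(self, now=None):
        now = time.time() if now is None else now
        return (self.records >= self.max_records
                or now - self.last_flush >= self.max_interval_s)


# With 240 data sources arriving per 1-minute poll and Maximum Records = 100,
# the record threshold is crossed on every single cycle, which would explain
# the BOOST STATS line after every poll in the log above.
buf = BoostBuffer(max_records=100, max_interval_s=30 * 60)
buf.add(240)
print(buf.should_flush())  # True: 240 >= 100
```

With Maximum Records at 100,000, neither threshold is hit until the 30-minute timer expires, matching the single big update in the second log excerpt.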
When boost is set to update every 100,000 records (or every 30 minutes, whichever comes first), the log shows 30 minutes of RRDsProcessed:0, followed by one big update. Isn't this the way it should look? My relevant logs are below:
06/24/2009 02:36:06 PM - SYSTEM STATS: Time:5.2730 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:240 RRDsProcessed:0
06/24/2009 02:37:06 PM - SYSTEM STATS: Time:5.2635 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:237 RRDsProcessed:0
06/24/2009 02:38:07 PM - SYSTEM STATS: Time:5.2876 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:235 RRDsProcessed:0
06/24/2009 02:39:06 PM - SYSTEM STATS: Time:5.2638 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:238 RRDsProcessed:0
06/24/2009 02:40:06 PM - SYSTEM STATS: Time:5.2649 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:239 RRDsProcessed:0
06/24/2009 02:41:07 PM - SYSTEM STATS: Time:5.2715 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:240 RRDsProcessed:0
06/24/2009 02:42:08 PM - SYSTEM STATS: Time:6.2705 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:237 RRDsProcessed:0
06/24/2009 02:43:06 PM - SYSTEM STATS: Time:5.2560 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:235 RRDsProcessed:0
06/24/2009 02:44:06 PM - SYSTEM STATS: Time:5.2597 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:238 RRDsProcessed:0
06/24/2009 02:45:07 PM - SYSTEM STATS: Time:5.2549 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:239 RRDsProcessed:0
06/24/2009 02:46:07 PM - SYSTEM STATS: Time:5.2611 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:240 RRDsProcessed:0
06/24/2009 02:47:07 PM - SYSTEM STATS: Time:6.2512 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:237 RRDsProcessed:0
06/24/2009 02:48:07 PM - SYSTEM STATS: Time:5.2753 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:235 RRDsProcessed:0
06/24/2009 02:49:07 PM - SYSTEM STATS: Time:5.2672 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:238 RRDsProcessed:0
06/24/2009 02:50:06 PM - SYSTEM STATS: Time:5.2973 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:239 RRDsProcessed:0
06/24/2009 02:51:06 PM - SYSTEM STATS: Time:5.2764 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:240 RRDsProcessed:0
06/24/2009 02:52:08 PM - SYSTEM STATS: Time:6.2747 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:237 RRDsProcessed:0
06/24/2009 02:53:07 PM - SYSTEM STATS: Time:5.2945 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:235 RRDsProcessed:0
06/24/2009 02:54:06 PM - SYSTEM STATS: Time:5.2596 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:238 RRDsProcessed:0
06/24/2009 02:55:07 PM - SYSTEM STATS: Time:5.2714 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:239 RRDsProcessed:0
06/24/2009 02:56:07 PM - SYSTEM STATS: Time:5.2655 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:240 RRDsProcessed:0
06/24/2009 02:57:06 PM - SYSTEM STATS: Time:5.2655 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:237 RRDsProcessed:0
06/24/2009 02:58:06 PM - SYSTEM STATS: Time:5.2837 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:235 RRDsProcessed:0
06/24/2009 02:59:07 PM - SYSTEM STATS: Time:5.2629 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:238 RRDsProcessed:0
06/24/2009 03:00:07 PM - SYSTEM STATS: Time:5.4623 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:239 RRDsProcessed:0
06/24/2009 03:01:06 PM - SYSTEM STATS: Time:5.2699 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:240 RRDsProcessed:0
06/24/2009 03:02:06 PM - SYSTEM STATS: Time:5.2642 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:237 RRDsProcessed:0
06/24/2009 03:03:07 PM - SYSTEM STATS: Time:5.2645 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:235 RRDsProcessed:0
06/24/2009 03:04:07 PM - SYSTEM STATS: Time:5.2641 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:238 RRDsProcessed:0
06/24/2009 03:05:06 PM - SYSTEM STATS: Time:5.2660 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:239 RRDsProcessed:0
06/24/2009 03:06:06 PM - SYSTEM STATS: Time:5.2550 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:240 RRDsProcessed:0
06/24/2009 03:07:08 PM - SYSTEM STATS: Time:6.2582 Method:spine Processes:1 Threads:1 Hosts:40 HostsPerProcess:40 DataSources:237 RRDsProcessed:0
06/24/2009 03:07:11 PM - SYSTEM BOOST STATS: Time:2.6430 RRDUpdates:7144
[SOLVED] Boost "On Demand RRD" graph gaps
Attachments: Boost Status (boostStatus.jpg), Tech Support (techSupport.jpg)
Last edited by jimj on Mon Jul 06, 2009 10:14 am, edited 1 time in total.
My problem is related to rrd file permissions
I've narrowed my problem down to permissions. When I create new devices and graph new interfaces, the new rrd files are created with "root" as the owner and "apache" as the group. If I chown the rrd files so that "apache" is the owner, everything works. So I have a workaround, manually chown'ing the rrd files every time I add new data sources, but surely there's a better way than this. Any ideas on what I'm doing wrong?
The rra directory permission is set to:
drwsr-sr-t 2 apache apache 12288 Jun 25 15:21 rra
Here are the permissions on two of my rrd files. The first, owned by "apache apache", works fine, but the second, owned by "root apache", does not.
-rw-r--r-- 1 apache apache 1.2M Jun 25 15:41 localhost_users_24.rrd
-rw-r--r-- 1 root apache 1.2M Jun 25 15:21 ssm_dr01_5min_cpu_191.rrd
How do I get my rrd files to be owned by apache automatically?
In case I wasn't clear in my last post, I think my basic question at this point is "how do I get my rrd files to be owned by apache automatically?"
TIA
SOLVED: How do I get my rrd files to be owned by apache
jimj wrote: In case I wasn't clear in my last post, I think my basic question at this point is "how do I get my rrd files to be owned by apache automatically?"

I've solved my issue. With a default CactiEZ v0.6 install, the poller runs from root's crontab as root, which causes all the RRD files in "/var/www/html/rra" to be owned by root. This is fine until you enable boost; with boost these files must be owned by the "apache" user. To fix this I ran 'crontab -e' and changed this line:
*/1 * * * * php /var/www/html/poller.php > /dev/null 2>&1
to this:
*/1 * * * * sudo -u apache php /var/www/html/poller.php > /dev/null 2>&1
I also ran 'chown apache:apache /var/www/html/rra/*' to fix the existing files.
I know that running 'sudo -u' from root's crontab isn't the cleanest way to solve this problem, but I wanted to stay as close as possible to a default CactiEZ install; changing the location of the cron entries might confuse my co-workers.
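If anyone else hits this, here's a quick sketch for spotting rrd files that were created with the wrong owner before boost silently fails on them. It's a hypothetical helper of my own, assuming the default CactiEZ rra path and the "apache" account; adjust both for your install:

```python
# Report .rrd files not owned by the expected web-server user.
# Hypothetical helper: the path and the "apache" user name are assumptions
# based on a default CactiEZ install.
import os
import pwd


def wrong_owner_rrds(rra_dir, expected_user="apache"):
    """Return (filename, owner) pairs for .rrd files with the wrong owner."""
    bad = []
    for name in sorted(os.listdir(rra_dir)):
        if not name.endswith(".rrd"):
            continue
        path = os.path.join(rra_dir, name)
        owner = pwd.getpwuid(os.stat(path).st_uid).pw_name
        if owner != expected_user:
            bad.append((name, owner))
    return bad


# Example usage (run as root on the Cacti server):
# for name, owner in wrong_owner_rrds("/var/www/html/rra"):
#     print(f"{name} is owned by {owner}; chown it to apache")
```

A `find /var/www/html/rra -user root -name '*.rrd'` one-liner does much the same thing if you'd rather stay in the shell.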
I ended up using just the boost png cache feature. In addition, I am using the old boost 1.8 cvs code, since 2.4 has the same bug that 1.7 inflicted on me.
This plugin certainly seems buggy.
On 2.4, no matter what options I set, I get 0 RRDs processed by my poller, and the poller gets killed after 298 seconds for exceeding the process limit. In short, the poller hangs and the process sits there until it hits the 5-minute cutoff.
On 1.8 cvs the poller works, but if I enable on-demand updating I get the same graph gaps you mentioned. With it turned off there seems to be no noticeable impact, server-load-wise, for me anyway.
No speed gain with boost
Chrysalis wrote: With it (boost) turned off there seems to be no noticeable impact for me anyway server load wise.

Your post got me wondering how much of a speed improvement I'm getting from boost. Much to my surprise, when I disabled the boost server my poller runtime stayed the same! I average 28.5 seconds to graph 890 data sources with or without boost on-demand RRD updating enabled.
Until your post (which got me to check my boost performance gain) I hadn't seen any other negative boost posts. AFAIK boost is set up correctly. My rrd files only get updated every 30 minutes (unless I'm viewing them, of course).
I thought boost was supposed to cut down your polling time by an order of magnitude by eliminating the disk IO every polling cycle. Did I completely misunderstand the point of "Enable On Demand RRD Updating"? Is there any other simple thing I could've missed that would cause this?
Re: [SOLVED] Boost "On Demand RRD" graph gaps
I am back posting as very little feedback about this plugin gets posted.
I updated cacti, which then requires a new boost plugin. The PIA site still showed 2.4 as the latest version, but their documentation site shows a newer version, e.g. boost 4.2.1. I installed that version; the problem I had with 2.4 is gone, and in addition it has better logging.
Boost now logs how long it takes to make the graphs. If I disable on-demand updating, so it makes the graphs every time the data sources run, this is how long it takes:
22 secs to query the data (average), probably with some timeouts from dormant data sources bumping the time up.
0.5 secs to make the graphs.
On the old boost the second figure doesn't get reported. So if I understand right, if I were to enable on-demand updating I would just save that 0.5s on the 5-minute updates, which is not really worth it in my case. It also creates a downside: every 30 minutes (or longer, if configured higher) it will write a large chunk of data to the graphs at once, adding a bigger load spike and risk of error. This plugin is clearly designed for hosts managing tens of thousands of graphs; I have just over 300 and it only takes half a second. Nevertheless I still use it for the png caching feature.
- TheWitness
- Developer
- Posts: 17053
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
Re: [SOLVED] Boost "On Demand RRD" graph gaps
Go to docs.cacti.net for updated versions.
True understanding begins only when we realize how little we truly understand...
Life is an adventure, let yours begin with Cacti!
Author of dozens of Cacti plugins and customizations. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages
For those wondering, I'm still here, but lost in the shadows. Yearning for fewer bugs. Who wants a Cacti 1.3/2.0? Streams anyone?