Boost Plugin v1.6 Released

gninja
Cacti User
Posts: 371
Joined: Tue Aug 24, 2004 5:02 pm
Location: San Francisco, CA

Post by gninja »

Running this under Linux (boost is running as user nobody, in single process mode).

I let it run overnight, and this morning I saw the "boost server is down" error message, as well as errors updating RRDs (the "minimum one second step" error).

The RRD update error was only happening on some hosts, but it affected EVERY RRD for those hosts.

Telnetting to the boost server port and requesting status gave an OK message.

Restarting the boost server solved the problem for now, but the RRDs that were not updated now have gaps.

Any way to keep this from happening again?
FreeBSD/RHEL
cacti-0.8.7i, spine 0.8.7i, PIA 3.1+boost 5.1
MySQL 5.5/InnoDB
RRDtool 1.2.27, PHP 5.1.6
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

Why do you run boost as nobody? It needs update access to the RRD files! How is your boost setup configured?
Reinhard

Post by gninja »

nobody is the owner of the rrd files and the png cache.

Post by gninja »

Boost config:

table type: memory
max table size: 512mb
update frequency: 1hr
server: enabled
single process
rrdtool binary: rrdtool (status page says this. config page says rrdupdate)
update timeout: 2 seconds
image caching: enabled.

Post by gninja »

I thought I had it working, so I installed it on a new host. There, the boost server is never getting called (it's running and status returns OK). Running the rrdupdate manually, I get a bunch of OKs, and then suddenly nothing but:

Code:

ERROR: illegal attempt to update using time 1167430510 when last update time is 1167433208 (minimum one second step)
ERROR: illegal attempt to update using time 1167430510 when last update time is 1167433208 (minimum one second step)
ERROR: illegal attempt to update using time 1167430510 when last update time is 1167433208 (minimum one second step)
ERROR: illegal attempt to update using time 1167430510 when last update time is 1167433208 (minimum one second step)

Post by gninja »

Also seeing:

Code:

12/30/2006 12:47:32 AM - BOOST SVR: Poller[0] ERROR: Detected Poller Boost Abend, Contact support

Post by gninja »

Ooooh. Ok, the illegal update messages that I'm seeing are on graphs that are (or have been) viewed. It looks like the updates when viewing the graph don't remove the entries from the database tables. I'll see about taking a look at the code later.
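
Editor's note: the behaviour described above can be illustrated with a minimal sketch. RRDtool rejects any update whose timestamp is not later than the file's last update, so samples that stay in the cache table after a graph view has already written newer data would be rejected when they are replayed. The function and variable names below are hypothetical and are not taken from the boost source; the timestamps come from the error output earlier in the thread, and the values are made up.

```python
def split_stale(samples, last_update):
    """Separate cached samples RRDtool would accept from ones it would reject.

    samples: list of (timestamp, value) tuples still waiting in the cache.
    last_update: timestamp of the most recent update already in the RRD file.
    """
    # RRDtool requires each update to be strictly newer than the last one.
    fresh = [(ts, v) for ts, v in samples if ts > last_update]
    stale = [(ts, v) for ts, v in samples if ts <= last_update]
    return fresh, stale


# Timestamps from the error output in this thread; the values are invented.
last_update = 1167433208
cached = [(1167430510, 42.0), (1167433208, 43.0), (1167433508, 44.0)]

fresh, stale = split_stale(cached, last_update)
print("would be accepted:", fresh)
print("would be rejected:", stale)
```

If the stale rows are never deleted, every later run would hit the same rejection again, which may be why the same error line repeats.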
ut0mt8
Posts: 4
Joined: Thu Jan 04, 2007 12:39 pm

Impressive !!

Post by ut0mt8 »

Wow! Thanks a lot for this plugin.
It saved my graph server from imminent death.

Before:

01/04/2007 06:23:47 PM - SYSTEM STATS: Time:224.9058 Method:cmd.php Processes:8 Threads:N/A Hosts:166 HostsPerProcess:21 DataSources:10421 RRDsProcessed:5838

After (with poller_output in RAM):

01/04/2007 06:26:01 PM - SYSTEM STATS: Time:59.5211 Method:cmd.php Processes:8 Threads:N/A Hosts:166 HostsPerProcess:21 DataSources:10421 RRDsProcessed:0
01/04/2007 06:34:03 PM - SYSTEM BOOST STATS: Time:182.0450 RRDUpdates:20710

I run Cacti 0.8.6i on Debian etch with the boost server enabled in single process mode.
Xeon 3 GHz, 512 MB

What an improvement, and what an excellent idea!
I had clearly identified that my performance problem was in updating the RRAs on disk, but I hadn't found any solution (other than perhaps buying faster disks and building a RAID 0).

Definitely cool :)
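
Editor's note: the improvement in the log lines above can be quantified with a short sketch. It parses the `Key:Value` pairs from the three quoted log lines and computes how much shorter the poller run became and how many RRD updates per second the boost run wrote. The helper name `parse_stats` is hypothetical; only the log lines themselves come from the post.

```python
import re


def parse_stats(line):
    """Return the key:value pairs that follow 'STATS: ' as a dict of strings."""
    _, _, tail = line.partition("STATS: ")
    return dict(re.findall(r"(\w+):(\S+)", tail))


before = parse_stats(
    "01/04/2007 06:23:47 PM - SYSTEM STATS: Time:224.9058 Method:cmd.php "
    "Processes:8 Threads:N/A Hosts:166 HostsPerProcess:21 "
    "DataSources:10421 RRDsProcessed:5838"
)
after = parse_stats(
    "01/04/2007 06:26:01 PM - SYSTEM STATS: Time:59.5211 Method:cmd.php "
    "Processes:8 Threads:N/A Hosts:166 HostsPerProcess:21 "
    "DataSources:10421 RRDsProcessed:0"
)
boost = parse_stats(
    "01/04/2007 06:34:03 PM - SYSTEM BOOST STATS: Time:182.0450 RRDUpdates:20710"
)

# Ratio of poller run times, and throughput of the separate boost run.
speedup = float(before["Time"]) / float(after["Time"])
update_rate = int(boost["RRDUpdates"]) / float(boost["Time"])

print(f"poller run is {speedup:.2f}x shorter with boost")
print(f"boost run wrote {update_rate:.1f} RRD updates per second")
```

The poller run is roughly 3.8 times shorter because it no longer writes RRD files itself (`RRDsProcessed:0`); the writes still happen, but in the separate boost run, so the two timings are not a like-for-like comparison of total work.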

Problem

Post by ut0mt8 »

After one day of working well, the boost plugin failed.
In fact my poller_output_boost table was full (it was in RAM).
So I investigated and found that some entries in this table were never flushed.

I suspect the code in boost_process_poller_output is faulty where it checks the return value of boost_rrdtool_function_update. I'm not sure why; perhaps it is a faulty config.

Anyway, I think it would be preferable to flush the table at the end of the boost process. A warning message should indicate the number of rows deleted.

I am currently investigating and will try to propose a patch soon.
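
Editor's note: the proposal above (flush the table at the end of the run and warn about what was dropped) can be sketched as follows. This is not the plugin's PHP code: it uses an in-memory SQLite table named `cache` as a stand-in for poller_output_boost, and the column names, the `process_cache` function, and the `fake_update` callback are all hypothetical.

```python
import sqlite3


def process_cache(conn, update_rrd):
    """Replay cached rows, then delete every row that was attempted.

    update_rrd is a callable returning True on success. Rows are removed
    whether or not the update succeeded; failures are only counted.
    """
    rows = conn.execute(
        "SELECT id, local_data_id, ts, output FROM cache ORDER BY ts"
    ).fetchall()
    failed = 0
    for _row_id, data_id, ts, output in rows:
        if not update_rrd(data_id, ts, output):
            failed += 1
    # Unconditional flush: nothing is left behind to fill the table.
    conn.executemany("DELETE FROM cache WHERE id = ?", [(r[0],) for r in rows])
    conn.commit()
    return len(rows), failed


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cache (id INTEGER PRIMARY KEY, local_data_id INTEGER, "
    "ts INTEGER, output TEXT)"
)
conn.executemany(
    "INSERT INTO cache (local_data_id, ts, output) VALUES (?, ?, ?)",
    [(1, 1167430510, "10"), (2, 1167430510, "20"), (1, 1167430810, "11")],
)


def fake_update(data_id, ts, output):
    """Pretend that updates for data source 2 always fail."""
    return data_id != 2


processed, failed = process_cache(conn, fake_update)
remaining = conn.execute("SELECT COUNT(*) FROM cache").fetchone()[0]
print(f"WARNING: flushed {processed} rows, {failed} of them failed to update")
```

The trade-off is that samples whose update failed are lost rather than retried, in exchange for a table that cannot fill up with rows that will never succeed.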
TheWitness
Developer
Posts: 17047
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

I have seen/heard of this issue from others. It would appear that RRDtool is having difficulties on some platforms and crashing before all the updates are complete.

Therefore, I must advise that everyone be very cautious with this plugin. At least make sure that you can run the poller.php from the command line with boost disabled and get a clean run of nothing but "OK: ..." before you migrate to the plugin.

TheWitness
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages


For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?
ut0mt8
Posts: 4
Joined: Thu Jan 04, 2007 12:39 pm

response and another problem

Post by ut0mt8 »

For my problem, as far as I understand the code, the return code of the RRD update process must be OK; if not, the entry is never removed?
I think this case may happen, especially in my setup, which is rather complex.
Running the poller without boost results in many errors, but that is "normal", since some SNMP results were truncated, some devices were not responding, etc.
My network is quite large and I have no time to fix all the problems on my devices.
So I think a little periodic cleanup of the cache table may help me.
Any idea how to implement this?
For now I have just commented out all the lines with ok_to_delete = FALSE in setup.php.
Very ugly, but it seems to work.

I have another problem with boost_server.php, which sometimes hangs, eating all my CPU. A quick strace of the process indicates a timeout on select?!
So I have another question:
Assuming I run the boost server on the same host as my Apache web server, would it be simple to bypass the boost_server and call the proper function directly in graph.php? The boost server is designed for updating graphs on demand in a multi-server environment, am I right?
For a simple setup (single host), I think it may be more efficient to call the proper function directly in PHP.

Still, I repeat that this is a great plugin! I will try to fix these problems and share the results.

Post by TheWitness »

So long as the apache user has write access to the RRD files, you may disable the boost_server.

It is apparent that I needed to perform additional testing. I think you are correct that we should remove the records despite the return code if we can assume that you have resolved permissions issues first.

It may be a few weeks before I can revisit this issue however.

TheWitness
msw1970
Cacti User
Posts: 206
Joined: Tue Jan 09, 2007 8:28 am
Location: London, UK

Post by msw1970 »

Hi

I've followed the instructions for running the boost server as a service on RHEL4. It appears to add the service ok because if I run chkconfig --list cacti_rrdsvc it returns as being on in runlevels 2 to 5. However when I try to start the service with service cacti_rrdsvc start command I get the following back

env: /etc/init.d/cacti_rrdsvc: No such file or directory.

Any ideas??

Post by gandalf »

Do all the files that should be executable have the execute bit set (chmod ...+x)?
Reinhard
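
Editor's note: when a script exists but starting it reports "No such file or directory", one common cause is an interpreter (`#!`) line that cannot be resolved, for example because the file was saved with Windows line endings; a missing execute bit more typically produces "Permission denied". The sketch below checks for these conditions. It is a generic diagnostic, not part of the plugin: the function name `diagnose_script` is hypothetical, and the demo runs against a throwaway temporary file rather than the real init script.

```python
import os
import tempfile


def diagnose_script(path):
    """Return a list of likely reasons why `path` cannot be executed."""
    if not os.path.isfile(path):
        return ["file does not exist"]
    problems = []
    if not os.access(path, os.X_OK):
        problems.append("execute bit is not set")
    with open(path, "rb") as fh:
        first_line = fh.readline()
    if not first_line.startswith(b"#!"):
        problems.append("no interpreter (#!) line")
        return problems
    if first_line.endswith(b"\r\n"):
        # The kernel would look for an interpreter whose name ends in "\r".
        problems.append("interpreter line ends with CRLF")
    parts = first_line[2:].split()
    if not parts:
        problems.append("interpreter line is empty")
    else:
        interpreter = parts[0].decode(errors="replace")
        if not os.path.exists(interpreter):
            problems.append("interpreter not found: " + interpreter)
    return problems


# Demo on a throwaway file with Windows line endings and no execute bit.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "cacti_rrdsvc")
with open(demo_path, "wb") as fh:
    fh.write(b"#!/bin/sh\r\necho started\r\n")
os.chmod(demo_path, 0o644)

for problem in diagnose_script(demo_path):
    print(problem)
```

Running `diagnose_script("/etc/init.d/cacti_rrdsvc")` on the affected host would show which of these conditions, if any, applies there.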
Mikkel
Posts: 32
Joined: Thu May 12, 2005 12:41 am

Post by Mikkel »

I'd like to try this Boost plugin, but I need plugin architecture 1.1 in order to do that. However, I can't find it anywhere. Where did you guys get it?