[HOWTO] Cacti's setup for really BIG environments


wijdan135
Posts: 1
Joined: Fri Dec 10, 2010 6:02 pm

Re: [HOWTO] Cacti's setup for really BIG environments

Post by wijdan135 »

Please explain how to "place full db into ram" :)
Thanks in advance.
BorisL
Cacti User
Posts: 131
Joined: Sat Mar 31, 2007 9:21 am
Contact:

Re: [HOWTO] Cacti's setup for really BIG environments

Post by BorisL »

wijdan135 wrote:Please explain how to "place full db into ram" :)
Thanks in advance.
Make MySQL store its files on a memory-backed disk (RAM disk). See the --datadir option.
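
For readers who want to check that a data directory really sits on a memory-backed filesystem, here is a minimal sketch. The post names only the --datadir option; the use of tmpfs, the mount point /var/lib/mysql-ram and the size shown in the sample are assumptions made for illustration. The function takes the text of /proc/mounts (Linux) and returns the filesystem type of the deepest mount that contains the given path.

```python
import posixpath


def fs_type_for_path(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`.

    `mounts_text` is the content of /proc/mounts: one mount per line with the
    fields device, mount point, filesystem type, options, ...
    """
    path = posixpath.normpath(path)
    best_point, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        point, fstype = fields[1], fields[2]
        # A mount contains the path if it is the path itself or a parent of it.
        prefix = point if point.endswith("/") else point + "/"
        if path == point or path.startswith(prefix):
            # Keep the longest (deepest) matching mount point.
            if len(point) > len(best_point):
                best_point, best_type = point, fstype
    return best_type


# Hypothetical sample; on a real host read /proc/mounts instead.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
tmpfs /var/lib/mysql-ram tmpfs rw,size=15g 0 0
"""

if __name__ == "__main__":
    # Assumed datadir; pass the value of your own --datadir setting.
    print(fs_type_for_path("/var/lib/mysql-ram/cacti", SAMPLE_MOUNTS))
```

Keep in mind that anything stored on a RAM disk is lost on reboot, which is why the backup and restore scripts mentioned in this thread matter.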
tosage
Cacti User
Posts: 164
Joined: Wed Jul 28, 2010 5:05 am
Location: France

Re: [HOWTO] Cacti's setup for really BIG environments

Post by tosage »

@BorisL

I have a question for you, Boris: how do you get RRDsProcessed down to 0? :roll:
Cacti Version - 0.8.8a
Plugin Architecture - 3.1
Poller Type - spine
Server Info - Linux
Web Server - Apache/2.2.22 (Ubuntu)
PHP - 5.3.10-1ubuntu3.6 with Suhosin-Patch (cli)
MySQL - 5.5.29-0ubuntu0.12.04.2
RRDTool - 1.4.7
BorisL
Cacti User
Posts: 131
Joined: Sat Mar 31, 2007 9:21 am
Contact:

Re: [HOWTO] Cacti's setup for really BIG environments

Post by BorisL »

tosage wrote:I have a question for you, Boris: how do you get RRDsProcessed down to 0? :roll:
That's Boost's job. No RRDs are updated during the polling cycle; they are flushed asynchronously instead.
BorisL
Cacti User
Posts: 131
Joined: Sat Mar 31, 2007 9:21 am
Contact:

Re: [HOWTO] Cacti's setup for really BIG environments

Post by BorisL »

Ultimate speed boost, or the best filesystem for an RRD database.

The best FS for a huge number of RRD files is ZFS. If you have enough RAM, you can tune ZFS to store RRD files very efficiently. Rrdtool mmaps a 4k region of an RRD file in order to update it. That piece of the file should be cached and the rest of the file should not. This speeds up updates, because the OS reads the block from RAM, and ZFS is smart enough to schedule the flush to disk efficiently.

The only thing you need to tune is the recordsize parameter: setting it to 4k forces ZFS to treat every RRD file as a set of 4k blocks. Proper tuning of this parameter speeds up the boost flush a lot: 250k RRD files on RAID10 across 18 SAS 15k HDDs are flushed within 50-60 seconds with a warm cache (second and subsequent poller_boost.php runs), consuming ~38G of RAM.

If your RRD files are already on ZFS and recordsize is not 4k, you need to set recordsize and then make a fresh copy of the files using cp or similar. This step is critical, because recordsize applies only to newly written data; existing data is left untouched.

Note that, aside from the boost speedup itself, this tuning also makes graphs render faster, since less time is needed to flush data before drawing a graph.
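
The "copy the files after changing recordsize" step can be sketched roughly as follows. This is only an illustration of the copy-and-rename idea, not the author's procedure: the function names are hypothetical, it assumes the poller and boost are stopped while it runs, and it assumes it is run as the user that owns the RRD files (a copy does not preserve ownership).

```python
import os
import shutil
import tempfile


def rewrite_file(path):
    """Rewrite `path` by copying it to a temp file in the same directory
    and renaming the copy over the original."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    os.close(fd)
    try:
        # The copy is newly written data, so it uses the current recordsize.
        shutil.copy2(path, tmp_path)
        # Rename within the same filesystem replaces the old file atomically.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def rewrite_rrd_tree(root):
    """Rewrite every *.rrd file under `root`; return how many were rewritten."""
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".rrd"):
                rewrite_file(os.path.join(dirpath, name))
                count += 1
    return count
```

Calling rewrite_rrd_tree on the RRA directory (for example a hypothetical /var/lib/cacti/rra) would then rewrite each file once. A plain cp to a new directory, as the post suggests, achieves the same thing.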
c3226026
Cacti User
Posts: 87
Joined: Mon Jan 17, 2011 12:15 pm

Re: [HOWTO] Cacti's setup for really BIG environments

Post by c3226026 »

Hello BorisL,

Could you give us a "little" howto for your new optimized Cacti configuration (conf files ...) with ZFS? I'm very interested, and I think many other people are too.

Thanks in advance for your reply.

Best regards
BorisL
Cacti User
Posts: 131
Joined: Sat Mar 31, 2007 9:21 am
Contact:

Re: [HOWTO] Cacti's setup for really BIG environments

Post by BorisL »

Almost all of it (except the ZFS part) is at the beginning of the topic :P (it seems like a miracle, though). The concept still works perfectly: the only thing that has changed is the size of the RAM disk for the DB (15 GB). Everything else is still as it was two-plus years ago.

Ah, there are two more patches to be applied against 0.8.7g:
#1973
#1975

And stay tuned for the boost plugin: boost v5.0 is coming soon, and it is optimized for huge poller_output_boost tables:

RRDUpdates:24784489 TotalTime:1172 get_records:408.91 results_cycle:430.01 rrd_filename_and_template:120.96 rrd_lastupdate:19.68 rrdupdate:120.02 delete:297.52 timer_overhead:~12
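
To read a stats line like the one above, a small parser helps. The sketch below is not part of boost; the function names are made up for illustration, and it assumes that TotalTime and the phase timers are in seconds. Note that the listed phase timers add up to more than TotalTime here (about 1397 against 1172), so they presumably overlap; the shares are therefore computed relative to TotalTime and need not sum to 100%.

```python
def parse_stats(line):
    """Split a 'Key:value Key:value ...' stats line into a dict of floats."""
    stats = {}
    for token in line.split():
        key, _, value = token.partition(":")
        # timer_overhead is logged as an approximation, e.g. '~12'.
        stats[key] = float(value.lstrip("~"))
    return stats


def boost_summary(stats):
    """Return (RRD updates per unit of TotalTime, share of TotalTime per phase)."""
    total = stats["TotalTime"]
    rate = stats["RRDUpdates"] / total
    shares = {
        key: value / total
        for key, value in stats.items()
        if key not in ("RRDUpdates", "TotalTime")
    }
    return rate, shares


# The line quoted in the post above.
BOOST_LINE = (
    "RRDUpdates:24784489 TotalTime:1172 get_records:408.91 "
    "results_cycle:430.01 rrd_filename_and_template:120.96 "
    "rrd_lastupdate:19.68 rrdupdate:120.02 delete:297.52 timer_overhead:~12"
)

if __name__ == "__main__":
    rate, shares = boost_summary(parse_stats(BOOST_LINE))
    print(f"updates per second (assuming seconds): {rate:.0f}")
    for phase, share in sorted(shares.items(), key=lambda item: -item[1]):
        print(f"{phase:28s} {share:6.1%}")
```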
c3226026
Cacti User
Posts: 87
Joined: Mon Jan 17, 2011 12:15 pm

Re: [HOWTO] Cacti's setup for really BIG environments

Post by c3226026 »

OK, thanks BorisL. I've seen the beginning of this topic (many thanks for your work); my question was only about the ZFS add-on part.

Do you know when boost 5.0 will be available?

Thanks
BorisL
Cacti User
Posts: 131
Joined: Sat Mar 31, 2007 9:21 am
Contact:

Re: [HOWTO] Cacti's setup for really BIG environments

Post by BorisL »

The ZFS part is straightforward: use ZFS with atime=off and recordsize=4k for RRD storage on a server with lots and lots of RAM. That's all. I didn't tune anything else.

They say boost v5.0 will be committed to trunk any day now. I completed it a week ago.
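
As a concrete illustration of the atime=off and recordsize=4k settings mentioned above, the sketch below only builds and prints the two `zfs set` commands; it does not run them. The dataset name tank/cacti-rra is a hypothetical placeholder, so substitute the dataset that actually holds your RRD files.

```python
import shlex


def zfs_tuning_commands(dataset):
    """Return the `zfs set` commands for the two properties named in the post."""
    return [
        ["zfs", "set", "atime=off", dataset],
        ["zfs", "set", "recordsize=4k", dataset],
    ]


if __name__ == "__main__":
    # Hypothetical dataset name, used for illustration only.
    for command in zfs_tuning_commands("tank/cacti-rra"):
        print(shlex.join(command))
```

Printing instead of executing keeps the example safe to run anywhere; review the output before running it as root on a real pool.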
c3226026
Cacti User
Posts: 87
Joined: Mon Jan 17, 2011 12:15 pm

Re: [HOWTO] Cacti's setup for really BIG environments

Post by c3226026 »

Thanks for the news. I'm waiting for the new trunk version (not available yet).
BorisL
Cacti User
Posts: 131
Joined: Sat Mar 31, 2007 9:21 am
Contact:

Re: [HOWTO] Cacti's setup for really BIG environments

Post by BorisL »

Just a status update.
Switched to low-delay RRD flushing:
RRDUpdates:4279824 TotalTime:379 get_records:78.09 results_cycle:238.55 rrd_filename_and_template:130.58 rrd_lastupdate:19.09 rrdupdate:29.48 delete:53.13 timer_overhead:~14
(once every 30 minutes, 4M record limit)
SYSTEM STATS: Time:72.0522 Method:spine Processes:1 Threads:16 Hosts:2089 HostsPerProcess:2089 DataSources:1103960
BorisL
Cacti User
Posts: 131
Joined: Sat Mar 31, 2007 9:21 am
Contact:

Re: [HOWTO] Cacti's setup for really BIG environments

Post by BorisL »

Updated the backup and restore scripts: they now use XtraBackup, which needs less lock time for a backup (it locks only while backing up MyISAM tables). Moreover, startup is now super fast, since it just unpacks a full-data tarball before the MySQL daemon starts.
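
The scripts themselves are not shown in this post, so the following is only a rough sketch of the "unpack a full-data tarball before the daemon starts" idea. The function name, the refusal to restore into a non-empty directory and all paths are assumptions for illustration. It should be run as the user that owns the data directory, with a tarball you trust.

```python
import os
import tarfile


def restore_datadir(tarball_path, datadir):
    """Unpack a full-data tarball into an empty data directory.

    Refuses to run if `datadir` already contains files, to avoid mixing a
    restored copy with live data. Returns the sorted top-level entries.
    """
    os.makedirs(datadir, exist_ok=True)
    if os.listdir(datadir):
        raise RuntimeError(f"{datadir} is not empty; refusing to restore over it")
    with tarfile.open(tarball_path, "r:*") as archive:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: use the safer 'data' extraction filter.
            archive.extractall(datadir, filter="data")
        else:
            archive.extractall(datadir)
    return sorted(os.listdir(datadir))
```

In a startup script this would be called with the latest backup tarball and the RAM disk data directory before MySQL is launched.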
tosage
Cacti User
Posts: 164
Joined: Wed Jul 28, 2010 5:05 am
Location: France

Re: [HOWTO] Cacti's setup for really BIG environments

Post by tosage »

Hello all, I have questions about one of my environments.

I have Cacti 0.8.7e from the Ubuntu repository, with Cacti Spine 0.8.7e from the repository too.
My system is Ubuntu Server 10.04.1 x86.
It isn't a huge environment compared to Wistof's :), but here it is:

08/28/2011 05:08:02 PM - SYSTEM STATS: Time:180.3788 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35765

This Cacti is hosted on a Dell PE R710 (dual E5620 processors + 8 GB DDR3) with RAID1 (2 x 146 GB 15k for the system) and RAID10 (4 x 300 GB 15k for MySQL and RRD).

So I have a problem: I can't understand why my installation has a poller time that varies so much :(

08/28/2011 05:08:02 PM - SYSTEM STATS: Time:180.3788 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35765
08/28/2011 05:04:08 PM - SYSTEM STATS: Time:247.0113 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35758
08/28/2011 04:57:54 PM - SYSTEM STATS: Time:172.6921 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35693
08/28/2011 04:52:56 PM - SYSTEM STATS: Time:175.2318 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35764
08/28/2011 04:47:55 PM - SYSTEM STATS: Time:173.4228 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35778
08/28/2011 04:42:56 PM - SYSTEM STATS: Time:173.8927 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35743
08/28/2011 04:37:56 PM - SYSTEM STATS: Time:175.2416 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35813
08/28/2011 04:34:10 PM - SYSTEM STATS: Time:247.9481 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35769
08/28/2011 04:28:01 PM - SYSTEM STATS: Time:180.3819 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35783
08/28/2011 04:23:24 PM - SYSTEM STATS: Time:203.1443 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35796
08/28/2011 04:19:47 PM - SYSTEM STATS: Time:286.1307 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35784
08/28/2011 04:15:35 PM - SYSTEM STATS: Time:334.2582 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35749
08/28/2011 04:11:27 PM - SYSTEM STATS: Time:385.0562 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35813
08/28/2011 04:06:35 PM - SYSTEM STATS: Time:393.3170 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35765
08/28/2011 03:57:55 PM - SYSTEM STATS: Time:173.1767 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35719
08/28/2011 03:52:55 PM - SYSTEM STATS: Time:174.0312 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35774
08/28/2011 03:47:58 PM - SYSTEM STATS: Time:176.5876 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35743
08/28/2011 03:43:03 PM - SYSTEM STATS: Time:182.5855 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35729
08/28/2011 03:37:58 PM - SYSTEM STATS: Time:177.5585 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35711
08/28/2011 03:34:19 PM - SYSTEM STATS: Time:257.9840 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35786
08/28/2011 03:28:13 PM - SYSTEM STATS: Time:192.3354 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35780
08/28/2011 03:23:02 PM - SYSTEM STATS: Time:180.1497 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35796 
When the poller time increases, I see an increase in CPU usage but not in disk I/O.
I have appli

If you have any suggestions, thank you in advance.
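
To quantify how much the poller time in a log like the one above varies, a short script can pull the Time values out of the SYSTEM STATS lines. This is only a sketch: the function names are made up, the sample is a subset of the lines posted above, and the 300-second polling interval is an assumption inferred from the roughly five-minute spacing of the timestamps.

```python
import re
import statistics

STATS_RE = re.compile(r"SYSTEM STATS: Time:(?P<time>[\d.]+)")


def poller_times(log_text):
    """Extract every poller run time from SYSTEM STATS log lines."""
    return [float(match.group("time")) for match in STATS_RE.finditer(log_text)]


def poller_summary(times, interval=300.0):
    """Summarize run times; `interval` is the assumed polling interval."""
    return {
        "runs": len(times),
        "min": min(times),
        "median": statistics.median(times),
        "max": max(times),
        # Runs that took longer than one polling interval.
        "overruns": [t for t in times if t > interval],
    }


# A few of the lines from the post above.
SAMPLE_LOG = """\
08/28/2011 05:08:02 PM - SYSTEM STATS: Time:180.3788 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35765
08/28/2011 04:15:35 PM - SYSTEM STATS: Time:334.2582 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35749
08/28/2011 04:11:27 PM - SYSTEM STATS: Time:385.0562 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35813
08/28/2011 04:06:35 PM - SYSTEM STATS: Time:393.3170 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35765
08/28/2011 03:57:55 PM - SYSTEM STATS: Time:173.1767 Method:spine Processes:2 Threads:10 Hosts:3420 HostsPerProcess:1710 DataSources:63241 RRDsProcessed:35719
"""

if __name__ == "__main__":
    print(poller_summary(poller_times(SAMPLE_LOG)))
```

Runs listed under "overruns" took longer than the assumed interval, which is usually the first thing to look at when correlating slow polls with CPU or MySQL activity.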
BorisL
Cacti User
Posts: 131
Joined: Sat Mar 31, 2007 9:21 am
Contact:

Re: [HOWTO] Cacti's setup for really BIG environments

Post by BorisL »

tosage wrote:If you have any suggestions, thank you in advance.
I have a bunch of suggestions.
Since you are posting in this thread, you are strongly advised to apply the techniques described earlier in it before asking questions :D.
Next, use atop to investigate which process is taking too much CPU.
tosage
Cacti User
Posts: 164
Joined: Wed Jul 28, 2010 5:05 am
Location: France

Re: [HOWTO] Cacti's setup for really BIG environments

Post by tosage »

Thanks for your answer, Boris.
I have applied all the advice you give in this topic, except for the Boost plugin.

I applied the different patches you provided in the topic, and with your enhancements I gained more than 80 seconds of polling time.

When my polling time increases, I see mysqld consuming between 15 and 30% of the CPUs.
Maybe a good restart of the server would be a solution :lol: (uptime > 320 days)

Now I think my last enhancement will be to set up the Boost plugin on my installation?

Thanks for your time :roll: