[HOWTO] Cacti's setup for really BIG environments

BorisL
Cacti User
Posts: 131
Joined: Sat Mar 31, 2007 9:21 am

Post by BorisL »

I am going to build a dedicated Cacti server for a rather big environment.

Target:
  • 600+ hosts, 70 000+ data sources, 300 000+ data items
  • one week of 5-minute statistics in the RRA
  • 5-minute poller interval
Hardware:
My configuration is:
  • 2 x quad-core Intel Xeon 5440
  • 16 GB RAM
  • 4 x 147 GB SAS 15k (RAID10 or RAID6)
After the full tuning process, CPU speed becomes the bottleneck. Since RRD updates are going to be asynchronous to polling, less expensive storage can be used. I use ZFS raidz2 built on top of the four disks.

Software:
MySQL
0) Use a dedicated MySQL instance for this single database. I run the MySQL daemon on the same host as Cacti.
1) Migrate to InnoDB to be able to use row locks.
2) Create indexes: Cacti's default schema is lacking indexes. Apply those mentioned in this tweak.
3) Place the full DB into RAM, that is, on a memory disk. This is possible because the DB holds only configuration, which is roughly constant, and polled values, which are volatile. It gives a considerable boost to both the web interface and polling. A 2-3 GB memory disk is enough for 300k data source items.
For FreeBSD the recipe is to add the following line to fstab:

Code: Select all

md	/base	mfs	rw,-s3g,-m0,noatime	0	0
You'll have to set up two simple scripts for backup and restore. The backup script requires XtraBackup:
  • backup

    Code: Select all

    #!/bin/sh
    PATH="/usr/local/bin:/sbin:/usr/sbin:/usr/local/sbin:/usr/bin:/bin"
    
    cdate="$(date -j '+%H-%M')"
    folder_date="$(date -j '+%Y-%m-%d')"
    
    backupdir="/opt/backup/cacti/$folder_date"
    [ -d "$backupdir" ] || mkdir -p "$backupdir"
    latest_dump_filename="$backupdir/../db-latest.tbz"
    backup_filename="$backupdir/db-$cdate.tar"
    backup_memory_root="/backup-base"
    backup_memory_dir="$backup_memory_root/mysql"
    # exit as soon as a command fails (for a pipeline, only its last command counts)
    set -e
    date
    
    # stage 1. backup DB: copy the datadir, replay the InnoDB log, compress, clean up
    [ -d "$backup_memory_dir" ] && rm -rf "$backup_memory_dir"
    /usr/local/bin/innobackupex-1.5.1 --no-timestamp "$backup_memory_dir"
    /usr/local/bin/innobackupex-1.5.1 --apply-log "$backup_memory_dir"
    tar -C "$backup_memory_root" -cf - . | /usr/local/bin/7z a "$backup_filename.bz2" -si -tbzip2 -mmt=5
    rm -rf "$backup_memory_dir"
    
    # stage 1a. make a latest-symlink for a fresh-generated tarball
    [ -h "$latest_dump_filename" ] && rm "$latest_dump_filename"
    ln -s "$backup_filename.bz2" "$latest_dump_filename"
    
  • restore

    Code: Select all

    #!/bin/sh
    
    PATH="/usr/local/bin:/sbin:/usr/sbin:/usr/local/sbin:/usr/bin:/bin"
    
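    # must be the directory that holds db-latest.tbz; note that the backup script writes under /opt/backup/cacti
    backupdir=/var/backup/cacti
    latest_dump_filename="$backupdir/db-latest.tbz"
    
    echo "Restoring last MySQL dump"
    /usr/local/bin/7z x -so "$latest_dump_filename" | tar -C /base -xpf -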
    
Make your rc.d (init.d) script
  • restore the latest dump just before the MySQL server starts during boot
  • back up the DB just before the MySQL daemon stops
For FreeBSD this can be done with an extra rc.d script:

Code: Select all

>cat /usr/local/etc/rc.d/cacti_mysql
#!/bin/sh
#

# PROVIDE: cacti_mysql
# BEFORE: mysql

. /etc/rc.subr

name="cacti_mysql"
rcvar=`set_rcvar`

load_rc_config $name

: ${cacti_mysql_enable="NO"}

command="/path/to/cacti-mysql-unpack.sh"
command_args=""

run_rc_command "$1"
and some extra settings in rc.conf:

Code: Select all

cacti_mysql_enable="YES"
[ "X$_name" = "Xmysql" ] && {
        stop_precmd="sh /path/to/backup-cacti.sh"
}

#wait till cacti's mysql is dumped from memory to disk
rcshutdown_timeout="120"
4) Tune MySQLd to something like this:

Code: Select all

[mysqld]
skip-locking
key_buffer = 512M
query_cache_size = 128M
max_allowed_packet = 16M
table_cache = 512
sort_buffer_size = 128M
net_buffer_length = 8K
read_buffer_size = 1M
read_rnd_buffer_size = 32M
myisam_sort_buffer_size = 8M
max_heap_table_size = 4G
tmp_table_size = 1G
log_slow_queries
long_query_time = 2
log_long_format
innodb_buffer_pool_size = 256M
If you use binary logs, make sure that the poller_output* tables are ignored and that the binlogs are not stored on the memory disk.
5) Convert poller_output to ENGINE=MEMORY (its text fields have to be converted to varchar(32...255), because MEMORY tables cannot hold TEXT columns) and poller_output_boost to ENGINE=MyISAM ROW_FORMAT=FIXED. ROW_FORMAT has to be set in the CREATE TABLE statement of an SQL dump, so you will have to dump the table, edit the dump and restore it. Make sure you have converted poller_output_boost.output to varchar(32...255), or MySQL will silently ignore ROW_FORMAT.
6) Put the backup script into cron with a 2-3 hour interval.

Cacti & Spine
  • turn off all max_execution_time limits and increase memory_limit to 1G or so in all of Cacti's scripts (`grep -R' will help)
  • install spine in place of cmd.php and let it use 1.2...1.4 x $no_cpus threads. In my case it is 10 threads.
  • install the plugin architecture
  • install the boost plugin. A '1 hour' boost update interval is a good starting point.
  • apply patches patch1, patch2
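The thread-count rule of thumb from the list above (1.2 to 1.4 times the number of CPU cores) can be sketched as a small shell helper. This is only an illustration: `spine_threads` is a made-up name, not something shipped with Cacti or spine, and the core count is passed in by hand (on FreeBSD it could come from `sysctl -n hw.ncpu`).

```shell
#!/bin/sh
# Hypothetical helper (not part of Cacti or spine): suggest a spine thread
# count range of 1.2x to 1.4x the number of CPU cores.
spine_threads() {
    ncpu="$1"
    # integer arithmetic: (n * 12 + 5) / 10 rounds n * 1.2 to the nearest integer
    low=$(( (ncpu * 12 + 5) / 10 ))
    high=$(( (ncpu * 14 + 5) / 10 ))
    echo "$low-$high"
}

# Example: 8 cores (2 x quad-core), the hardware described in this post
spine_threads 8
```

For 8 cores this suggests 10 to 11 threads, which agrees with the 10 threads used here. Treat the result as a starting point and compare polling times before and after changing it.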
Results:

Code: Select all

11/03/2008 06:36:17 PM - SYSTEM STATS: Time:76.5106 Method:spine Processes:1 Threads:10 Hosts:535 HostsPerProcess:535 DataSources:222959 RRDsProcessed:0
11/03/2008 06:31:14 PM - SYSTEM STATS: Time:73.7684 Method:spine Processes:1 Threads:10 Hosts:535 HostsPerProcess:535 DataSources:222959 RRDsProcessed:0
11/03/2008 06:26:05 PM - SYSTEM STATS: Time:64.7133 Method:spine Processes:1 Threads:10 Hosts:535 HostsPerProcess:535 DataSources:222959 RRDsProcessed:0
11/03/2008 06:21:52 PM - SYSTEM BOOST STATS: Time:1229.8073 RRDUpdates:2673120
11/03/2008 06:21:13 PM - SYSTEM STATS: Time:69.5011 Method:spine Processes:1 Threads:10 Hosts:535 HostsPerProcess:535 DataSources:222959 RRDsProcessed:0
11/03/2008 06:16:15 PM - SYSTEM STATS: Time:72.1974 Method:spine Processes:1 Threads:10 Hosts:535 HostsPerProcess:535 DataSources:222959 RRDsProcessed:0
Mean memory usage is about 9 GB, peak memory usage is 14-15 GB.
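To follow polling times like the ones in the log above across many cycles, the `Time:` field can be averaged with awk. A minimal sketch, assuming the log keeps the `SYSTEM STATS: Time:<seconds>` layout shown above; `avg_poll_time` is a hypothetical name and the log path in the usage comment is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: average the Time: field of "SYSTEM STATS" lines read
# from stdin. "SYSTEM BOOST STATS" lines are skipped because they do not
# contain the literal text "SYSTEM STATS: Time:".
avg_poll_time() {
    awk '
        /SYSTEM STATS: Time:/ {
            t = $0
            sub(/.*SYSTEM STATS: Time:/, "", t)   # drop everything up to the number
            sub(/ .*/, "", t)                     # drop everything after the number
            sum += t; n++
        }
        END { if (n) printf "%.2f\n", sum / n }
    '
}

# Usage (the path is an assumption, adjust it to your install):
#   avg_poll_time < /path/to/cacti/log/cacti.log
```

Nothing is printed when the input has no matching lines, so an empty result means the log format differs from the one assumed here.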
Last edited by BorisL on Sun Jul 17, 2011 6:40 am, edited 13 times in total.
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

Wow, I'm impressed.
A few words on rrdtool? I would assume you're using rrdtool 1.2.23 or later and a fadvise-capable kernel?
Thanks for posting
Reinhard

Post by BorisL »

>rrdtool
RRDtool 1.2.26 Copyright 1997-2007 by Tobias Oetiker <tobi@oetiker.ch>
Compiled Oct 8 2008 16:05:30

>uname -a
FreeBSD blah-blah 7.1-PRERELEASE FreeBSD 7.1-PRERELEASE #3: Thu Oct 9 15:30:19 MSD 2008 blah-blah:/usr/obj/usr/src/sys/SMP7 amd64

Post by BorisL »

My version of the boost plugin used to be posted here. Those changes are in mainline boost now.
Last edited by BorisL on Fri Nov 26, 2010 6:20 am, edited 2 times in total.

Post by BorisL »

I found that poller_output_boost performs very badly when it is an InnoDB table and output is a TEXT column: the poller spent up to 70 seconds (!!) copying data from poller_output to poller_output_boost.

Proper fix:
Convert poller_output_boost to ENGINE=MyISAM ROW_FORMAT=FIXED.
ROW_FORMAT has to be set in the CREATE TABLE statement of an SQL dump, so you will have to dump the table, edit the dump and restore it.
Make sure you have converted poller_output_boost.output to varchar(32...255), or MySQL will silently ignore ROW_FORMAT.
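The dump-editing step described above can be scripted with sed instead of done by hand. This is a sketch, not the exact procedure used here: `fix_boost_ddl` is a hypothetical name, varchar(255) is one choice from the 32...255 range, and it assumes the usual mysqldump layout (one column per line, statement ending in `;`) with no ROW_FORMAT already present.

```shell
#!/bin/sh
# Hypothetical filter: rewrite only the CREATE TABLE statement of
# poller_output_boost in a mysqldump stream read from stdin.
fix_boost_ddl() {
    sed '/^CREATE TABLE `poller_output_boost`/,/;$/{
        s/^\( *`output` \)text/\1varchar(255)/
        s/ENGINE=[A-Za-z]*/ENGINE=MyISAM ROW_FORMAT=FIXED/
    }'
}

# Usage (the database name "cacti" is an assumption):
#   mysqldump cacti poller_output_boost | fix_boost_ddl > boost.sql
#   mysql cacti < boost.sql
```

The address range limits both substitutions to the one CREATE TABLE statement, so other tables in the same dump keep their engine and column types. Check the edited file before restoring it, and stop the poller while the table is being recreated.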

Code: Select all

11/15/2008 09:21:53 AM - SYSTEM STATS: Time:113.0797 Method:spine Processes:1 Threads:10 Hosts:545 HostsPerProcess:545 DataSources:227815 RRDsProcessed:0
11/15/2008 09:26:52 AM - SYSTEM STATS: Time:108.1915 Method:spine Processes:1 Threads:10 Hosts:545 HostsPerProcess:545 DataSources:227815 RRDsProcessed:0
11/15/2008 09:31:52 AM - SYSTEM STATS: Time:107.9162 Method:spine Processes:1 Threads:10 Hosts:545 HostsPerProcess:545 DataSources:227815 RRDsProcessed:0
11/15/2008 09:36:58 AM - SYSTEM STATS: Time:115.4306 Method:spine Processes:1 Threads:10 Hosts:545 HostsPerProcess:545 DataSources:227815 RRDsProcessed:0
11/15/2008 09:38:21 AM - SYSTEM BOOST STATS: Time:1281.4722 RRDUpdates:2733708
11/15/2008 09:41:37 AM - SYSTEM STATS: Time:95.7932 Method:spine Processes:1 Threads:10 Hosts:545 HostsPerProcess:545 DataSources:227815 RRDsProcessed:0
11/15/2008 09:47:13 AM - SYSTEM STATS: Time:132.1431 Method:spine Processes:1 Threads:10 Hosts:545 HostsPerProcess:545 DataSources:227815 RRDsProcessed:0
11/15/2008 09:51:59 AM - SYSTEM STATS: Time:119.0232 Method:spine Processes:1 Threads:10 Hosts:545 HostsPerProcess:545 DataSources:227815 RRDsProcessed:0
11/15/2008 09:57:03 AM - SYSTEM STATS: Time:122.6083 Method:spine Processes:1 Threads:10 Hosts:545 HostsPerProcess:545 DataSources:227815 RRDsProcessed:0
<<======== altering poller_output_boost format ===========>>
11/15/2008 10:01:10 AM - SYSTEM STATS: Time:69.6723 Method:spine Processes:1 Threads:10 Hosts:545 HostsPerProcess:545 DataSources:227815 RRDsProcessed:0
11/15/2008 10:13:53 AM - SYSTEM STATS: Time:67.4149 Method:spine Processes:1 Threads:10 Hosts:545 HostsPerProcess:545 DataSources:227815 RRDsProcessed:0
11/15/2008 10:21:04 AM - SYSTEM STATS: Time:63.3600 Method:spine Processes:1 Threads:10 Hosts:545 HostsPerProcess:545 DataSources:227815 RRDsProcessed:0
11/15/2008 10:26:12 AM - SYSTEM STATS: Time:71.8881 Method:spine Processes:1 Threads:10 Hosts:545 HostsPerProcess:545 DataSources:227815 RRDsProcessed:0
11/15/2008 10:31:20 AM - SYSTEM STATS: Time:76.0856 Method:spine Processes:1 Threads:10 Hosts:545 HostsPerProcess:545 DataSources:227815 RRDsProcessed:0

Post by BorisL »

I have re-uploaded boost.patched.tgz because of a bug with the poller_output_snap table.
star3am
Posts: 23
Joined: Mon Aug 04, 2008 5:08 am
Location: Cape Town

Nice

Post by star3am »

Nice one ! Thanks for the tips, I'm sure they will come in handy for many people :)
koaps
Posts: 12
Joined: Thu Feb 15, 2007 1:37 pm

Post by koaps »

One thing we do in our large ass environment is try to use snmptable whenever possible.

This can speed things up greatly and put a lot less load on your polled devices.
dononeil
Cacti User
Posts: 194
Joined: Wed Aug 06, 2008 4:45 pm

Post by dononeil »

Ok all... I need some help with my environment, and since I'm running newer code I didn't want to make any changes until everyone had a chance to review it.

Server:
2x Dual Core Xeon 3 GHz (4 cores total)
8 GB RAM
RAID 0+1 system disk (128 GB)
Brocade SAN (cacti and MySQL files are located here)

My.cnf:

[mysqld]
port = 3306
socket = /var/lib/mysql/mysql.sock
skip-locking
key_buffer = 256M
max_allowed_packet = 16M
table_cache = 512
sort_buffer_size = 256M
read_buffer_size = 256M
read_rnd_buffer_size = 256M
myisam_sort_buffer_size = 256M
thread_cache_size = 32
query_cache_size= 256M
# Try number of CPU's*2 for thread_concurrency
thread_concurrency = 8
max_connections = 1024
max_connect_errors=10000
tmp_table_size = 256M
max_heap_table_size = 256M
innodb_buffer_pool_size = 256M

Cacti 87d
PIA 2.4
Boost 2.3-2
Spine 87c
OS, SLES10.2 SP2, 64 Bit
RRDTool 1.3.6
MySQL 5.1.31
PHP 5.2.5 w/ Suhosin & Zend 2.2
Apache 2.2.3

SYSTEM BOOST STATS: Time:242.3461 RRDUpdates:1026532
SYSTEM STATS: Time:283.8966 Method:spine Processes:2 Threads:20 Hosts:692 HostsPerProcess:346 DataSources:54038

2 poller processes (tried 4, didn't make a difference)
20 threads (tried 10, didn't make a difference)
10 PHP servers
Script timeout: 60 (still get the occasional timeout)
Max OID: 50
Balance process load checked
5 minute polling interval
1M records before boost updates
4 hour update window
512k max Mysql insert string
4096 max arg length
Running boost server as root
On demand RRD update enabled.

That's the basic setup. I read a post somewhere that you could create indexes on some tables and it would improve performance. But I'm not sure if that still applies to 87d.

ANY assistance would be greatly appreciated. Unfortunately, due to limited resources I can't add more RAM to the server in the short term to move the tables into RAM, so that option is out.

I'm trying to find out where the logjam is. I suspect it's either MySQL or the fact that spine and all the system libraries are on the internal disks, and they're SLOW (12 MB/second benched), vs the SAN (75 MB/sec benched).

My mysqld process is always consuming 100% CPU, so that's where I think I should really be focusing.

At one point I had moved MySQL off to another blade, but the performance was actually getting worse.

Any ideas?

Post by BorisL »

Found another nasty bug worth patching

Fresh polling results:

Code: Select all

07/25/2009 10:21:15 PM - SYSTEM STATS: Time:73.6227 Method:spine Processes:1 Threads:8 Hosts:910 HostsPerProcess:910 DataSources:411429 RRDsProcessed:0
07/25/2009 10:26:26 PM - SYSTEM STATS: Time:83.7507 Method:spine Processes:1 Threads:8 Hosts:910 HostsPerProcess:910 DataSources:411429 RRDsProcessed:0
07/25/2009 10:31:16 PM - SYSTEM STATS: Time:74.9934 Method:spine Processes:1 Threads:8 Hosts:910 HostsPerProcess:910 DataSources:411429 RRDsProcessed:0
07/25/2009 10:36:16 PM - SYSTEM STATS: Time:74.8144 Method:spine Processes:1 Threads:8 Hosts:910 HostsPerProcess:910 DataSources:411429 RRDsProcessed:0
07/25/2009 10:41:14 PM - SYSTEM STATS: Time:72.8800 Method:spine Processes:1 Threads:8 Hosts:910 HostsPerProcess:910 DataSources:411429 RRDsProcessed:0
07/25/2009 10:46:20 PM - SYSTEM STATS: Time:77.5879 Method:spine Processes:1 Threads:8 Hosts:910 HostsPerProcess:910 DataSources:411429 RRDsProcessed:0
07/25/2009 10:51:19 PM - SYSTEM STATS: Time:76.9277 Method:spine Processes:1 Threads:8 Hosts:910 HostsPerProcess:910 DataSources:411429 RRDsProcessed:0
07/25/2009 10:56:18 PM - SYSTEM STATS: Time:76.0922 Method:spine Processes:1 Threads:8 Hosts:910 HostsPerProcess:910 DataSources:411429 RRDsProcessed:0
x2 data items, same polling time. Great! :P

Post by BorisL »

Boost v3.0, new disk storage (14 SAS 15k @ RAID10), dual Intel Xeon 5530, 48 GB RAM:

Code: Select all

SYSTEM STATS: Time:72.1848 Method:spine Processes:1 Threads:8 Hosts:1009 HostsPerProcess:1009 DataSources:466603 RRDsProcessed:0
RRDUpdates:10428064 TotalTime:1657.1546 range_local_data_id:238.91 rcaston_add:69.74 get_records:168.61 results_cycle:613.58 rrd_path:32.68 rrd_template:70.58 rrd_lastupdate:8.69 rrd_field_names:92.15 rrdupdate:264.66 delete:445.08
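The per-phase timers in the boost line above are easier to compare as a share of TotalTime. A minimal sketch, assuming the space-separated `name:value` layout shown above; `boost_breakdown` is a hypothetical name, not part of the boost plugin.

```shell
#!/bin/sh
# Hypothetical helper: read one boost stats line from stdin and print each
# timed phase as a percentage of TotalTime, largest first.
boost_breakdown() {
    tr ' ' '\n' | awk -F: '
        $1 == "TotalTime"  { total = $2; next }
        $1 == "RRDUpdates" { next }               # a counter, not a duration
        NF == 2 && $2 ~ /^[0-9.]+$/ { t[$1] = $2 }
        END {
            if (total > 0)
                for (k in t) printf "%5.1f%% %s\n", 100 * t[k] / total, k
        }
    ' | sort -rn
}

# Usage: paste the RRDUpdates:... line from the log on stdin
#   echo 'RRDUpdates:... TotalTime:... results_cycle:... delete:...' | boost_breakdown
```

On the line above, results_cycle alone comes to about 37% of TotalTime. The individual timers add up to more than TotalTime, so some phases appear to overlap or nest; read each percentage as an independent ratio rather than as a slice of a whole.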
xefil
Cacti User
Posts: 233
Joined: Tue Jun 20, 2006 2:48 am
Location: Italy

Post by xefil »

BorisL wrote:Boost v3.0, new disk storage (14 SAS 15k @ RAID10), dual Intel Xeon 5530, 48 GB RAM:

Code: Select all

SYSTEM STATS: Time:72.1848 Method:spine Processes:1 Threads:8 Hosts:1009 HostsPerProcess:1009 DataSources:466603 RRDsProcessed:0
RRDUpdates:10428064 TotalTime:1657.1546 range_local_data_id:238.91 rcaston_add:69.74 get_records:168.61 results_cycle:613.58 rrd_path:32.68 rrd_template:70.58 rrd_lastupdate:8.69 rrd_field_names:92.15 rrdupdate:264.66 delete:445.08
WOW
tosage
Cacti User
Posts: 164
Joined: Wed Jul 28, 2010 5:05 am
Location: France

Post by tosage »

Hello,

Where can I download v3.0 of boost, please? :)
I have a "large installation" with 3100 hosts.

11/05/2010 09:44:59 PM - SYSTEM STATS: Time:162.138 Method : spine Processes:200 Threads:30 Hosts:3099 HostsPerProcess:16 DataSources:56011 RRDsProcessed:27330

I am looking for good tuning to bring down my system stats!

My architecture:
Two E5620 quad-core CPUs (2.4 GHz)
8 GB RAM, 1066 MHz
RAID 1 for the / of my Ubuntu server
RAID 10 for /srv/cacti/rra and /srv/mysql

Thanks a lot for your advice :)
Cacti Version - 0.8.8a
Plugin Architecture - 3.1
Poller Type - spine
Server Info - Linux
Web Server - Apache/2.2.22 (Ubuntu)
PHP - 5.3.10-1ubuntu3.6 with Suhosin-Patch (cli)
MySQL - 5.5.29-0ubuntu0.12.04.2
RRDTool - 1.4.7

Post by BorisL »

tosage wrote:where i can download the v3.0 of boost please
You can use the current stable boost version.

Post by tosage »

I found the plugin after several searches :)

Thanks for your answer !