I have 2 Cacti large scale setups. Each one have nearly 2000 devices and 250k data sources.
I am using boost and multiserver plugins, and the last few months I am experiencing problems with the DB performance. I have made a few changes, that dramaticaly change the performace of the database.
Cacti Poller Process
- I have added SQL_NO_CACHE statement on all SELECT queryies that dont need to be cached. Most of the Poller SELECT queries don't need to be cached. This way, MySQL has more cycle to process real data, and more cache memory to store queries that really need to be cached, such as Graph Permitions, Graph Tree, etc
- I have change the sleep time on the Poller.php main loop from 500uSec to 10000uSec. With 500uSec, there is a tremendus number of queries just query if a poller_process has ended the process. I prefered the latency than killing my db.
- On poller_boost.php. I have changed the way, boost_update is checking if remaining data are present on the temp table. I dont care who many remaining data are on the table, just that the are.
Code: Select all
#$rows = db_fetch_cell("SELECT count(*) FROM $archive_table"); $rows = db_fetch_cell("SELECT EXISTS(SELECT 10 FROM $archive_table)");
- When multiserver is enable, there is a join query beteen poller_output and poller_item that is filter on poller_id. But poller_id, by default is only present on poller_item table. This lead on huge tmp tables. I have altered poller_output table and added poller_id. I also changed spine to populate the poller_id on poller_output table and chagne the join query.
Code: Select all
/* create/update the rrd files */ //$results = db_fetch_assoc("select // poller_output.output, // poller_output.time, // UNIX_TIMESTAMP(poller_output.time) as unix_time, // poller_output.local_data_id, // poller_item.rrd_path, // poller_item.rrd_name, // poller_item.rrd_num // from (poller_output,poller_item) // where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name and poller_item.poller_id='$poller_server_id') // $limit"); /* change by vang!!! */ $results = db_fetch_assoc("SELECT SQL_NO_CACHE poller_output.output, poller_output.time, UNIX_TIMESTAMP(poller_output.time) as unix_time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num FROM poller_output LEFT JOIN poller_item ON poller_output.local_data_id=poller_item.local_data_id AND poller_output.rrd_name=poller_item.rrd_name WHERE poller_output.poller_id='$poller_server_id' $limit ");
- By default Cacti's function process_poller_output() does have the poller_id. With multiserver, this function needs to know the poller_id and hence it is querying the db for that info. I have altered the function to accept poller_id as input and comment out the queries.
On lib/poller.phpCode: Select all
#function process_poller_output(&$rrdtool_pipe ,$remainder = FALSE) { function process_poller_output(&$rrdtool_pipe,$poller_server_id ,$remainder = FALSE) { //$server_remote_name = gethostname(); //$poller_server_id = db_fetch_cell("SELECT SQL_NO_CACHE id FROM poller_server WHERE name ='$server_remote_name' AND aktive = 'on'");
- Mainly, after multiserver the db is lucking of right INDEXes, since most of the processes need the poller_id on the querys.
Vangelis