My setup:
2 x Xeon 5060 (Dual-core 3.2GHz with Hyper-threading)
943 devices
40372 data sources
18644 graphs
Poller item stats: 38750 SNMP, 21 scripts, 1601 script server
Polling time average: 17 seconds.
Cacti 0.8.7g, spine & boost 4.0
The best polling performance I get is with 6 processes, 10 threads and 10 script servers.
Of course, 1-minute polling
Poller is run with:
nice --16 /usr/bin/php /home/cacti/poller.php -d 1>>/poller_log 2>&1
/poller_log looks like this:
03/31/2011 11:13:01 AM - POLLER: Poller[0] NOTE: Poller Int: '60', Cron Int: '60', Time Since Last: '59', Max Runtime '50', Poller Runs: '1'
03/31/2011 11:13:02 AM - POLLER: Poller[0] DEBUG: About to Spawn a Remote Process [CMD: /home/cacti/spine/spine, ARGS: 0 58]
03/31/2011 11:13:02 AM - POLLER: Poller[0] DEBUG: About to Spawn a Remote Process [CMD: /home/cacti/spine/spine, ARGS: 59 125]
03/31/2011 11:13:02 AM - POLLER: Poller[0] DEBUG: About to Spawn a Remote Process [CMD: /home/cacti/spine/spine, ARGS: 127 237]
03/31/2011 11:13:02 AM - POLLER: Poller[0] DEBUG: About to Spawn a Remote Process [CMD: /home/cacti/spine/spine, ARGS: 257 484]
03/31/2011 11:13:02 AM - POLLER: Poller[0] DEBUG: About to Spawn a Remote Process [CMD: /home/cacti/spine/spine, ARGS: 485 789]
03/31/2011 11:13:02 AM - POLLER: Poller[0] DEBUG: About to Spawn a Remote Process [CMD: /home/cacti/spine/spine, ARGS: 790 1079]
Waiting on 6 of 6 pollers.
Waiting on 6 of 6 pollers.
03/31/2011 11:13:04 AM - POLLER: Poller[0] Parsed MULTI output field 'user:1701628' [map user->user]
03/31/2011 11:13:04 AM - POLLER: Poller[0] Parsed MULTI output field 'nice:5999' [map nice->nice]
03/31/2011 11:13:04 AM - POLLER: Poller[0] Parsed MULTI output field 'system:759667' [map system->system]
03/31/2011 11:13:04 AM - POLLER: Poller[0] Parsed MULTI output field 'idle:12273699' [map idle->idle]
03/31/2011 11:13:04 AM - POLLER: Poller[0] Parsed MULTI output field 'wait:250069' [map wait->wait]
03/31/2011 11:13:04 AM - POLLER: Poller[0] Parsed MULTI output field 'kernel:741807' [map kernel->kernel]
03/31/2011 11:13:04 AM - POLLER: Poller[0] Parsed MULTI output field 'interrupt:2669' [map interrupt->interrupt]
03/31/2011 11:13:04 AM - POLLER: Poller[0] Parsed MULTI output field 'softirq:15191' [map softirq->softirq]
[....]
03/31/2011 11:13:05 AM - POLLER: Poller[0] Parsed MULTI output field 'act:0' [map act->act]
03/31/2011 11:13:05 AM - POLLER: Poller[0] Parsed MULTI output field 'tot:34' [map tot->tot]
03/31/2011 11:13:05 AM - POLLER: Poller[0] Parsed MULTI output field 'rid:17' [map rid->rid]
03/31/2011 11:13:05 AM - POLLER: Poller[0] Parsed MULTI output field 'con:12' [map con->con]
[...............]
03/31/2011 11:13:18 AM - SYSTEM STATS: Time:16.6775 Method:spine Processes:6 Threads:10 Hosts:890 HostsPerProcess:149 DataSources:40251 RRDsProcessed:0
Loop Time is: 16.68
Sleep Time is: 43.29
Total Time is: 16.71
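To compare polling times between runs, the Time value can be pulled out of the SYSTEM STATS lines in the log. A minimal sketch, assuming the log format shown above; the function name poller_times is made up for illustration and is not part of Cacti:

```shell
# poller_times: read poller log lines on stdin and print the Time value
# of every SYSTEM STATS line, one per line (hypothetical helper).
poller_times() {
    sed -n 's/.*SYSTEM STATS: Time:\([0-9.]*\) .*/\1/p'
}

# Demo on the SYSTEM STATS line from the log above
echo "03/31/2011 11:13:18 AM - SYSTEM STATS: Time:16.6775 Method:spine Processes:6 Threads:10 Hosts:890 HostsPerProcess:149 DataSources:40251 RRDsProcessed:0" | poller_times
```

Against the real log this would be run as `poller_times < /poller_log`.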
QUESTION: Is the data above acquired using spine or not? I'm asking because of the problem below.
PROBLEM:
I want to graph packet loss / latency for every device.
I've added "loss", "min", "max" and "avg" columns to the 'host' table, and I'm using a script that populates these values independently of the Cacti/spine poller.
I made a script that looks like this:
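<?php
/* do NOT run this script through a web browser */
if (!isset($_SERVER["argv"][0]) || isset($_SERVER['REQUEST_METHOD']) || isset($_SERVER['REMOTE_ADDR'])) {
    die("<br><strong>This script is only meant to run at the command line.</strong>");
}

$no_http_headers = true;

/* display No errors */
error_reporting(0);

include_once(dirname(__FILE__) . "/../lib/snmp.php");

/* when run directly (not through the script server), load Cacti and call the function here */
if (!isset($called_by_script_server)) {
    include_once(dirname(__FILE__) . "/../include/global.php");
    array_shift($_SERVER["argv"]);
    print call_user_func_array("ss_deviceping", $_SERVER["argv"]);
}

/* return the ping statistics stored in the 'host' table for the given hostname */
function ss_deviceping($hostname) {
    /* escape the argument before putting it into the query */
    $query = "SELECT `min`, `max`, `avg`, `loss` FROM host WHERE hostname='" . mysql_real_escape_string($hostname) . "'";
    $result = mysql_query($query);
    $line = mysql_fetch_array($result);

    /* return the output instead of echoing it, so the caller (the print above, or the script server) receives it */
    return "min:" . $line['min'] . " max:" . $line['max'] . " avg:" . $line['avg'] . " loss:" . $line['loss'];
}
?>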
Now, if I add two graph templates to every device (for packet loss and RTT) and I create graphs for them:
- All of the data appears in /poller_log, so I think that maybe spine is not used for acquiring this data
- With an increase of ~1682 data sources (42054 vs 40372 DS, that's less than 5%) and 1774 script server items (vs 1601, that's a 110% increase), I also get a large increase in polling time: 22 seconds vs 17 seconds, that's about 30%.
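The percentages above can be checked with a quick calculation. A minimal sketch; pct_increase is a hypothetical helper name, and the script server figure assumes that the 1774 items were added on top of the existing 1601 (3375 in total), which is the only reading under which 110% works out:

```shell
# pct_increase OLD NEW: print the percentage increase from OLD to NEW,
# rounded to one decimal place (hypothetical helper).
pct_increase() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

pct_increase 40372 42054   # data sources
pct_increase 1601 3375     # script server items, ASSUMING 1774 were added to 1601
pct_increase 17 22         # polling time in seconds
```

This gives roughly 4.2% more data sources, 110.8% more script server items, and 29.4% more polling time.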
The question is: What am I doing wrong?