Cacti takes 1min to poll/update 247 datasources

enrique.belo
Posts: 24
Joined: Mon Jun 14, 2010 1:51 pm

Post by enrique.belo »

Hi all,

Please help me!
I am having a problem with my new Cacti installation: polling is very slow. Here are the scenarios I tested:
1. cmd.php with 1 process: polling time = 59 secs
2. cmd.php with 2 processes: polling time = 54 secs
3. spine with 2 threads: polling time = 49 secs
4. spine with 4 threads: polling time = 47 secs

I am using script data queries for all of the devices. At first I thought it was a device issue, with the devices responding slowly to SNMP queries, so I:
- created separate scripts that poll the devices and save the output to a MySQL table
- changed my script data query to read the data from the MySQL table instead of polling the devices.
Even so, the polling time only decreased by 4-6 secs. I also noticed that my script that polls the devices takes only 10 secs to finish, which means the devices are responding quickly to SNMP queries.

I cannot finish adding my devices, because at this rate Cacti would hit the 300-sec polling limit. In total I have 2000 graphs to configure.

My system specs (dedicated to cacti):
Centos 5 - 1Gb of RAM - Intel(R) Pentium(R) 4 CPU 2.80GHz single core
cacti v0.8.7e - PIA 2.6
rrdtool-1.4.4-1.el5.rf
php-5.1.6-27.el5
mysql-5.0.77-4.el5_5.3
httpd-2.2.3-43.el5.centos.3

Below is the sample log of my cacti:
10/28/2010 01:36:01 AM - SYSTEM STATS: Time:59.5626 Method:cmd.php Processes:1 Threads:N/A Hosts:13 HostsPerProcess:13 DataSources:247 RRDsProcessed:126
10/28/2010 01:31:01 AM - SYSTEM STATS: Time:59.9065 Method:cmd.php Processes:1 Threads:N/A Hosts:13 HostsPerProcess:13 DataSources:247 RRDsProcessed:126
10/28/2010 01:25:59 AM - SYSTEM STATS: Time:57.7945 Method:cmd.php Processes:1 Threads:N/A Hosts:13 HostsPerProcess:13 DataSources:247 RRDsProcessed:126
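
To put numbers on the log lines above, here is a minimal Python sketch that parses the SYSTEM STATS entries, averages the polling time, and works out the cost per data source. The projection to 2000 is only a rough assumption: it treats each graph as one data source and assumes the cost scales linearly, neither of which is stated in the thread. The helper names `parse_stats` and `summarize` are made up for this example.

```python
import re

# Log lines copied from the post above.
LOG = """\
10/28/2010 01:36:01 AM - SYSTEM STATS: Time:59.5626 Method:cmd.php Processes:1 Threads:N/A Hosts:13 HostsPerProcess:13 DataSources:247 RRDsProcessed:126
10/28/2010 01:31:01 AM - SYSTEM STATS: Time:59.9065 Method:cmd.php Processes:1 Threads:N/A Hosts:13 HostsPerProcess:13 DataSources:247 RRDsProcessed:126
10/28/2010 01:25:59 AM - SYSTEM STATS: Time:57.7945 Method:cmd.php Processes:1 Threads:N/A Hosts:13 HostsPerProcess:13 DataSources:247 RRDsProcessed:126
"""


def parse_stats(line):
    """Return the Key:Value pairs that follow 'SYSTEM STATS:' as a dict of strings."""
    # Splitting on the marker first keeps the timestamp out of the regex match.
    _, _, tail = line.partition("SYSTEM STATS:")
    return dict(re.findall(r"(\w+):(\S+)", tail))


def summarize(log_text):
    """Return (average polling time, average seconds per data source)."""
    rows = [parse_stats(l) for l in log_text.splitlines() if "SYSTEM STATS:" in l]
    times = [float(r["Time"]) for r in rows]
    sources = int(rows[0]["DataSources"])
    avg = sum(times) / len(times)
    return avg, avg / sources


avg_time, per_source = summarize(LOG)

# Assumption (not from the thread): one data source per graph, linear scaling.
projected = per_source * 2000

print(f"average poll time: {avg_time:.2f} s")
print(f"per data source:   {per_source:.3f} s")
print(f"projected for 2000 data sources: {projected:.0f} s (poller interval is 300 s)")
```

Under those assumptions the projected time lands well above the 300-sec interval, which matches the concern raised in the post.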

Here is my top output. *CPU usage of the php and mysqld processes climbs to as much as 40%.
top - 23:55:04 up 10 days, 6:45, 1 user, load average: 0.39, 0.53, 0.57
Tasks: 123 total, 4 running, 119 sleeping, 0 stopped, 0 zombie
Cpu(s): 84.4%us, 14.9%sy, 0.0%ni, 0.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1026972k total, 854852k used, 172120k free, 253760k buffers
Swap: 4096440k total, 212k used, 4096228k free, 320880k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
18125 dnoc 18 0 25148 11m 5192 S 6.5 1.2 0:00.18 php
18127 dnoc 18 0 25096 11m 5176 S 6.5 1.2 0:00.18 php
18180 dnoc 20 0 22656 9404 4832 R 4.7 0.9 0:00.13 php
3212 mysql 15 0 143m 22m 4536 S 4.0 2.2 20:06.49 mysqld
18122 dnoc 18 0 25656 12m 5560 S 1.5 1.3 0:00.20 php
17302 root 15 0 2416 1028 800 R 0.4 0.1 0:01.85 top
18184 dnoc 19 0 19772 6016 4260 R 0.4 0.6 0:00.01 php
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Re: Cacti takes 1min to poll/update 247 datasources

Post by gandalf »

That's indeed slow. Please try 12, 15, or 20 threads instead and retry.
Using scripts diminishes the effect that spine usually has on reducing polling time; scripts are very slow compared to pure SNMP queries. But of course there are situations where scripts can't be avoided.
R.

Re: Cacti takes 1min to poll/update 247 datasources

Post by enrique.belo »

Sigh! I wonder how to delete that recent post; thread hijackers should be banned here.

Anyway, thanks gandalf for your reply, I followed your recommendations and here are the results :

12 threads -> over 14 hours of running, I saw a total of 64 NIFTY POPEN timed out errors, and the average poll time was 46 secs.
15 threads -> after 2 hours, I had already seen 5 NIFTY POPEN timed out errors, and the average poll time was 46.4 secs.
20 threads -> currently running, but I'm pretty sure there will be lots of NIFTY POPEN timed out errors; the poll time is 46.25 secs.

I also noticed that since the change to higher thread counts, the load average of the server is consistently near 3, which I believe is not a good sign.

I have now changed the number of processes to 2 and threads to 10 and will monitor the behavior. The number of script servers is still 5, and I changed the script server timeout to 30.

Is this not an RRD-PHP-Apache compatibility issue? I read some posts about slow polling (though mine is insanely slow!) where the issue was with RRDTool. I cannot test downgrading RRDTool to 1.2.x because I am having trouble downgrading with yum on CentOS 5.
Please help!

Re: Cacti takes 1min to poll/update 247 datasources

Post by enrique.belo »

Hi,

I managed to downgrade RRDTool to version 1.2.23 and also applied the MySQL tweaks advised in other threads.
I got a 4-sec improvement, but this is still slow for 247 graphs.
Below are my latest logs with the spine and cmd.php pollers (both give a load average above 2 on my box):
11/03/2010 03:50:46 PM - SYSTEM STATS: Time:44.9113 Method:cmd.php Processes:2 Threads:N/A Hosts:13 HostsPerProcess:7 DataSources:247 RRDsProcessed:124
11/03/2010 03:45:46 PM - SYSTEM STATS: Time:44.7457 Method:cmd.php Processes:2 Threads:N/A Hosts:13 HostsPerProcess:7 DataSources:247 RRDsProcessed:124
11/03/2010 03:40:47 PM - SYSTEM STATS: Time:46.0242 Method:spine Processes:2 Threads:2 Hosts:13 HostsPerProcess:7 DataSources:247 RRDsProcessed:124
11/03/2010 03:35:46 PM - SYSTEM STATS: Time:44.8220 Method:spine Processes:2 Threads:2 Hosts:13 HostsPerProcess:7 DataSources:247 RRDsProcessed:124

Please advise what I can do to further improve my Cacti performance.

Thanks!

Re: Cacti takes 1min to poll/update 247 datasources

Post by gandalf »

Again, are you mostly using scripts or snmp?
See e.g. my poller stats templates, published at the 4th link of my sig
R.

Re: Cacti takes 1min to poll/update 247 datasources

Post by enrique.belo »

Hi, I am mostly using scripts (indexed script data queries). This is because the device we monitor does not provide the index table, so data queries through SNMP are not possible.
I am considering migrating to the Script Server, but my converted script is not working; it always returns U.
My scripts are attached. Thanks.
Attachments
scripts.tar.gz
(2.13 KiB) Downloaded 71 times

Re: Cacti takes 1min to poll/update 247 datasources

Post by gandalf »

Wow, you're the first user I know of who mainly runs scripts. That may well be the reason for the slow polling. The Script Server should help, because at present each script loads the whole PHP interpreter before it runs. That is bound to give bad performance.
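
To illustrate the interpreter start-up cost mentioned above, here is a rough Python sketch. It is an analogy only: it uses the Python interpreter as a stand-in for PHP and a trivial `work()` function as a stand-in for a polling script, so the absolute numbers say nothing about Cacti itself. It compares launching a fresh interpreter for every call against calling the same logic inside one long-running process, which is the general idea behind a script server.

```python
import subprocess
import sys
import time


def work():
    """Stand-in for the body of a polling script (hypothetical, trivial on purpose)."""
    return 1 + 1


def time_fresh_interpreters(n):
    """Start a new interpreter for each call, like one script launch per data source."""
    start = time.perf_counter()
    for _ in range(n):
        subprocess.run(
            [sys.executable, "-c", "print(1 + 1)"],
            capture_output=True,
            check=True,
        )
    return time.perf_counter() - start


def time_in_process(n):
    """Run the same logic n times inside the already-running interpreter."""
    start = time.perf_counter()
    for _ in range(n):
        work()
    return time.perf_counter() - start


N = 5  # kept small so the sketch finishes quickly
spawned = time_fresh_interpreters(N)
resident = time_in_process(N)

print(f"{N} fresh interpreters: {spawned:.4f} s")
print(f"{N} in-process calls:   {resident:.6f} s")
```

On a single-core machine the per-launch cost adds up across a couple of hundred data sources, which fits the polling times reported earlier in the thread.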
R.

Re: Cacti takes 1min to poll/update 247 datasources

Post by enrique.belo »

Hi, thanks for the info. I managed to get my Cacti working through SNMP queries.
Initially, an SNMP query to my device returned only 3 interfaces (the uplink interfaces). I created a script that adds the other 24 (downlink) interfaces by running MySQL queries against the Cacti database tables that handle SNMP queries.
Now I am able to create my graphs, and Cacti runs far faster than before.
11/12/2010 01:55:08 AM - SYSTEM STATS: Time:6.3585 Method:cmd.php Processes:2 Threads:N/A Hosts:13 HostsPerProcess:7 DataSources:243 RRDsProcessed:124
11/12/2010 01:50:07 AM - SYSTEM STATS: Time:5.4271 Method:cmd.php Processes:2 Threads:N/A Hosts:13 HostsPerProcess:7 DataSources:243 RRDsProcessed:124
11/12/2010 01:45:07 AM - SYSTEM STATS: Time:5.3649 Method:cmd.php Processes:2 Threads:N/A Hosts:13 HostsPerProcess:7 DataSources:243 RRDsProcessed:124

One thing I notice is that the values on my graphs exceed the actual bandwidth of the interface.
I already tried the "rigid" and "autoscale min" graph options, but they only seem to cap the plotted graph; the printed values are still displayed as they are.
Setting a maximum value on the DS is not possible, since the downlink interfaces have different bandwidths.

Please see the attached screenshots for a comparison of our existing MRTG graphs vs. the Cacti graphs.
Please note that MRTG uses only the 'bits' option, meaning its kilo = 1000. Cacti also uses a base value of 1000, with an upper limit of 10000000 for the 10 Mbps bandwidth.

The graph debug output is below:
/usr/bin/rrdtool graph - \
--imgformat=PNG \
--start=1289456069 \
--end=1289542469 \
--title="BAT-04 - Port 1 - BT032S2" \
--rigid \
--base=1000 \
--height=120 \
--width=500 \
--alt-autoscale-min \
--upper-limit=10000000 \
COMMENT:"From 2010/11/11 14\:14\:29 To 2010/11/12 14\:14\:29\c" \
COMMENT:" \n" \
--vertical-label="bits per second" \
--slope-mode \
--font TITLE:12: \
--font AXIS:8: \
--font LEGEND:10: \
--font UNIT:8: \
DEF:a="/var/www/html/cacti/rra/bat-04_ifout_134.rrd":ifIn:AVERAGE \
DEF:b="/var/www/html/cacti/rra/bat-04_ifout_134.rrd":ifOut:AVERAGE \
CDEF:cdefa=a,8,* \
CDEF:cdefe=b,8,* \
CDEF:cdefba=a,10000000,/,100,*,8,* \
CDEF:cdefbe=b,10000000,/,100,*,8,* \
AREA:cdefa#00FF00FF:"IN" \
GPRINT:cdefa:LAST:" Current \:%8.2lf %s" \
GPRINT:cdefa:AVERAGE:"Average \:%8.2lf %s" \
GPRINT:cdefa:MAX:"Maximum \:%8.2lf %s\n" \
LINE1:cdefe#0000FFFF:"OUT" \
GPRINT:cdefe:LAST:"Current \:%8.2lf %s" \
GPRINT:cdefe:AVERAGE:"Average \:%8.2lf %s" \
GPRINT:cdefe:MAX:"Maximum \:%8.2lf %s\n" \
COMMENT:"******** IPDSLAM Bandwidth = 10Mbps ********\n" \
COMMENT:"IN" \
GPRINT:cdefba:LAST:" Current(%%)\:%8.2lf %s" \
GPRINT:cdefba:AVERAGE:"Average(%%)\:%8.2lf %s" \
GPRINT:cdefba:MAX:"Maximum(%%)\:%8.2lf %s\n" \
COMMENT:"OUT" \
GPRINT:cdefbe:LAST:"Current(%%)\:%8.2lf %s" \
GPRINT:cdefbe:AVERAGE:"Average(%%)\:%8.2lf %s" \
GPRINT:cdefbe:MAX:"Maximum(%%)\:%8.2lf %s\n"
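
For readers unfamiliar with RRDtool's CDEF notation, the expressions in the command above are written in reverse Polish notation. Below is a minimal Python sketch of how `a,8,*` and `a,10000000,/,100,*,8,*` evaluate. It assumes the stored value is in bytes per second (which the multiply-by-8 suggests) and uses a made-up sample value of 625000. The `eval_rpn` helper is illustrative only; it handles just numbers, variable names, and the four arithmetic operators, not RRDtool's full operator set.

```python
def eval_rpn(expr, variables):
    """Evaluate a comma-separated RPN expression with numbers, names, and + - * /."""
    ops = {
        "+": lambda x, y: x + y,
        "-": lambda x, y: x - y,
        "*": lambda x, y: x * y,
        "/": lambda x, y: x / y,
    }
    stack = []
    for token in expr.split(","):
        if token in ops:
            # The operator consumes the two most recent values on the stack.
            y = stack.pop()
            x = stack.pop()
            stack.append(ops[token](x, y))
        elif token in variables:
            stack.append(variables[token])
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError(f"malformed expression: {expr!r}")
    return stack[0]


# Hypothetical sample: 625000 bytes per second on a 10 Mbps port.
sample = {"a": 625000.0}

bits_per_second = eval_rpn("a,8,*", sample)
percent_of_link = eval_rpn("a,10000000,/,100,*,8,*", sample)

print(bits_per_second)   # 5000000.0
print(percent_of_link)   # 50.0
```

With this sample, the first CDEF gives 5 Mbps and the second gives 50 percent of the 10 Mbps link, so the two sets of GPRINT lines describe the same traffic in different units.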
Attachments
cacti graph: graph_image.php.png (57.99 KiB)
mrtg graph: bt032s2.JPG (42.99 KiB)

Re: Cacti takes 1min to poll/update 247 datasources

Post by gandalf »

You may want to use 1024 as a base. See the Graph Template to change that setting
R.

Re: Cacti takes 1min to poll/update 247 datasources

Post by enrique.belo »

Hi, actually my initial configuration used a base value of 1024, and I found that Cacti was showing traffic greater than the actual bandwidth (greater than 10 Mbps in this case). So I tried changing the base value to 1000, but the results were the same.
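
A short Python sketch of why switching the base would not be expected to help here. As far as I understand it, the base setting only controls how a number is scaled to a unit prefix (k, M, G) for display; it does not change the underlying reading. The `scale` function below is a simplified imitation of that prefix scaling, not RRDtool's actual code, and the 12,000,000 bits per second reading is a made-up example of a value above a 10 Mbps link.

```python
def scale(value, base):
    """Scale a value to a unit prefix using the given base (1000 or 1024)."""
    prefixes = ["", "k", "M", "G", "T"]
    idx = 0
    while abs(value) >= base and idx < len(prefixes) - 1:
        value /= base
        idx += 1
    return f"{value:.2f} {prefixes[idx]}"


# Hypothetical reading in bits per second, above the 10 Mbps link capacity.
rate = 12_000_000.0

print(scale(rate, 1000))  # 12.00 M
print(scale(rate, 1024))  # 11.44 M
```

Either way the displayed figure stays above 10 M, so a reading that exceeds the link speed has to come from the collected data rather than from the base setting.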

Maybe we can tag this thread as SOLVED, since my initial problem, the polling time, is already resolved. I'll create a new thread for this value-reading issue.

Thanks a lot gandalf!