Performance Issues with CactiEZ
Hello,
We have performance issues with our CactiEZ installation (Cacti 0.8.7c) and have not been able to determine where the problems are.
We have tried increasing the amount of PHP memory, tweaking MySQL settings, raising several polling settings, and the Boost plugin, all with no luck.
Our problem: every now and then we get poller runtime timeouts (58 secs) and log entries about the poller output table not being empty. I have attached a few log snippets as a reference.
I have also attached the log file from when we enabled Boost. The problem with Boost was that it did not update all RRDs. On-demand RRD update was also not working (entirely), and localhost was not being updated at all. We had Boost running for about 40 minutes the first time and 15 minutes the second time, with an update interval of 30 mins.
CLI on the server is lagging as well...
Attached is also our my.cnf.
Our Current Cacti Settings:
Poller:
Poller: Spine
Poller Interval: 1 Min
Cron Interval: 1 Min
Maximum Concurrent Poller Processes: 4
Balance Process Load: Yes
Maximum Threads per Process: 100
Number of PHP Script Servers: 9
Script and Script Server Timeout Value: 5
The Maximum SNMP OID's Per SNMP Get Request: 40
php.ini:
memory_limit = 512M ; Maximum amount of memory a script may consume (8MB)
MacTrack Device Tracking is also enabled and runs every 30 mins. About 250 devices are configured and the poll time is 200-250 secs on average.
Let me know if any further information is required!
Our hardware is an HP DL380 with 2x dual-core Xeon 2.7 GHz, 8 GB RAM, RAID 1 72 GB.
System load is between 2.6 (lowest) and more than 8 (when MacTrack is polling).
Any help to improve performance is greatly appreciated.
- Attachments
-
- my.cnf.zip
- Our MySQL my.cnf - had to zip it to upload it!
- (688 Bytes) Downloaded 66 times
-
- Poller output table not empty.txt
- (16.06 KiB) Downloaded 42 times
-
- Maximum runtime exceeded.txt
- (89.74 KiB) Downloaded 43 times
-
- boost stats.txt
- Boost enabled log file
- (73.45 KiB) Downloaded 49 times
- gandalf
- Developer
- Posts: 22383
- Joined: Thu Dec 02, 2004 2:46 am
- Location: Muenster, Germany
- Contact:
Re: Performance Issues with CactiEZ
Please lower the threads to e.g. 10-20.
If you make heavy use of PHP Script Server scripts, you can keep that value. Otherwise lower it as well (it's per process! So you will now start 18 PHP processes while polling)
R.
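A rough sketch of the arithmetic behind this advice, assuming (as the reply states) that both the thread count and the script server count apply per spine process. The function name is made up for illustration, and the "reduced" values are just one example in the suggested 10-20 range.

```python
def poller_concurrency(processes: int, threads_per_process: int,
                       script_servers_per_process: int) -> dict:
    """Multiply the per-process poller settings out to totals per polling cycle."""
    return {
        "spine_threads": processes * threads_per_process,
        "php_script_servers": processes * script_servers_per_process,
    }


# Settings from the first post: 4 processes, 100 threads, 9 script servers.
original = poller_concurrency(4, 100, 9)

# Illustrative lower setting: same processes, 15 threads each.
reduced = poller_concurrency(4, 15, 9)

print("original:", original)
print("reduced: ", reduced)
```

With the posted settings the product for script servers is 4 x 9; a figure of 18 would correspond to 2 processes.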
Re: Performance Issues with CactiEZ
Hi,
Thank you for the quick answer!
I am afraid I am still having problems.
I tried lowering to 4 processes and 15 threads and I was still getting runtime timeouts.
Now I have lowered it even more to 2/15 and I still get:
Code:
02/23/2012 02:38:20 PM - SYSTEM STATS: Time:78.9892 Method:spine Processes:2 Threads:15 Hosts:336 HostsPerProcess:168 DataSources:15820 RRDsProcessed:4460
02/23/2012 02:38:20 PM - POLLER: Poller[0] Maximum runtime of 58 seconds exceeded. Exiting.
I have tried to copy some "recommended" my.cnf values I found here and there, with no success:
Code:
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
old_passwords=1
log-slow-queries = /var/lib/mysql/slowqueries.log
long_query_time = 2
log_long_format
innodb_buffer_pool_size = 256M
skip-locking
sort_buffer_size = 128M
net_buffer_length = 16K
read_buffer_size = 1M
read_rnd_buffer_size = 32M
myisam_sort_buffer_size = 8M
tmp_table_size=1G;
skip-external-locking
key_buffer = 1280M
key_buffer_size = 1280M
max_allowed_packet = 64M
thread_stack = 768K
thread_cache_size = 24
max_connections = 500
query_cache_limit = 1024M
query_cache_size = 512M
table_cache = 1200
max_heap_table_size = 1792M
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
I would greatly appreciate any pointers in the right direction, as there is confusing information on the web regarding this problem.
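One way to sanity-check a my.cnf like the one above is the common rule-of-thumb estimate: global buffers plus max_connections times the per-connection buffers. The sketch below applies it to the posted values. It is only an upper bound under the assumption that every connection allocates all of its buffers at once, which rarely happens in practice; the grouping of variables into "global" and "per connection" follows that rule of thumb, and in-memory temporary tables are left out.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

# Values copied from the posted my.cnf.
global_buffers = {
    "key_buffer_size": 1280 * MIB,
    "innodb_buffer_pool_size": 256 * MIB,
    "query_cache_size": 512 * MIB,
}
per_connection = {
    "sort_buffer_size": 128 * MIB,
    "read_buffer_size": 1 * MIB,
    "read_rnd_buffer_size": 32 * MIB,
    "thread_stack": 768 * KIB,
    "net_buffer_length": 16 * KIB,
}
max_connections = 500
ram = 8 * GIB  # the server in this thread has 8 GB of RAM


def worst_case_bytes(global_buffers, per_connection, max_connections):
    """Rule-of-thumb upper bound: globals + connections * per-connection buffers."""
    return sum(global_buffers.values()) + max_connections * sum(per_connection.values())


total = worst_case_bytes(global_buffers, per_connection, max_connections)
print(f"global buffers:  {sum(global_buffers.values()) / GIB:.1f} GiB")
print(f"per connection:  {sum(per_connection.values()) / MIB:.1f} MiB")
print(f"worst case:      {total / GIB:.1f} GiB with {ram / GIB:.0f} GiB of RAM")
```

If the estimate is far above physical RAM, the per-connection buffers and max_connections are the first values worth revisiting.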
- gandalf
Re: Performance Issues with CactiEZ
Well, depending on the disk performance, rrdtool updates may be slowing things down. In that case, the BOOST plugin will help.
To judge this, you may want to monitor the Cacti host with "Detailed CPU Usage", which e.g. shows the "I/O Wait" percentages.
R.
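For a quick check outside Cacti, the I/O wait share can also be derived from two samples of the aggregate "cpu" line in /proc/stat on Linux. This is a minimal sketch, not what the "Detailed CPU Usage" template does internally; the function names and the sample lines are made up for illustration.

```python
def cpu_fields(line: str) -> list:
    """Return the first eight counters of a /proc/stat 'cpu' line.

    Order per proc(5): user nice system idle iowait irq softirq steal.
    Later fields (guest, guest_nice) are skipped because they are
    already included in user and nice.
    """
    return [int(value) for value in line.split()[1:9]]


def iowait_percent(before: str, after: str) -> float:
    """Share of CPU time spent in iowait between two samples, in percent."""
    deltas = [b - a for a, b in zip(cpu_fields(before), cpu_fields(after))]
    total = sum(deltas)
    return 100.0 * deltas[4] / total if total else 0.0


def read_cpu_line(path: str = "/proc/stat") -> str:
    """Read the aggregate 'cpu' line (the first line of /proc/stat)."""
    with open(path) as handle:
        return handle.readline()


# Hypothetical samples taken a few seconds apart.
sample_before = "cpu  100 0 50 800 50 0 0 0"
sample_after = "cpu  150 0 75 1100 275 0 0 0"
print(f"iowait: {iowait_percent(sample_before, sample_after):.1f}%")
```

On a live host, call read_cpu_line() twice with a pause in between and pass both results to iowait_percent().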
Re: Performance Issues with CactiEZ
Hi,
I was thinking the same thing, and have set up the CPU monitoring, as well as disk I/O etc.
Result: Disk I/O is high and 1 of the 4 CPUs is at 100% I/O wait. (attached)
We have tried enabling Boost, but have had a bad experience. Some graphs were not being updated, etc.
I will re-enable Boost and post results...
Re: Performance Issues with CactiEZ
Hi,
I found out that some graphs were not being updated because of access rights on the RRD files. I fixed that and now all graphs are being updated by Boost.
Disk IO and CPU IO have dropped to an acceptable level (attached).
Problem remains though. (logfile attached).
Is this a problem of CactiEZ?
Re: Performance Issues with CactiEZ
The attached log file should also be of interest... system stats dropped from 34 to 14 seconds, but the poller output table has not been emptied...
Any help is greatly appreciated.
BTW, I have chowned the /var/www/html/rra directory and its contents to apache:apache
and also chmod'ed +rw.
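A permission problem like the one described can be spotted before Boost runs by scanning the RRA directory. The sketch below is an assumption-laden illustration: it only checks that each .rrd file is owned by the expected user and has the owner write bit set, and it ignores group and other permissions. The function name is hypothetical.

```python
import os
import stat


def unwritable_rrds(rra_dir: str, owner_uid: int) -> list:
    """List .rrd files under rra_dir that owner_uid cannot write as owner."""
    problems = []
    for root, _dirs, files in os.walk(rra_dir):
        for name in files:
            if not name.endswith(".rrd"):
                continue
            path = os.path.join(root, name)
            info = os.stat(path)
            owned = info.st_uid == owner_uid
            owner_writable = bool(info.st_mode & stat.S_IWUSR)
            if not (owned and owner_writable):
                problems.append(path)
    return sorted(problems)


# Example call for the setup in this thread (requires the user to exist):
#   import pwd
#   print(unwritable_rrds("/var/www/html/rra", pwd.getpwnam("apache").pw_uid))
```

An empty result means every RRD file passed this narrow check; it does not prove that the update process actually runs as that user.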
- gandalf
Re: Performance Issues with CactiEZ
So, the "cactiuser" now is not able to update those rrd files any more???
R.
Re: Performance Issues with CactiEZ
Hi,
The cactiuser does not exist in CactiEZ...
This makes it all confusing. The user Apache is used for updating with boost.
- gandalf
Re: Performance Issues with CactiEZ
k
Not used to CactiEZ (perhaps too easy)
And where's the log?
R.
Re: Performance Issues with CactiEZ
Hmmm.. forgot to attach the log :S
Here you go.
- Attachments
-
- Poller output table not empty.txt
- (16.06 KiB) Downloaded 43 times
Re: Performance Issues with CactiEZ
I actually forgot to attach both logs. Sorry...
I have also seen that I sometimes have great stats (27-29 secs) and then they suddenly go up to 55-58 secs, or run until the poller quits.
I have attached a little screenshot where you can see this (I couldn't catch the whole thing in one screenshot, so I just took a screenshot of the two polling time differences).
I have managed to lower disk and CPU I/O to a minimum and Cacti is working better, although it is still not able to poll 336 hosts within 58 secs.
- Attachments
-
- screenshot.zip
- (214.67 KiB) Downloaded 44 times
-
- Maximum runtime exceeded.txt
- (89.74 KiB) Downloaded 45 times
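SYSTEM STATS lines like the one quoted earlier in this thread can be checked against the 58-second limit mechanically instead of by eye. The sketch below parses the key:value pairs of such a line and reports the runtime and the fraction of data sources whose RRDs were processed. The function names are hypothetical, and the key names are taken from the quoted line only.

```python
import re

STATS_RE = re.compile(r"SYSTEM STATS: (?P<pairs>.*)$")


def parse_system_stats(line: str):
    """Split a 'SYSTEM STATS:' log line into a dict of strings, or return None."""
    match = STATS_RE.search(line.strip())
    if match is None:
        return None
    stats = {}
    for pair in match.group("pairs").split():
        key, _, value = pair.partition(":")
        stats[key] = value
    return stats


def summarize(line: str, limit: float = 58.0) -> dict:
    """Compare one polling cycle against the poller's maximum runtime."""
    stats = parse_system_stats(line)
    runtime = float(stats["Time"])
    return {
        "runtime": runtime,
        "over_limit": runtime > limit,
        "rrd_fraction": int(stats["RRDsProcessed"]) / int(stats["DataSources"]),
    }


# Line quoted earlier in this thread.
sample = ("02/23/2012 02:38:20 PM - SYSTEM STATS: Time:78.9892 Method:spine "
          "Processes:2 Threads:15 Hosts:336 HostsPerProcess:168 "
          "DataSources:15820 RRDsProcessed:4460")
summary = summarize(sample)
print(f"runtime {summary['runtime']:.1f}s, over limit: {summary['over_limit']}, "
      f"RRDs processed: {summary['rrd_fraction']:.0%}")
```

Running this over a whole cacti.log makes it easier to see how often cycles jump from the 27-29 second range to the limit.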
- gandalf
Re: Performance Issues with CactiEZ
mharald wrote:
Hmmm.. forgot to attach the log :S
Here you go.
You've got SQL errors due to missing escaping of strings by Cacti:
02/22/2012 09:48:02 AM - SPINE: Poller[0] ERROR: SQL Failed! Error:'1064', Message:'You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'Inside' interface' WHERE host_id='26' AND data_query_id='1' and arg1='.1.3.6.1.2' at line 1', SQL Fragment:'UPDATE poller_reindex SET assert_value='Adaptive Security Appliance 'Inside' interface' WHERE host_id='26' AND data_query_id='1' and arg1='.1.3.6.1.2.1.2.2.1.2.3''
Which reindex method do you use for that Data Query?
R.
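The failure mode in that log line can be reproduced in isolation: a value containing single quotes breaks a statement that is assembled by string interpolation, while a bound parameter goes through unchanged. The sketch below uses Python's sqlite3 module purely as a stand-in (spine itself is written in C and talks to MySQL), and the table layout is an assumption based only on the column names in the error message.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; only the column names come from the log line.
conn.execute(
    "CREATE TABLE poller_reindex "
    "(host_id INTEGER, data_query_id INTEGER, arg1 TEXT, assert_value TEXT)"
)
conn.execute(
    "INSERT INTO poller_reindex VALUES (26, 1, '.1.3.6.1.2.1.2.2.1.2.3', '')"
)

value = "Adaptive Security Appliance 'Inside' interface"


def update_interpolated(conn, value):
    """Builds the SQL by string interpolation; quotes in value end the literal early."""
    sql = ("UPDATE poller_reindex SET assert_value='%s' "
           "WHERE host_id=26 AND data_query_id=1" % value)
    conn.execute(sql)


def update_bound(conn, value):
    """Passes the value as a bound parameter, so no manual escaping is needed."""
    conn.execute(
        "UPDATE poller_reindex SET assert_value=? "
        "WHERE host_id=? AND data_query_id=?",
        (value, 26, 1),
    )


try:
    update_interpolated(conn, value)
    interpolated_failed = False
except sqlite3.OperationalError as exc:
    interpolated_failed = True
    print("interpolated statement failed:", exc)

update_bound(conn, value)
stored = conn.execute("SELECT assert_value FROM poller_reindex").fetchone()[0]
print("stored via bound parameter:", stored)
```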
- gandalf
Re: Performance Issues with CactiEZ
As you're using spine, you may want to switch verbosity to level 3. This way, spine will record the polling duration for each host (among other things). And in case SYSTEM STATS maxes out, you may see the host that's causing this issue.
R.
Re: Performance Issues with CactiEZ
The SQL errors are known and will be taken care of. Wrong reindexing.
Regarding the spine timeouts: are you saying that these are caused by hosts not answering or answering slowly?
I now have 4 processes with 20 threads each. If, say, 20 hosts have higher latency... this should not affect Cacti.
If Cacti really is so affected by latency, can it be configured so that latency does not affect the polling time so much?
A polling time difference between 26 secs and 58 secs for approx. 350 hosts is huge!
We were thinking of dumping CactiEZ and building a Cacti setup on two hardware servers: one for the polling, one for MySQL and MacTrack.
I still need a solution for the latency... is a 1 min poll too small?
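To get a feel for how a handful of slow hosts can dominate a cycle, here is a toy model of a worker pool: each host is handed to whichever worker becomes free first, and the cycle ends when the last worker finishes. This is a simplified sketch and not spine's actual scheduler; the per-host durations (0.5 s for a healthy host, 30 s for a host that runs into timeouts and retries) are hypothetical numbers chosen only for illustration.

```python
import heapq


def wall_time(durations, workers: int) -> float:
    """Greedy schedule: each host goes to the worker that frees up first."""
    finish = [0.0] * workers
    heapq.heapify(finish)
    for duration in durations:
        start = heapq.heappop(finish)
        heapq.heappush(finish, start + duration)
    return max(finish)


WORKERS = 4 * 20  # 4 processes with 20 threads each, as in the post

# Hypothetical per-host polling times in seconds.
all_fast = [0.5] * 336
some_slow = [30.0] * 20 + [0.5] * 316

print("all hosts fast: ", wall_time(all_fast, WORKERS), "s")
print("20 slow hosts:  ", wall_time(some_slow, WORKERS), "s")
print("20 slow, 2x15:  ", wall_time(some_slow, 2 * 15), "s")
```

In this model the cycle can never be shorter than the slowest single host, no matter how many threads are available, which is why per-host timings are the useful thing to look at first.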