[FIXED] Polling exceeds 1 minute
Hello everybody! I am fairly new to Cacti and would like to ask for your help.
My problem is the following: my Cacti is configured to poll every minute, but each polling cycle now takes around 55-57 seconds. I am stuck, because I cannot add new devices to my Cacti monitoring even though I don't have that many devices.
09/15/2009 11:06:54 AM - SYSTEM STATS: Time:54.9548 Method:spine Processes:1 Threads:6 Hosts:53 HostsPerProcess:53 DataSources:910 RRDsProcessed:553
09/15/2009 11:05:54 AM - SYSTEM STATS: Time:54.9374 Method:spine Processes:1 Threads:6 Hosts:53 HostsPerProcess:53 DataSources:910 RRDsProcessed:553
09/15/2009 11:04:56 AM - SYSTEM STATS: Time:56.6375 Method:spine Processes:1 Threads:6 Hosts:53 HostsPerProcess:53 DataSources:910 RRDsProcessed:553
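A quick way to see how close these runs are to the limit is to parse the SYSTEM STATS lines. This is only a minimal sketch, assuming the log format shown above; the function name `parse_system_stats` and the constant `POLL_INTERVAL_S` are made up for illustration and are not part of Cacti.

```python
import re

POLL_INTERVAL_S = 60.0  # assumption: 1-minute polling, as described in the post


def parse_system_stats(line):
    """Return the key:value fields of a SYSTEM STATS log line as a dict of strings."""
    # Keep only the text after the marker so the timestamp (11:06:54) is not parsed.
    _, _, payload = line.partition("SYSTEM STATS:")
    return dict(re.findall(r"(\w+):(\S+)", payload))


line = ("09/15/2009 11:06:54 AM - SYSTEM STATS: Time:54.9548 Method:spine "
        "Processes:1 Threads:6 Hosts:53 HostsPerProcess:53 "
        "DataSources:910 RRDsProcessed:553")

stats = parse_system_stats(line)
used = float(stats["Time"]) / POLL_INTERVAL_S
print(f"{stats['Hosts']} hosts polled in {stats['Time']} s "
      f"({used:.0%} of the polling interval)")
```

With the values above, roughly 92% of the interval is already used, which leaves very little headroom for more devices.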
Is it normal that it takes so long (I have only 57 hosts and 603 datasources)? Could it be due to my RRD configuration? I set it to keep 1-minute consolidation for the last 2 months and 5-minute consolidation for one year (without average).
Name               Steps   Rows     Timespan
2 months - 1 min   1       89280    5356800
1 year - 5 min     5       115200   33053184
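As a sanity check on the RRA settings above, the history each RRA can hold is steps x rows x base step. The sketch below assumes the base RRD step equals the 1-minute polling interval; `rra_coverage_seconds` and `POLL_STEP_S` are illustrative names, not Cacti or RRDTool functions.

```python
POLL_STEP_S = 60  # assumption: base RRD step equals the 1-minute polling interval


def rra_coverage_seconds(steps, rows, base_step_s=POLL_STEP_S):
    """Seconds of history an RRA holds: each row consolidates `steps` base intervals."""
    return steps * rows * base_step_s


rras = [("2 months - 1 min", 1, 89280), ("1 year - 5 min", 5, 115200)]
for name, steps, rows in rras:
    seconds = rra_coverage_seconds(steps, rows)
    print(f"{name}: {seconds} s = {seconds / 86400:.0f} days")
```

Under that assumption the first RRA covers 5356800 s (62 days), which matches its Timespan value, while the second covers 34560000 s (400 days), which is a little more than the 33053184 entered as its Timespan.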
Any recommendations or advice for tuning my settings? Let me know if you need more information to help me with this issue. FYI, it's a production server, so I cannot experiment much with the settings; I want to avoid data loss as far as possible.
My poller settings are attached.
And my server info:
Operating System: Windows 2003 server
Webserver: Apache 2.2.11 (win32)
Cacti: 0.8.7b
Spine: 0.8.7a
MySQL: 2.11.4
PHP: 5.2.5
RRDTool: 1.2.26
Net-SNMP: 5.4.2.1
Attachment: spine.JPG (127.03 KiB)
Last edited by Tibo on Thu Feb 04, 2010 5:57 am, edited 1 time in total.
Also, try increasing Maximum Threads per Process to 10-15.
This document may help you.
http://docs.cacti.net/manual:087:3a_adv ... pine#spine
Hello again. I ran a lot of tests this morning with different poller settings, and it is not any faster. At best the polling time is unchanged; in some cases it got 2 or 3 seconds slower (roughly).
I don't really understand why Spine cannot be faster, as my environment is not very large: 53 hosts with 910 datasources and 553 RRDs processed.
For your information, it's a good server based on an Intel Xeon 3.4 GHz (4 CPUs) with 4 GB of RAM, and the OS is Windows Server 2003 (Standard Edition with SP2).
Is it possible that several of your devices are having trouble returning data?
As you say, I also think your environment is not too large for Spine.
To pinpoint the cause, set the cacti.log level to DEBUG for one polling cycle
and check how much time it takes to collect data from each device.
Is it normal that there is around 15 sec between the Spine poller time and the system stats time?
09/18/2009 09:13:57 AM - SYSTEM STATS: Time:57.6000 Method:spine Processes:1 Threads:6 Hosts:53 HostsPerProcess:53 DataSources:912 RRDsProcessed:554
09/18/2009 09:13:41 AM - SPINE: Poller[0] Time: 41.3430 s, Threads: 6, Hosts: 53
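To put a number on that gap, the two Time values can be pulled out of the log lines and subtracted. This is a rough sketch that assumes both timers start at about the same moment; `poll_time` is a hypothetical helper, not something shipped with Cacti or Spine.

```python
import re


def poll_time(line):
    """Extract the seconds that follow 'Time:' in a SPINE or SYSTEM STATS log line."""
    match = re.search(r"Time:\s*([0-9.]+)", line)
    if match is None:
        raise ValueError("no Time: field in line")
    return float(match.group(1))


system_stats = ("09/18/2009 09:13:57 AM - SYSTEM STATS: Time:57.6000 Method:spine "
                "Processes:1 Threads:6 Hosts:53 HostsPerProcess:53 "
                "DataSources:912 RRDsProcessed:554")
spine = "09/18/2009 09:13:41 AM - SPINE: Poller[0] Time: 41.3430 s, Threads: 6, Hosts: 53"

gap = poll_time(system_stats) - poll_time(spine)
print(f"Time spent outside Spine: {gap:.1f} s")
```

For this pass the difference is a bit over 16 seconds, which is the part of the cycle not accounted for by Spine's own data collection.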
-
- Cacti Guru User
- Posts: 1884
- Joined: Mon Oct 16, 2006 5:57 am
- Location: United Kingdom
- Contact:
You really should look at what you are polling. Spine excels at SNMP queries, but scripts such as Perl and WMI will slow it down - particularly if you are querying a server in the Far East while your Cacti box sits in the US (for example).
Have a look at your queries to see if you can get rid of what you don't really need. Another tool to look at is "pollperf" from Gandalf (search the forums) - it will tell you the runtime of each datasource and host during polling.
Cacti Version 0.8.8b
Cacti OS Ubuntu LTS
RRDTool Version RRDTool 1.4.7
Poller Information
Type SPINE 0.8.8b
-
- Cacti Guru User
- Posts: 1884
- Joined: Mon Oct 16, 2006 5:57 am
- Location: United Kingdom
- Contact:
That depends on which plugins run after the poller has finished collecting data. One way to check is to disable all plugins and let the poller have the limelight - the time you get with no plugins is the time your poller itself takes to run.
- TheWitness
- Developer
- Posts: 17007
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
Set the log level to MEDIUM for one pass, and post your Cacti Log.
TheWitness
True understanding begins only when we realize how little we truly understand...
Life is an adventure, let yours begin with Cacti!
Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages
For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?
Attached is the log for one pass at MEDIUM level (I only removed the IP addresses for security reasons).
Let me know if you find something interesting.
Surprisingly, there is a gap of 15 seconds between the poller time and the system time for one pass. I tried disabling all my plugins, as mentioned before, but it didn't change anything.
Attachment: cacti_log.txt, Cacti logs (132.36 KiB)
- TheWitness
- Developer
- Posts: 17007
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
Your issue might be I/O wait. Your spine finishes at 37 seconds, but the poller does not finish until just under the bell. So, there is quite a bit of processing delay.
Here is your plan of attack:
Step 0
Increase Spine Threads to 20. However, prior to doing that, increase "max_connections" to 200 in my.cnf and restart MySQL.
Step 1
Make this change:
Code:
mysql cacti
alter table poller_output modify column output varchar(60) not null default '', engine=memory;
quit;
Step 2
Post the output of the following:
Code:
mysql
show global variables;
show global status;
Step 3
Post the output of the following:
Code:
du -hs /var/www/html/cacti/rra
Step 4
Post the output of the following:
Code:
df -h /var/www/html/cacti/rra
df -h
Step 5
Post the following:
System Memory, System Cores, #Physical Volumes, Is a VM?
TheWitness
Well, I installed everything from scratch on another server, using the latest versions of Cacti, Spine (and so on). With almost the same number of devices and datasources, polling now takes 8.5 seconds with the current poller settings...
So there is definitely something wrong with my first server. I will rebuild it from scratch like the second one, and that will surely fix the issue.
Thanks for your help!