[FIXED] Polling exceeds 1 minute


Tibo
Posts: 25
Joined: Mon Jul 06, 2009 9:21 am

[FIXED] Polling exceeds 1 minute

Post by Tibo »

Hello everybody! I am fairly new to Cacti and would like to request your help.

My problem is the following: my Cacti is configured to poll every minute, but each poll now takes around 55-57 seconds. I am stuck because I cannot add new devices to my Cacti monitoring, even though I don't have that many devices.

09/15/2009 11:06:54 AM - SYSTEM STATS: Time:54.9548 Method:spine Processes:1 Threads:6 Hosts:53 HostsPerProcess:53 DataSources:910 RRDsProcessed:553
09/15/2009 11:05:54 AM - SYSTEM STATS: Time:54.9374 Method:spine Processes:1 Threads:6 Hosts:53 HostsPerProcess:53 DataSources:910 RRDsProcessed:553
09/15/2009 11:04:56 AM - SYSTEM STATS: Time:56.6375 Method:spine Processes:1 Threads:6 Hosts:53 HostsPerProcess:53 DataSources:910 RRDsProcessed:553
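
For anyone who wants to track these numbers over time, here is a minimal Python sketch that parses SYSTEM STATS lines like the ones above and reports how much of the polling interval is left. The helper name `parse_system_stats` and the hard-coded 60-second interval are assumptions made for illustration; they are not part of Cacti.

```python
import re

# Assumed polling interval in seconds (the thread describes 1-minute polling).
POLL_INTERVAL = 60.0

def parse_system_stats(line):
    """Parse a Cacti 'SYSTEM STATS' log line into a dict of its Key:Value fields.

    Hypothetical helper; returns None if the line is not a SYSTEM STATS entry.
    """
    marker = "SYSTEM STATS:"
    if marker not in line:
        return None
    fields = {}
    for key, value in re.findall(r"(\w+):(\S+)", line.split(marker, 1)[1]):
        # Numeric fields become float or int; anything else (e.g. Method) stays a string.
        try:
            fields[key] = float(value) if "." in value else int(value)
        except ValueError:
            fields[key] = value
    return fields

line = ("09/15/2009 11:06:54 AM - SYSTEM STATS: Time:54.9548 Method:spine "
        "Processes:1 Threads:6 Hosts:53 HostsPerProcess:53 DataSources:910 "
        "RRDsProcessed:553")
stats = parse_system_stats(line)
headroom = POLL_INTERVAL - stats["Time"]
print(f"poll took {stats['Time']:.1f}s, {headroom:.1f}s left in the interval")
```

Run against a whole cacti.log, the same function can show whether the poll time is creeping up or has always been this close to the limit.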

Is it normal that it takes so long? (I have only 57 hosts and 603 datasources.) Could it be due to my RRD configuration? I set it to keep 1-minute consolidation for the last 2 months and 5-minute consolidation for one year (without average).

Name              Steps  Rows    Timespan
2 months - 1 min  1      89280   5356800
1 year - 5 min    5      115200  33053184
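
As a quick sanity check on these RRA settings, the sketch below computes how much history each archive holds, assuming each row covers `steps` polling intervals of 60 seconds. The function name `rra_retention_seconds` is hypothetical, and the result is only the steps x rows product, not a statement about how Cacti uses the Timespan column.

```python
def rra_retention_seconds(steps, rows, poll_interval=60):
    """Seconds of history held by an RRA: each row covers `steps` polling intervals."""
    return steps * rows * poll_interval

# (name, steps, rows) taken from the table above.
rras = [("2 months - 1 min", 1, 89280), ("1 year - 5 min", 5, 115200)]
for name, steps, rows in rras:
    seconds = rra_retention_seconds(steps, rows)
    print(f"{name}: {seconds} s = {seconds / 86400:.0f} days")
```

For the first RRA the product equals the listed Timespan (5356800 seconds, 62 days). For the second it comes to 34560000 seconds (400 days), which differs from the listed 33053184.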


Any recommendations or advice on tuning my settings? Let me know if you need more information to help me with this issue. FYI, it's a production server, so I cannot experiment much with the settings, because I want to avoid data loss as far as possible.

My poller settings are attached.

And my server info:
Operating System: Windows 2003 server
Webserver: Apache 2.2.11 (win32)
Cacti: 0.8.7b
Spine: 0.8.7a
MySQL: 2.11.4
PHP: 5.2.5
RRDTool: 1.2.26
Net-SNMP: 5.4.2.1
Attachments
spine.JPG (127.03 KiB)
Last edited by Tibo on Thu Feb 04, 2010 5:57 am, edited 1 time in total.
mcutting
Cacti Guru User
Posts: 1884
Joined: Mon Oct 16, 2006 5:57 am
Location: United Kingdom

Post by mcutting »

Try increasing the number of processes to twice the number of CPU cores.
Cacti Version 0.8.8b
Cacti OS Ubuntu LTS
RRDTool Version RRDTool 1.4.7
Poller Information
Type SPINE 0.8.8b
engeishi
Cacti User
Posts: 75
Joined: Sun Aug 23, 2009 12:03 pm
Location: Tokyo, Japan

Post by engeishi »

And also try increasing Maximum Threads per Process to 10-15.
This document may help you.
http://docs.cacti.net/manual:087:3a_adv ... pine#spine
Tibo
Posts: 25
Joined: Mon Jul 06, 2009 9:21 am

Post by Tibo »

I ran some tests and I don't see any improvement at all. The time for one polling cycle remains the same.

Original setting: 1 process for 6 threads

Tested with (processes for threads):
2 for 6
1 for 10
2 for 15
4 for 15
4 for 10
Tibo
Posts: 25
Joined: Mon Jul 06, 2009 9:21 am

Post by Tibo »

Hello again. I ran a lot of tests this morning, changing the poller settings, and it's not any faster. In the best cases I get the same time, but in some cases I lose roughly 2 or 3 seconds.

I don't really understand why Spine cannot be faster, as my environment is not very large: 53 hosts with 910 datasources and 553 RRDs processed.

For your information, it's a good server based on an Intel Xeon 3.4 GHz (4 CPUs) with 4 GB of RAM, and the OS is Windows Server 2003 (Standard Edition with SP2).
engeishi
Cacti User
Posts: 75
Joined: Sun Aug 23, 2009 12:03 pm
Location: Tokyo, Japan

Post by engeishi »

Is it possible that several of your devices are having problems returning data?
As you say, I also think your environment is not too large for Spine.

To pinpoint the cause, turn the cacti.log level to DEBUG for one polling cycle,
and check how much time it takes to collect information from each device.
Tibo
Posts: 25
Joined: Mon Jul 06, 2009 9:21 am

Post by Tibo »

Is it normal that there is a gap of around 15 seconds between the Spine poller time and the system stats time?

09/18/2009 09:13:57 AM - SYSTEM STATS: Time:57.6000 Method:spine Processes:1 Threads:6 Hosts:53 HostsPerProcess:53 DataSources:912 RRDsProcessed:554

09/18/2009 09:13:41 AM - SPINE: Poller[0] Time: 41.3430 s, Threads: 6, Hosts: 53
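
To put a number on that gap, here is a small Python sketch that pulls the two Time values out of the log lines above and subtracts them. The function name `poller_gap` and the regular expressions are assumptions based only on the two line formats shown in this post.

```python
import re

def poller_gap(stats_line, spine_line):
    """Return (total, spine, gap) in seconds from a SYSTEM STATS and a SPINE log line."""
    total = float(re.search(r"SYSTEM STATS: Time:([\d.]+)", stats_line).group(1))
    spine = float(re.search(r"SPINE: Poller\[\d+\] Time: ([\d.]+) s", spine_line).group(1))
    return total, spine, total - spine

stats_line = ("09/18/2009 09:13:57 AM - SYSTEM STATS: Time:57.6000 Method:spine "
              "Processes:1 Threads:6 Hosts:53 HostsPerProcess:53 DataSources:912 "
              "RRDsProcessed:554")
spine_line = "09/18/2009 09:13:41 AM - SPINE: Poller[0] Time: 41.3430 s, Threads: 6, Hosts: 53"

total, spine, gap = poller_gap(stats_line, spine_line)
print(f"total {total:.1f}s, spine {spine:.1f}s, outside spine {gap:.1f}s")
```

The gap is the part of the cycle spent outside Spine's own data collection, so it is the number to watch when changing anything other than the process and thread counts.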
mcutting
Cacti Guru User
Posts: 1884
Joined: Mon Oct 16, 2006 5:57 am
Location: United Kingdom

Post by mcutting »

Tibo wrote: Hello again. I ran a lot of tests this morning, changing the poller settings, and it's not any faster. In the best cases I get the same time, but in some cases I lose roughly 2 or 3 seconds.

I don't really understand why Spine cannot be faster, as my environment is not very large: 53 hosts with 910 datasources and 553 RRDs processed.

For your information, it's a good server based on an Intel Xeon 3.4 GHz (4 CPUs) with 4 GB of RAM, and the OS is Windows Server 2003 (Standard Edition with SP2).

You really should look at what you are polling. Spine excels at SNMP queries, but scripts such as Perl and WMI will slow it down, particularly if you are querying a server in the Far East while your Cacti box sits in the US (for example).

Have a look at your queries to see if you can get rid of what you don't really need. Another tool to look at is "pollperf" from Gandalf (search the forums) - this will tell you the runtime of each datasource and host during polling.
mcutting
Cacti Guru User
Posts: 1884
Joined: Mon Oct 16, 2006 5:57 am
Location: United Kingdom

Post by mcutting »

Tibo wrote: Is it normal that there is a gap of around 15 seconds between the Spine poller time and the system stats time?

09/18/2009 09:13:57 AM - SYSTEM STATS: Time:57.6000 Method:spine Processes:1 Threads:6 Hosts:53 HostsPerProcess:53 DataSources:912 RRDsProcessed:554

09/18/2009 09:13:41 AM - SPINE: Poller[0] Time: 41.3430 s, Threads: 6, Hosts: 53

That depends on which plugins run after the poller has finished collecting data. One way to check would be to disable all plugins and let the poller have the limelight: the time you get with no plugins is the time it takes for your poller to run.
Tibo
Posts: 25
Joined: Mon Jul 06, 2009 9:21 am

Post by Tibo »

As for plugins, I only have Monitor, Thold and Weathermap, but I will try disabling them and check how long polling takes afterwards.
Tibo
Posts: 25
Joined: Mon Jul 06, 2009 9:21 am

Post by Tibo »

OK, so I disabled them and nothing changed at all... I am puzzled now... :(
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

Set the log level to MEDIUM for one pass, and post your Cacti Log.

TheWitness
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages


For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?
Tibo
Posts: 25
Joined: Mon Jul 06, 2009 9:21 am

Post by Tibo »

Attached is the log for one pass at MEDIUM level (I just removed the IPs for security reasons :wink: ).

Let me know if you find something interesting.

Surprisingly, there is a gap of 15 seconds between the poller time and the system time for one pass. I tried disabling all my plugins, as I reported before, but it didn't change anything.
Attachments
cacti_log.txt: Cacti logs (132.36 KiB)
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

Your issue might be I/O wait. Your spine finishes at 37 seconds, but the poller does not finish until just under the bell. So, there is quite a bit of processing delay.

Here is your plan of attack:

Step 0

Increase Spine Threads to 20. However, prior to doing that, increase "max_connections" to 200 in my.cnf and restart mysql.

Step 1

Make this change

Code:

mysql cacti
alter table poller_output modify column output varchar(60) not null default '', engine=memory;
quit;

Step 2

Send the output of the following:

Code:

mysql
show global variables;
show global status;

Step 3

Send the output of the following:

Code:

du -hs /var/www/html/cacti/rra

Step 4

Send the output of the following:

Code:

df -h /var/www/html/cacti/rra
df -h

Step 5

Send the following information:
System memory, number of CPU cores, number of physical volumes, and whether the server is a VM.

Tibo
Posts: 25
Joined: Mon Jul 06, 2009 9:21 am

Post by Tibo »

Well, I installed everything from scratch on another server with the latest versions of Cacti, Spine (and so on). With almost the same number of devices and datasources, my polling now takes 8.5 seconds with the current polling settings...

So there is definitely something weird on my first server. I will rebuild it from scratch like my second one, and that should fix the issue for sure :D

Thanks for your help!