Submit Your CMD.PHP vs. SPINE Metrics Here

Important information about Cacti developments that all users should be interested in.

Moderators: Developers, Moderators

Frizz
Cacti User
Posts: 80
Joined: Sat Mar 05, 2005 5:07 pm
Location: Herne Germany

Post by Frizz »

copo wrote:I'm trying to tune the performance on my box, and I have tried some of the changes to reduce the processing time of each polling cycle. Here is my config:

IBM x346
2 x XeonMP 3G
2 x SATA software mirror
1G RAM (another 3G is coming)
Fedora 7
Cacti 0.8.6j
Plugin Architecture 1.1
RRDTool 1.2.23

Thanks.
Hi copo,
Exciting! Did you only swap the RRDtool path within Cacti (Settings/Paths etc.)?
Or have you also upgraded to Boost?
Have you kept your RRDtool libraries unchanged, or did you recompile them (which versions)?
RRDtool update processing is the real bottleneck in poller.php. With cactid the SNMP polling is well threaded, but nearly two-thirds of the polling time is spent on RRDtool updates (in mostly SNMP-driven environments, no cmd scripts etc.).
Your metrics raise my hope for really large implementations with more than 100k data sources or 50k RRDs.
Keep the community posted if you extend your metrics.
Best regards
Frizz
Cacti 0.8.6j | Cactid 0.8.6j | RRDtool 1.2.23 |
SuSe 9.x | PHP 4.4.4 | MySQL 5.0.27 | IHS 2.0.42.1
Come and join the 3.CCC.eu
http://forums.cacti.net/viewtopic.php?t=27908
copo
Posts: 11
Joined: Mon Oct 09, 2006 1:35 am

Post by copo »

Hi Frizz,

What I did was download the rrdtool 1.3 beta, compile it, and install it to "/usr/local/rrdtool-1.2.99907080300/". Then I changed the RRDTool path in the Cacti settings so that it uses the beta rrdtool to update the RRAs. The processing time for each cycle dropped to about one-fifth right away. I think the update syntax is similar for the stable release and the beta, but the way the RRA is physically updated on disk seems to have improved. When I look at the load history with the "sar" command, iowait has dropped by more than half since changing to RRDTool 1.3 beta.

1.2.23

09:30:02 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle
09:40:02 AM     all      4.51      0.00      0.91     14.79      0.00     79.80
09:50:01 AM     all      4.58      0.05      0.90     14.47      0.00     80.00
10:00:01 AM     all      4.54      0.00      0.87     13.43      0.00     81.16
10:10:02 AM     all      4.57      0.00      0.92     16.46      0.00     78.04
10:20:01 AM     all      4.59      0.00      0.89     13.06      0.00     81.46
10:30:01 AM     all      4.55      0.00      0.87     13.23      0.00     81.35
10:40:02 AM     all      4.59      0.00      0.91     14.80      0.00     79.70
10:50:02 AM     all      4.68      0.05      0.93     14.17      0.00     80.17
11:00:02 AM     all      5.32      0.00      0.98     13.55      0.00     80.15
11:10:01 AM     all      5.79      0.00      1.04     14.95      0.00     78.22
11:20:01 AM     all      4.68      0.00      0.89     13.49      0.00     80.94
11:30:01 AM     all      5.52      0.00      0.93     13.40      0.00     80.15
11:40:01 AM     all      6.32      0.00      0.98     14.82      0.00     77.88

After the change to 1.3 beta

11:50:01 AM     all      6.73      0.05      1.05     14.00      0.00     78.17
12:00:01 PM     all      5.44      0.00      0.93     12.90      0.00     80.73
12:10:02 PM     all      8.11      0.00      1.81     17.41      0.00     72.67
12:20:01 PM     all      7.17      0.00      0.95      5.51      0.00     86.37
12:30:01 PM     all      5.79      0.00      0.79      4.13      0.00     89.28
12:40:01 PM     all      7.13      0.00      0.96      6.76      0.00     85.15
12:50:01 PM     all      7.62      0.05      1.02      4.48      0.00     86.83
01:00:01 PM     all      7.48      0.00      0.95      4.15      0.00     87.42
01:10:01 PM     all      5.71      0.00      0.82      5.45      0.00     88.02
01:20:01 PM     all      5.71      0.00      0.79      4.10      0.00     89.40
01:30:01 PM     all      5.70      0.00      0.79      3.77      0.00     89.75
01:40:01 PM     all      5.72      0.00      0.82      5.09      0.00     88.36
01:50:01 PM     all      5.63      0.05      0.79      3.94      0.00     89.60
02:00:01 PM     all      5.70      0.00      0.78      4.08      0.00     89.45
02:10:01 PM     all      5.56      0.00      0.85      8.90      0.00     84.69
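
The two sar listings above can be compared numerically with a short Python sketch. It assumes the switch to the 1.3 beta took effect around 12:20 PM, because the first three rows of the second listing still show the old iowait level; that cut-off is an interpretation of the data, not something copo stated. The function names are made up for this example.

```python
# Rows copied from the first sar listing (RRDTool 1.2.23).
SAR_1_2_23 = """\
09:40:02 AM     all      4.51      0.00      0.91     14.79      0.00     79.80
09:50:01 AM     all      4.58      0.05      0.90     14.47      0.00     80.00
10:00:01 AM     all      4.54      0.00      0.87     13.43      0.00     81.16
10:10:02 AM     all      4.57      0.00      0.92     16.46      0.00     78.04
10:20:01 AM     all      4.59      0.00      0.89     13.06      0.00     81.46
10:30:01 AM     all      4.55      0.00      0.87     13.23      0.00     81.35
10:40:02 AM     all      4.59      0.00      0.91     14.80      0.00     79.70
10:50:02 AM     all      4.68      0.05      0.93     14.17      0.00     80.17
11:00:02 AM     all      5.32      0.00      0.98     13.55      0.00     80.15
11:10:01 AM     all      5.79      0.00      1.04     14.95      0.00     78.22
11:20:01 AM     all      4.68      0.00      0.89     13.49      0.00     80.94
11:30:01 AM     all      5.52      0.00      0.93     13.40      0.00     80.15
11:40:01 AM     all      6.32      0.00      0.98     14.82      0.00     77.88
"""

# Rows from the second listing, starting at 12:20 PM (assumed cut-off).
SAR_1_3_BETA = """\
12:20:01 PM     all      7.17      0.00      0.95      5.51      0.00     86.37
12:30:01 PM     all      5.79      0.00      0.79      4.13      0.00     89.28
12:40:01 PM     all      7.13      0.00      0.96      6.76      0.00     85.15
12:50:01 PM     all      7.62      0.05      1.02      4.48      0.00     86.83
01:00:01 PM     all      7.48      0.00      0.95      4.15      0.00     87.42
01:10:01 PM     all      5.71      0.00      0.82      5.45      0.00     88.02
01:20:01 PM     all      5.71      0.00      0.79      4.10      0.00     89.40
01:30:01 PM     all      5.70      0.00      0.79      3.77      0.00     89.75
01:40:01 PM     all      5.72      0.00      0.82      5.09      0.00     88.36
01:50:01 PM     all      5.63      0.05      0.79      3.94      0.00     89.60
02:00:01 PM     all      5.70      0.00      0.78      4.08      0.00     89.45
02:10:01 PM     all      5.56      0.00      0.85      8.90      0.00     84.69
"""


def iowait_values(sar_text):
    """Return the %iowait column (7th whitespace-separated field) of each row."""
    return [float(line.split()[6]) for line in sar_text.splitlines() if line.strip()]


def mean(values):
    return sum(values) / len(values)


before = mean(iowait_values(SAR_1_2_23))
after = mean(iowait_values(SAR_1_3_BETA))
print(f"mean %iowait with 1.2.23:   {before:.2f}")
print(f"mean %iowait with 1.3 beta: {after:.2f}")
print(f"relative reduction:         {1 - after / before:.0%}")
```

With that cut-off, mean iowait falls from roughly 14% to roughly 5%, which is consistent with the "more than half" described in the post.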
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

copo wrote:When I look at the load history with the "sar" command, iowait has dropped by more than half since changing to RRDTool 1.3 beta.
RRDTool now uses fadvise to keep data it will never read again out of the read cache.
Reinhard
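
To illustrate the kind of hint gandalf refers to: Python exposes the same POSIX call as `os.posix_fadvise`. The sketch below is not RRDtool's code. The function name `read_once` and the choice of the `POSIX_FADV_RANDOM` and `POSIX_FADV_DONTNEED` flags are assumptions made for the example, and the call is skipped on platforms that lack it.

```python
import os
import tempfile


def read_once(path, chunk_size=65536):
    """Read a file once while hinting the kernel not to keep it cached."""
    has_fadvise = hasattr(os, "posix_fadvise")  # not available on Windows
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        if has_fadvise:
            # Access will be scattered, so aggressive read-ahead is wasted.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            total += len(chunk)
        if has_fadvise:
            # These pages will not be needed again; the cache may drop them.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return total


# Small demonstration on a throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 1000)
bytes_read = read_once(tmp.name)
os.remove(tmp.name)
print(bytes_read)
```

The hints are advisory only: the kernel is free to ignore them, so any gain depends on the platform and on how much memory is available for the page cache.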
DLNoah
Cacti User
Posts: 119
Joined: Wed Jun 20, 2007 11:27 pm

Post by DLNoah »

System:
P4 HT 2.6GHz
1GB RAM
Win Server 2003 SP2
Cacti 0.8.6j from the March version of BSOD's installer
PHP 5.2.3

Plugins: Thold, Uptime, Discovery (disabled)

09/17/2007 09:17:04 PM - SYSTEM STATS: Time:123.5622 Method:cactid Processes:2 Threads:8 Hosts:113 HostsPerProcess:57 DataSources:3511 RRDsProcessed:1856

Only a few scripts here and there, mostly SNMP queries. I also have 40-odd hosts that recache on pretty much every run because the indexes for what I'm trying to get are a little funky. Stats are as follows:

09/17/2007 09:18:01 PM - RECACHE STATS: RecacheTime:52.2566 HostsRecached:41

Any suggestions as to how to optimize? This server is used pretty much exclusively for cacti monitoring and for techs to remote in and troubleshoot the network as needed.

Edit:

Hrm, after I post, the next two poller runs look like

09/17/2007 09:21:08 PM - SYSTEM STATS: Time:67.6637 Method:cactid Processes:2 Threads:8 Hosts:113 HostsPerProcess:57 DataSources:3511 RRDsProcessed:1856

09/17/2007 09:25:52 PM - SYSTEM STATS: Time:52.2409 Method:cactid Processes:2 Threads:8 Hosts:113 HostsPerProcess:57 DataSources:3510 RRDsProcessed:1855

Ain't technology grand? The recache is still running 50-60s. Is there anything that can help that, or would that be considered good for 40-45 hosts per run?
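
For anyone collecting these numbers over many runs, SYSTEM STATS lines like the ones above can be split into fields with a few lines of Python. This is a minimal sketch that assumes the `Key:Value` layout seen in this thread; the function names are made up for the example.

```python
import re

STATS_RE = re.compile(r"SYSTEM STATS: (?P<fields>.+)$")


def _convert(value):
    """Turn a field into int or float when possible, else keep the string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def parse_system_stats(line):
    """Return a dict of the Key:Value pairs in a SYSTEM STATS log line."""
    match = STATS_RE.search(line.strip())
    if match is None:
        return None
    stats = {}
    for token in match.group("fields").split():
        key, _, value = token.partition(":")
        stats[key] = _convert(value)
    return stats


line = (
    "09/17/2007 09:17:04 PM - SYSTEM STATS: Time:123.5622 Method:cactid "
    "Processes:2 Threads:8 Hosts:113 HostsPerProcess:57 DataSources:3511 "
    "RRDsProcessed:1856"
)
stats = parse_system_stats(line)
print(stats["Time"], stats["Method"], stats["DataSources"])
```

Feeding each log line through `parse_system_stats` and keeping the `Time` field makes it easy to see how much the poller runtime varies from run to run.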
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

What is your reindex method? Verify All Fields causes this.

TheWitness
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages


For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?
DLNoah
Cacti User
Posts: 119
Joined: Wed Jun 20, 2007 11:27 pm

Post by DLNoah »

Yes, one data query I'm running requires Verify All Fields in order to work properly (the device doesn't always restart SNMP reliably upon a reindex). Just curious if there was anything that could cut down on execution time of the recaching.
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

Linux for sure. Some day I might write that in C. Windows is abhorrently slow at forking php processes. Lots of overhead. This is the primary issue.

TheWitness
sh0x
Posts: 32
Joined: Thu Aug 30, 2007 6:12 pm
Location: California

Post by sh0x »

Hi, here are my cactid poller stats using cacti 0.8.6j and cactid v0.8.6i with rrdtool 1.2.23.

10/03/2007 10:38:39 AM - SYSTEM STATS: Time:13.9522 Method:cactid Processes:1 Threads:8 Hosts:46 HostsPerProcess:46 DataSources:3429 RRDsProcessed:1372
I have a question about a small delay during the polling cycle: partway through the poll, cactid seems to pause. So I enabled debug, and during the pause the log shows cacti doing the following for a bunch of data sources:

10/03/2007 10:17:30 AM - CMDPHP: Poller[0] DEBUG: SQL Exec: "delete from poller_output where local_data_id='2189' and rrd_name='errors_in' and time='2007-10-03 10:17:21'"
What is cacti doing at this point, and is there anything I can do to improve performance further? It seems to double my polling time.

Thanks,
sh0x
sh0x
Posts: 32
Joined: Thu Aug 30, 2007 6:12 pm
Location: California

Post by sh0x »

I think I see how it works: cactid stores the returned data in poller_output so the rrds can be updated, and then the rows are removed. The "delay" that I described is really the rrds being updated. Is that right?
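
The flow described here (write results to poller_output, update the rrds, delete the rows) can be sketched with an in-memory SQLite table. This is only an illustration, not Cacti's code: SQLite stands in for MySQL, the `output` column and the sample values are assumptions, and `update_rrd` is a hypothetical callback standing in for the real RRD update.

```python
import sqlite3


def drain_poller_output(conn, update_rrd):
    """Apply every pending row to its RRD, then delete the row."""
    rows = conn.execute(
        "SELECT local_data_id, rrd_name, time, output "
        "FROM poller_output ORDER BY time, rrd_name"
    ).fetchall()
    for local_data_id, rrd_name, time, output in rows:
        update_rrd(local_data_id, rrd_name, time, output)
        # Same shape as the DELETE statement seen in the debug log.
        conn.execute(
            "DELETE FROM poller_output "
            "WHERE local_data_id=? AND rrd_name=? AND time=?",
            (local_data_id, rrd_name, time),
        )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE poller_output "
    "(local_data_id INTEGER, rrd_name TEXT, time TEXT, output TEXT)"
)
# Sample rows; the output values are invented for the example.
conn.executemany(
    "INSERT INTO poller_output VALUES (?, ?, ?, ?)",
    [
        (2189, "errors_in", "2007-10-03 10:17:21", "0"),
        (2189, "errors_out", "2007-10-03 10:17:21", "3"),
    ],
)

updates = []
processed = drain_poller_output(conn, lambda *row: updates.append(row))
remaining = conn.execute("SELECT COUNT(*) FROM poller_output").fetchone()[0]
print(processed, remaining)
```

One DELETE per row, as in the debug log, means the database work grows with the number of data sources, which fits the pause showing up partway through the poll.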
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

Yes. The delete can be a bit of a load, as can the RRD updates, especially if you run out of disk cache. Solutions: 1) more memory, 2) faster disk, 3) RRDtool 1.3, 4) Boost.

TheWitness
sh0x
Posts: 32
Joined: Thu Aug 30, 2007 6:12 pm
Location: California

Post by sh0x »

It worked. :)

I run cacti in a virtual machine, so I moved it to a more robust hardware platform and the cactid poller times dropped by more than half: from ~14s before the upgrade to ~6s afterwards.

Before
10/04/2007 11:40:17 AM - SYSTEM STATS: Time:14.3754 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357

After
10/04/2007 01:04:35 PM - SYSTEM STATS: Time:6.0754 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357

Graph generation is way faster too. I still have the same 1G of memory, but now I have 2 CPUs instead of 1, and faster disks. All of my devices are routers polled across WAN links from T1 to OC3, and all links have <20ms round-trip latency. Next I'll try boost, some memory tweaks and rrdtool 1.3, but at this point I'll wait for the next official cacti/cactid release first. Thanks, cacti is great and keeps getting better!

sh0x
-
cacti 0.8.6j - cactid v0.8.6i - rrdtool 1.2.23 - php 5.1.6 - mysql 5.0.22 - snmp 5.3.1 w/mibs
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

You're running fairly few hosts but a significant data source load. So I suppose you're using SNMPv2 and snmpbulkwalk, am I correct? Which numbers do you use with bulkwalk?
Reinhard
sh0x
Posts: 32
Joined: Thu Aug 30, 2007 6:12 pm
Location: California

Post by sh0x »

Yes, I use snmpv2. I thought cacti would automatically use snmpbulkwalk when using snmp v2 with walk, is that correct? Currently I walk the following tables on all routers. Many of these routers have 4 T1s between two carriers, configured as 2xT1 multi-link frame-relay bundle pairs. Each bundle has two virtual circuits connecting back to two active data centers. After tracking throughput, errors, drops, cpu, mem, temp, and layer1-2 stats, I have ~3400 data sources.

IF-MIB - ifTable (traffic, errors, drops)
RFC1315-MIB - frCircuitTable (fecn, becn, de, cir)
RFC1406-MIB - dsx1CurrentTable (lcv, pcv, uas, es)
RFC1407-MIB - dsx3CurrentTable (lcv, pcv, uas, es)
CISCO-MEMORY-MIB - ciscoMemoryPoolTable (used, free, largest free)
CISCO-ENVMON-MIB - ciscoEnvMonMIB (temp - cpu, internal, exhaust, ambient, threshold)

The only 'get' I have is for CISCO-PROCESS-MIB (5s, 1m, and 5m CPU). I was using an snmp query but I went back to a graph template since I only wanted to graph the 1st index, but I may change this.
Last edited by sh0x on Sat Oct 06, 2007 12:30 pm, edited 1 time in total.
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

sh0x wrote:Yes, I use snmpv2. I thought cacti would automatically use snmpbulkwalk when using snmp v2 with walk, is that correct?
Sure. But the Maximum OID get size can be changed under Settings->Poller. You may increase this number, with caution. Keep in mind the maximum given in the explanation of that field.
Personally, I'm not running such a huge installation and most of my stuff is still V1, so I do not have any experience with this setting.
Reinhard
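
As a rough picture of what such a limit does, the sketch below splits a list of OIDs into batches of at most N per request. It assumes the setting caps how many OIDs are packed into a single get request; it is not the poller's actual code, and the function name and the twelve interface indexes are made up for the example.

```python
def batch_oids(oids, max_oids_per_get):
    """Split a list of OIDs into consecutive batches of at most N entries."""
    if max_oids_per_get < 1:
        raise ValueError("max_oids_per_get must be at least 1")
    return [
        oids[start:start + max_oids_per_get]
        for start in range(0, len(oids), max_oids_per_get)
    ]


# ifInOctets (1.3.6.1.2.1.2.2.1.10) for twelve hypothetical interface indexes.
oids = [f".1.3.6.1.2.1.2.2.1.10.{index}" for index in range(1, 13)]

batches = batch_oids(oids, 5)
for number, batch in enumerate(batches, start=1):
    print(f"request {number}: {len(batch)} OIDs")
```

A larger batch size means fewer round trips per host, at the cost of larger responses, which is why the field comes with a stated maximum.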
sh0x
Posts: 32
Joined: Thu Aug 30, 2007 6:12 pm
Location: California

Post by sh0x »

Oh, those numbers. I had set it to the max, 60 (all new routers). Here are the results with various settings. I think I'll set this to 5, but let me know if you have a suggestion.

1 OID MAX
10/06/2007 10:00:43 AM - SYSTEM STATS: Time:7.9084 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357
10/06/2007 10:01:06 AM - SYSTEM STATS: Time:7.6239 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357

2 OID MAX
10/06/2007 10:23:33 AM - SYSTEM STATS: Time:7.6761 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357
10/06/2007 10:23:46 AM - SYSTEM STATS: Time:7.5392 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357

3 OID MAX
10/06/2007 10:23:58 AM - SYSTEM STATS: Time:7.6601 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357
10/06/2007 10:24:12 AM - SYSTEM STATS: Time:7.6211 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357

4 OID MAX
10/06/2007 10:24:22 AM - SYSTEM STATS: Time:6.4277 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357
10/06/2007 10:24:33 AM - SYSTEM STATS: Time:6.3760 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357

5 OID MAX
10/06/2007 10:01:30 AM - SYSTEM STATS: Time:6.4671 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357
10/06/2007 10:01:41 AM - SYSTEM STATS: Time:6.4588 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357

10 OID MAX
10/06/2007 10:02:12 AM - SYSTEM STATS: Time:6.5187 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357
10/06/2007 10:02:21 AM - SYSTEM STATS: Time:6.4481 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357

20 OID MAX
10/06/2007 10:02:52 AM - SYSTEM STATS: Time:6.3540 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357
10/06/2007 10:03:01 AM - SYSTEM STATS: Time:6.5597 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357

40 OID MAX
10/06/2007 10:03:35 AM - SYSTEM STATS: Time:6.5562 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357
10/06/2007 10:03:46 AM - SYSTEM STATS: Time:6.3293 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357

60 OID MAX
10/06/2007 10:04:13 AM - SYSTEM STATS: Time:6.6726 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357
10/06/2007 10:04:22 AM - SYSTEM STATS: Time:6.2634 Method:cactid Processes:2 Threads:8 Hosts:46 HostsPerProcess:23 DataSources:3383 RRDsProcessed:1357
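
The runs above can be averaged per setting with a short Python sketch. The timings are copied from the SYSTEM STATS lines in this post; with only two samples per setting, small differences are within noise, so treat the result as a rough guide.

```python
# Poller runtimes in seconds, keyed by the Maximum OID setting used.
RUNS = {
    1: [7.9084, 7.6239],
    2: [7.6761, 7.5392],
    3: [7.6601, 7.6211],
    4: [6.4277, 6.3760],
    5: [6.4671, 6.4588],
    10: [6.5187, 6.4481],
    20: [6.3540, 6.5597],
    40: [6.5562, 6.3293],
    60: [6.6726, 6.2634],
}

# Mean runtime for each setting.
means = {setting: sum(times) / len(times) for setting, times in RUNS.items()}

for setting, value in means.items():
    print(f"{setting:>2} OID max: {value:.4f}s")

# Setting with the lowest mean runtime in this sample.
fastest = min(means, key=means.get)
print(f"lowest mean: {fastest} OID max")
```

The averages fall into two groups: settings 1 to 3 at about 7.6 to 7.8s, and settings 4 and above at about 6.4 to 6.5s. Beyond 4 there is no clear further gain in this data, so a value of 5 looks like a reasonable choice.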