cactid - weird performance issues

jennam
Posts: 8
Joined: Wed Jul 12, 2006 10:13 pm
Location: Adelaide, Australia

Post by jennam »

Hi,

I've got a reasonably large cactid setup here, and I'm having major problems in terms of performance. I'm currently running on cacti 0.8.6h, and cactid 0.8.6g (though oddly enough it seems to claim it's 0.8.6f unless you autoconf it manually). This is on a RHEL4 system, dual PIII-1ghz, 1GB ram. GCC is v3.4.5. I've tried what I can to play with buffers and other settings under MySQL without any real help to performance.

As is required at this level, I've updated php.ini with:

memory_limit = 512M

And the system seems to have plenty of RAM:

Mem: 1164720k total, 1135348k used, 29372k free, 15712k buffers
Swap: 2031608k total, 160k used, 2031448k free, 823568k cached

It doesn't seem to hit swap at all while cactid is running. CPU doesn't seem to be much of a bottleneck either; the load average rarely goes above 1-2.

Anyway, the poller quite often runs longer than 300 seconds and is generally just failing to update/graph many things. I'm seeing huge gaps on the order of many hours, and some devices are only managing to pick up one or two 'spots' on the graphs over a 24 hour period. Very worrisome. The devices added earlier seem to have fewer problems, but the ones added later (with a higher ID) seem to have more issues.

Statistics:

07/13/2006 12:45:19 PM - CACTID: Poller[0] Time: 16.6815 s, Threads: 20, Hosts: 71
07/13/2006 12:45:21 PM - CACTID: Poller[0] Time: 15.9549 s, Threads: 20, Hosts: 70
07/13/2006 12:45:24 PM - CACTID: Poller[0] Time: 20.6529 s, Threads: 20, Hosts: 72
07/13/2006 12:50:24 PM - CACTID: Poller[0] Time: 16.3340 s, Threads: 20, Hosts: 70
07/13/2006 12:50:24 PM - CACTID: Poller[0] Time: 20.0509 s, Threads: 20, Hosts: 71
07/13/2006 12:50:25 PM - SYSTEM STATS: Time:322.7450 Method:cactid Processes:3 Threads:20 Hosts:211 HostsPerProcess:71 DataSources:11601 RRDsProcessed:6546
07/13/2006 12:50:26 PM - CACTID: Poller[0] Time: 20.4675 s, Threads: 20, Hosts: 72
07/13/2006 12:51:12 PM - SYSTEM STATS: Time:67.4201 Method:cactid Processes:3 Threads:20 Hosts:211 HostsPerProcess:71 DataSources:11601 RRDsProcessed:1262

07/13/2006 12:55:19 PM - CACTID: Poller[0] Time: 17.6446 s, Threads: 20, Hosts: 71
07/13/2006 12:55:20 PM - CACTID: Poller[0] Time: 17.2423 s, Threads: 20, Hosts: 72
07/13/2006 12:55:22 PM - CACTID: Poller[0] Time: 18.1822 s, Threads: 20, Hosts: 70
07/13/2006 12:58:33 PM - SYSTEM STATS: Time:211.1576 Method:cactid Processes:3 Threads:20 Hosts:211 HostsPerProcess:71 DataSources:11601 RRDsProcessed:6546

07/13/2006 01:00:22 PM - CACTID: Poller[0] Time: 19.0130 s, Threads: 20, Hosts: 71
07/13/2006 01:00:24 PM - CACTID: Poller[0] Time: 20.3751 s, Threads: 20, Hosts: 72
07/13/2006 01:00:25 PM - CACTID: Poller[0] Time: 20.8446 s, Threads: 20, Hosts: 70
07/13/2006 01:05:19 PM - CACTID: Poller[0] Time: 14.9615 s, Threads: 20, Hosts: 71
07/13/2006 01:05:24 PM - CACTID: Poller[0] Time: 18.1758 s, Threads: 20, Hosts: 70
07/13/2006 01:05:24 PM - CACTID: Poller[0] Time: 19.6094 s, Threads: 20, Hosts: 72
07/13/2006 01:05:53 PM - SYSTEM STATS: Time:351.1019 Method:cactid Processes:3 Threads:20 Hosts:211 HostsPerProcess:71 DataSources:11601 RRDsProcessed:6546
07/13/2006 01:06:15 PM - SYSTEM STATS: Time:71.1151 Method:cactid Processes:3 Threads:20 Hosts:211 HostsPerProcess:71 DataSources:11601 RRDsProcessed:1123

The thing that I find most 'odd' is that it says the pollers are exiting/finishing in quite good times, but then the poller.log itself for cactid shows it's still processing for much longer, and it doesn't seem to finish for a -loooong- time after.

Does anyone have any magical ideas what I'm doing wrong, where the bottleneck might be or some setting I can do with cactid to find out what on earth it's doing to take so long?
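As a rough way to check whether poll cycles are overrunning and overlapping, the SYSTEM STATS lines above can be parsed with a short script. This is only a sketch in Python: the regular expression is written against the log format shown in this post, and the names `parse_stats` and `POLL_INTERVAL` are made up for illustration and are not part of Cacti. It treats the timestamp as the end of a cycle and subtracts `Time:` to estimate when that cycle started, which is an assumption about how the line is logged.

```python
import re
from datetime import datetime, timedelta

# SYSTEM STATS lines copied from the log excerpt in this post.
LOG = """\
07/13/2006 12:50:25 PM - SYSTEM STATS: Time:322.7450 Method:cactid Processes:3 Threads:20 Hosts:211 HostsPerProcess:71 DataSources:11601 RRDsProcessed:6546
07/13/2006 12:51:12 PM - SYSTEM STATS: Time:67.4201 Method:cactid Processes:3 Threads:20 Hosts:211 HostsPerProcess:71 DataSources:11601 RRDsProcessed:1262
07/13/2006 12:58:33 PM - SYSTEM STATS: Time:211.1576 Method:cactid Processes:3 Threads:20 Hosts:211 HostsPerProcess:71 DataSources:11601 RRDsProcessed:6546
07/13/2006 01:05:53 PM - SYSTEM STATS: Time:351.1019 Method:cactid Processes:3 Threads:20 Hosts:211 HostsPerProcess:71 DataSources:11601 RRDsProcessed:6546
07/13/2006 01:06:15 PM - SYSTEM STATS: Time:71.1151 Method:cactid Processes:3 Threads:20 Hosts:211 HostsPerProcess:71 DataSources:11601 RRDsProcessed:1123
"""

POLL_INTERVAL = 300.0  # seconds; the cron job fires every 5 minutes

STATS_RE = re.compile(
    r"^(?P<stamp>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2} [AP]M) - SYSTEM STATS: "
    r"Time:(?P<time>[\d.]+) .*RRDsProcessed:(?P<rrds>\d+)"
)


def parse_stats(text):
    """Return one dict per SYSTEM STATS line, in log order."""
    runs = []
    for line in text.splitlines():
        m = STATS_RE.match(line)
        if not m:
            continue
        end = datetime.strptime(m.group("stamp"), "%m/%d/%Y %I:%M:%S %p")
        duration = float(m.group("time"))
        runs.append({
            # Assumption: the timestamp marks the end of the cycle.
            "start": end - timedelta(seconds=duration),
            "end": end,
            "duration": duration,
            "rrds": int(m.group("rrds")),
            "overrun": duration > POLL_INTERVAL,
        })
    return runs


runs = parse_stats(LOG)

# A cycle overlaps if it started before the previous one had finished.
overlaps = [b for a, b in zip(runs, runs[1:]) if b["start"] < a["end"]]

for r in runs:
    flags = []
    if r["overrun"]:
        flags.append("OVERRUN")
    if r in overlaps:
        flags.append("OVERLAPS PREVIOUS")
    print(f'{r["start"]:%H:%M:%S} -> {r["end"]:%H:%M:%S} '
          f'{r["duration"]:7.1f}s rrds={r["rrds"]} {" ".join(flags)}')
```

On the excerpt above, this flags the 322 s and 351 s cycles as overruns, and the shorter cycles that follow each of them as having started while the previous one was still running.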
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

Well,

It looks like Cacti is stepping all over itself. You should halt polling for a short time to clear out all the cactid poller processes, then restart polling. Just do a "ps -ef | grep cactid" and kill those processes. Also kill the "php poller.php" processes.

Then, I am planning on rolling a sticky for users who want to reduce load average, and that is to make the "poller_output" table a memory storage engine table. If you have MySQL Query Browser, you can simply edit the table, go to the storage engine tab and select the memory storage engine. Then watch in top as your system load during polling drops dramatically.

TheWitness
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages


For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?
jennam
Posts: 8
Joined: Wed Jul 12, 2006 10:13 pm
Location: Adelaide, Australia

Post by jennam »

I've tried killing the poller processes previously, and they just come back and end up stepping all over each other again later.

It's a fairly frustrating situation. I've tried anything from playing with the threads, processes, etc settings, but nothing seems to resolve the issue.

I can't seem to understand what cactid is doing after it appears to be finished polling. Is it actually finished, or is it continuing to receive data? Are you able to clarify for me what's going on? Based on what you're saying with regards to 'poller_output', I'm guessing you're suggesting that I'm hitting a disk-level bottleneck with MySQL when storing the poller data and subsequently pushing it into the RRDs?

With regards to the "poller_output" table, am I able to do that through the CLI mysql tool? Unfortunately I don't have/use any other tools to talk to MySQL.
jennam
Posts: 8
Joined: Wed Jul 12, 2006 10:13 pm
Location: Adelaide, Australia

Post by jennam »

Okay, this is interesting.

Crontab is running the poller using:

*/5 * * * * cactiuser /usr/bin/php /var/www/html/cacti/poller.php >/var/www/html/cacti/log/poller.log 2>&1

from the /etc/cron.d/cacti

The log of poller.log starts with:

Content-type: text/html^M
X-Powered-By: PHP/4.3.9^M
^M
Waiting on 3/3 pollers.

...
And then it just continues, and continues, and continues whilst displaying:

OK u:0.73 s:5.02 r:313.26
OK u:0.73 s:5.02 r:313.38
OK u:0.73 s:5.02 r:313.47
OK u:0.73 s:5.02 r:315.18

Until it eventually ends and finishes. The strange thing is, I'm sure I remember that the poller is supposed to bail out if it exceeds 300 seconds, but it doesn't seem to be doing this anymore. (Any idea where the setting for that is?)
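The `OK u:... s:... r:...` lines look like the status lines that rrdtool prints after each command when it is driven through a pipe, where `u`, `s` and `r` would be user, system and real (wall clock) seconds for the rrdtool process. Assuming that reading is right, a small Python sketch can compare CPU time with elapsed time; the names `OK_RE` and `parse_ok` are hypothetical helpers, not part of Cacti or rrdtool.

```python
import re

# Status lines copied from the poller.log excerpt in this post.
LINES = """\
OK u:0.73 s:5.02 r:313.26
OK u:0.73 s:5.02 r:313.38
OK u:0.73 s:5.02 r:313.47
OK u:0.73 s:5.02 r:315.18
"""

OK_RE = re.compile(r"^OK u:([\d.]+) s:([\d.]+) r:([\d.]+)$")


def parse_ok(text):
    """Return a list of (user, system, real) tuples, one per OK line."""
    samples = []
    for line in text.splitlines():
        m = OK_RE.match(line.strip())
        if m:
            samples.append(tuple(float(x) for x in m.groups()))
    return samples


samples = parse_ok(LINES)
user, system, real = samples[-1]
cpu = user + system

# Share of the elapsed time that the process actually spent on the CPU.
busy_percent = 100 * cpu / real
print(f"cpu {cpu:.2f}s of {real:.2f}s wall clock ({busy_percent:.1f}% busy)")
```

If the interpretation holds, a very low busy percentage would suggest that the process spends most of its time waiting (on disk I/O or on its input) rather than computing.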

Either way, during that time on ps -ef:

500 1466 1464 0 14:30 ? 00:00:00 /bin/sh -c /usr/bin/php /var/www/html/cacti/poller.php >/var/www/html/cacti/log/poller.log 2>&1
500 1467 1466 1 14:30 ? 00:00:04 /usr/bin/php /var/www/html/cacti/poller.php
root 2182 19773 0 14:37 pts/2 00:00:00 grep cacti

So there's no cactid processes that it's waiting on that I can see that's actually running; Does that mean the cactid polling is done, and it's just doing the poller.php from that point onwards? I thought I understood how cacti worked, but I'm starting to get confused how it all fits together now.

Jen
jennam
Posts: 8
Joined: Wed Jul 12, 2006 10:13 pm
Location: Adelaide, Australia

Post by jennam »

If I run cactid manually from the command-line, I see:

# time cactid --verbosity=3
*various polling*
.
.
CACTID: Time: 34.8633 s, Threads: 20, Hosts: 211

real 0m35.666s
user 0m1.616s
sys 0m1.506s

#

... Which suggests to me that cactid's actual polling is working correctly, and it's able to talk to the hosts in short order. So then am I correct in assuming that it's the poller.php that I'm having the bottleneck through, and the linking between MySQL polling table and the RRD files?

It occurs to me in retrospect that I do also have fairly large .rrd files for each datasource; Eg:

-rw-r--r-- 1 cactiuser cactiuser 1731848 Jul 13 14:42 host_traffic_in_5415.rrd

The RRA settings are:

Name                        Steps  Rows   Timespan
Daily (5 Minute Average)    1      10000  86400
Weekly (15 Minute Average)  3      15000  604800
Monthly (1 Hour Average)    12     10000  2678400
Yearly (1 Day Average)      288    1000   33053184

Perhaps this is also contributing to the problem?
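To put numbers on those RRA settings, here is a small Python sketch that works out how much history each RRA holds and roughly how many bytes of archive data that implies. It assumes a 300 second base step (the 5 minute cron interval) and relies on RRD files storing each archived value as an 8-byte double. The function names are invented for the example, and the header size and the number of data sources or consolidation functions in the real file are not known from this post.

```python
STEP_SECONDS = 300     # assumed base step: the cron job runs every 5 minutes
BYTES_PER_VALUE = 8    # each archived value is stored as an 8-byte double

# (name, steps, rows) copied from the RRA settings listed in this post.
RRAS = [
    ("Daily (5 Minute Average)", 1, 10000),
    ("Weekly (15 Minute Average)", 3, 15000),
    ("Monthly (1 Hour Average)", 12, 10000),
    ("Yearly (1 Day Average)", 288, 1000),
]


def retention_days(steps, rows):
    """Days of history held by one RRA: rows * steps * base step."""
    return steps * rows * STEP_SECONDS / 86400


def bytes_per_series(rras):
    """Archive bytes for ONE data source under ONE consolidation function."""
    return sum(rows for _name, _steps, rows in rras) * BYTES_PER_VALUE


for name, steps, rows in RRAS:
    print(f"{name:28s} keeps about {retention_days(steps, rows):7.1f} days")

per_series = bytes_per_series(RRAS)
observed = 1731848  # size of host_traffic_in_5415.rrd from the listing above
ratio = observed / per_series
print(f"{per_series} bytes per series, observed file is {ratio:.2f}x that")
```

The ratio comes out close to 6, which would be consistent with six data source and consolidation function combinations in the file (for example 2 data sources with 3 consolidation functions each), but that is a guess from the file size alone.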

Jenna
mvam
Cacti User
Posts: 87
Joined: Wed Jun 01, 2005 2:00 pm
Location: Seattle

Post by mvam »

the problem is that the memory storage type doesn't support BLOB/TEXT columns.

ALTER TABLE `poller_output` ENGINE = memory

#1163 - The used table type doesn't support BLOB/TEXT columns
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

Oh, sorry about that. If you are not returning very large results, change this field to "varchar(255)". Otherwise, if you are running MySQL >= 5.0.3, you can go "varchar(1024)" for example.
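For anyone doing this from the mysql command line rather than a GUI, the change could be written as two ALTER TABLE statements. The Python sketch below only builds the statement text; it does not connect to a database. It assumes the TEXT column in `poller_output` is named `output`, which should be checked with `SHOW CREATE TABLE poller_output` first, and the function name `memory_engine_statements` is made up for this example.

```python
def memory_engine_statements(mysql_version, table="poller_output", column="output"):
    """Build the SQL needed to move the table to the MEMORY engine.

    mysql_version is a tuple such as (4, 1, 20) or (5, 0, 3).
    The column name 'output' is an assumption; verify it before running.
    """
    # VARCHAR columns longer than 255 characters need MySQL 5.0.3 or later.
    width = 1024 if mysql_version >= (5, 0, 3) else 255
    return [
        # The MEMORY engine rejects BLOB/TEXT columns, so convert it first.
        f"ALTER TABLE `{table}` MODIFY `{column}` VARCHAR({width}) NOT NULL DEFAULT '';",
        f"ALTER TABLE `{table}` ENGINE = MEMORY;",
    ]


for statement in memory_engine_statements((4, 1, 20)):
    print(statement)
```

The printed statements can be pasted into the mysql client. Two caveats to keep in mind: poller results longer than the VARCHAR width will no longer fit in the column, and the contents of a MEMORY table are lost when the MySQL server restarts.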

I am thinking that this problem is related to RRDtool. You can force RRDtool out of pipe mode by making a few alterations. Let me see if I can find the change I provided some time ago. Either way, the memory storage engine will reduce cpu utilization.

TheWitness
mvam
Cacti User
Posts: 87
Joined: Wed Jun 01, 2005 2:00 pm
Location: Seattle

Post by mvam »

sorry for hijacking your thread jennam.

the witness, i was having a similar problem to jennam. anyway, i made the changes you suggested and i haven't had a php seg fault in almost 2 hrs, and every poll cycle has completed so far. every other cycle is still long but at least they don't time out. i still see stale pollers however

old
cacti 1529 0.0 0.2 87152 17796 ? S 12:50 0:00 /usr/bin/php -q /appl/www/cactig/cmd.php 250 270
cacti 3040 0.0 0.2 87084 17732 ? S 12:55 0:00 /usr/bin/php -q /appl/www/cactig/cmd.php 250 270

current
cacti 4367 0.0 0.0 7064 1456 ? Ss 13:00 0:00 /bin/sh -c php /appl/www/cactig/poller.php > /dev/null
cacti 4375 0.6 0.3 94048 24644 ? S 13:00 0:01 php /appl/www/cactig/poller.php 2
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

All,

Please try the following "lib/rrd.php" file and provide feedback on poller performance.

TheWitness
Attachments
rrd.zip
(11.07 KiB) Downloaded 79 times
mvam
Cacti User
Posts: 87
Joined: Wed Jun 01, 2005 2:00 pm
Location: Seattle

Post by mvam »

so far i have

07/13/2006 01:54:54 PM - SYSTEM STATS: Time:293.4823 Method:cmd.php Processes:18 Threads:N/A Hosts:310 HostsPerProcess:18 DataSources:6922 RRDsProcessed:4682

07/13/2006 01:57:05 PM - SYSTEM STATS: Time:123.9718 Method:cmd.php Processes:18 Threads:N/A Hosts:310 HostsPerProcess:18 DataSources:6922 RRDsProcessed:4404

07/13/2006 02:04:54 PM - SYSTEM STATS: Time:292.9109 Method:cmd.php Processes:18 Threads:N/A Hosts:310 HostsPerProcess:18 DataSources:6922 RRDsProcessed:4691

07/13/2006 02:06:43 PM - SYSTEM STATS: Time:101.8599 Method:cmd.php Processes:18 Threads:N/A Hosts:310 HostsPerProcess:18 DataSources:6922 RRDsProcessed:4588

07/13/2006 02:12:25 PM - SYSTEM STATS: Time:143.8159 Method:cmd.php Processes:18 Threads:N/A Hosts:310 HostsPerProcess:18 DataSources:6922 RRDsProcessed:4759

which isn't really a change. was this change specifically meant for cactid?
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

No, but how many processors does that box have? 18 concurrent processes is pretty high.

TheWitness
mvam
Cacti User
Posts: 87
Joined: Wed Jun 01, 2005 2:00 pm
Location: Seattle

Post by mvam »

2 opteron 280s
so 4 cores
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

Do no more than 6 processes.

TheWitness
mvam
Cacti User
Posts: 87
Joined: Wed Jun 01, 2005 2:00 pm
Location: Seattle

Post by mvam »

ok, here are my last 2 polls

07/13/2006 03:51:31 PM - SYSTEM STATS: Time:89.6538 Method:cmd.php Processes:18 Threads:N/A Hosts:310 HostsPerProcess:18 DataSources:6922 RRDsProcessed:4620

07/13/2006 03:58:33 PM - SYSTEM STATS: Time:211.9116 Method:cmd.php Processes:6 Threads:N/A Hosts:310 HostsPerProcess:52 DataSources:6922 RRDsProcessed:4656

since it varies so much it might settle out after a while with 6 procs
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

Do a few runs with the logging level set to MEDIUM and send me the output.

TheWitness