Polling Problems

tshuff
Posts: 21
Joined: Fri Apr 09, 2004 7:43 am

Polling Problems

Post by tshuff »

I've looked around the forums and haven't found anything similar to this, so here goes.

I posted sometime last week about MySQL errors that I was getting when loading pages; the problem is escalating as more devices are added to Cacti. As of right now the program is nearly unusable.
We are currently polling 543 devices generating 657 graphs.

I was using Apache 2.0.48 but downgraded to 1.3.29 on a coworker's advice, so I'm currently using 1.3.29.
Cacti 0.8.5
PHP 4.3.4.4
MySQL 4.0.18
Server is running Windows 2000 on an HP, 2.8 GHz dual Xeon, 2.5 GB RAM, and is dedicated solely to Cacti.

I'm using Cactid to poll the devices with the threadcount set to 1 (more than that caused the program to hang). This is the error that cactid returns when it runs (it happens at different points in the polling cycle):

Code: Select all

[14] SNMP v1: agad-15-c2950-01.tcom.purdue.edu, dsname: traffic_in, oid: .1.3.6.1.2.1.2.2.1.10.49, value: 2422057254
[14] SNMP v1: agad-15-c2950-01.tcom.purdue.edu, dsname: traffic_out, oid: .1.3.6.1.2.1.2.2.1.16.49, value: 1417058179
RRDCMD: update 'D:\htdocs\cacti\rra\agad15c295001_traffic_in_53.rrd' --template traffic_in:traffic_out N:2422057254:1417058179
OK
Connecting to MySQL database 'cacti' on '127.0.0.1'...
** Failed: Can't connect to MySQL server on '127.0.0.1' (112)

D:\htdocs\cacti>
On the same note, I can reproduce this error when I try to access cacti on the web:

Code: Select all

Warning: mysql_connect(): Can't connect to MySQL server on '127.0.0.1' (10048) in path-to-cacti\lib\adodb\drivers\adodb-mysql.inc.php on line 318 

Cannot connect to MySQL server on '127.0.0.1'. Please make sure you have specified a valid MySQL database name in 'include/config.php'.
My username, password, and host are all configured correctly; the fact that Cacti sometimes works tells me this.

I moved all the PHP scripts over to a persistent connection, which seems to have eliminated a few of the errors on the website. I tried to load balance cmd.php (by changing which devices it grabs from the SQL), and when I run the poll it seems to work successfully, but it writes nothing into the RRDs and no data is graphed.

This graph is a result of the error; it was recently added and is near the bottom of the list.

Image

I appreciate any and all advice you guys can throw my way.
tshuff
Posts: 21
Joined: Fri Apr 09, 2004 7:43 am

Post by tshuff »

Anyone?
I certainly think I'm not the only person having this problem
tshuff
Posts: 21
Joined: Fri Apr 09, 2004 7:43 am

Post by tshuff »

Hmm, well I pieced together a hack job to make it work.
I made around 6 copies of cmd.php, assigned each one a different section of devices to poll, and then executed them all with a batch file. Seems to be working pretty well so far.
raX
Lead Developer
Posts: 2243
Joined: Sat Oct 13, 2001 7:00 pm
Location: Carlisle, PA

Post by raX »

I am kind of curious about the symptoms you were seeing when using cactid with more than one thread. It almost looks like when the problem occurs it causes MySQL to break. How long does this last? Maybe I'm not handling my MySQL threads correctly. I would be curious to see if other users having these problems are seeing the same thing with MySQL.

Let me know if you would be interested in debugging this a bit further. There are a few things that we could try to narrow this down.

-Ian
tshuff
Posts: 21
Joined: Fri Apr 09, 2004 7:43 am

Post by tshuff »

Hmm,
First off, the issue with cactid and multiple threads was addressed in this thread. The process would randomly hang (usually in less than 36 hours) and not terminate. This would then keep the scheduled task from being executed again until I manually stopped the hung task; as you would guess, we sometimes lost many hours of data. Once I set the threadcount to 1 in cactid.conf, the problem was solved.

The latest issue seems to be a problem with Windows itself. I found this on MySQL's website (http://dev.mysql.com/doc/mysql/en/Windows_vs_Unix.html):
Limited number of ports
Windows systems have about 4,000 ports available for client connections, and after a connection on a port closes, it takes two to four minutes before the port can be reused. In situations where clients connect to and disconnect from the server at a high rate, it is possible for all available ports to be used up before closed ports become available again. If this happens, the MySQL server will appear to have become unresponsive even though it is running. Note that ports may be used by other applications running on the machine as well, in which case the number of ports available to MySQL is lower.
I knew that Cactid was opening a new SQL connection for every child process, so I figured this must be the problem I was seeing. By my estimate I was opening around 850 SQL connections per poll and over 1000 connections for SNMP; add to that normal usage, which could amount to another couple hundred every 10 minutes. It seems likely that I was hitting this connection limit. With my cmd.php modification, I'm only opening 6 SQL connections every poll. Time will tell though; I'll have to keep my eye on it throughout the day and see what happens.
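
For anyone who wants to sanity-check an estimate like this, here is a rough sketch in Python (not part of Cacti). It estimates how many closed ports are still waiting to become reusable, using the figures quoted above: about 4,000 client ports and a two-to-four-minute reuse delay. The 5-minute poll interval and the function names are assumptions made for illustration. SNMP traffic is left out because SNMP normally runs over UDP, which has no comparable wait state.

```python
import math

# "About 4,000" client ports, per the MySQL note quoted above.
WINDOWS_CLIENT_PORTS = 4000


def average_ports_waiting(conns_per_interval, interval_s, reuse_delay_s):
    """Average number of closed ports still waiting to become reusable.

    Steady-state estimate: connection rate multiplied by the reuse delay.
    """
    return conns_per_interval * reuse_delay_s / interval_s


def peak_ports_waiting(conns_per_poll, poll_interval_s, reuse_delay_s):
    """Upper bound, assuming each poll opens and closes all of its
    connections in one short burst."""
    polls_still_waiting = math.ceil(reuse_delay_s / poll_interval_s)
    return conns_per_poll * polls_still_waiting


if __name__ == "__main__":
    poll_interval_s = 300  # assumption: a 5-minute polling cycle
    for reuse_delay_s in (120, 240):  # two to four minutes
        avg = average_ports_waiting(850, poll_interval_s, reuse_delay_s)
        avg += average_ports_waiting(200, 600, reuse_delay_s)  # normal usage
        peak = peak_ports_waiting(850, poll_interval_s, reuse_delay_s)
        print(f"delay={reuse_delay_s}s average={avg:.0f} peak={peak} "
              f"budget={WINDOWS_CLIENT_PORTS}")
```

If the result gets close to the port budget, exhaustion is plausible. If it stays well below, something else on the machine is probably opening connections too, which the quoted note also allows for.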

I apologize for the incoherence, long week :/
Morgan
Cacti User
Posts: 187
Joined: Wed Feb 25, 2004 3:38 am

Post by Morgan »

tshuff,

With that many graphs/devices, how long does it take cmd.php to complete?

Try using cactid instead,
and move to 0.8.5a plus the latest patches for it, for the couple of bugs that have been fixed already.
tshuff
Posts: 21
Joined: Fri Apr 09, 2004 7:43 am

Post by tshuff »

I'm running 6 cmd.php processes, balancing the load
Polls run right around 1:00 to 1:20
Cactid caused more problems than it solved, unfortunately. And yes, a move to 0.8.5a is in the works, probably next week or so.
(I have sections of code that I've modified in 0.8.5 to work with my site, and will need to hunt down and make the same changes in 0.8.5a.)
Morgan
Cacti User
Posts: 187
Joined: Wed Feb 25, 2004 3:38 am

Post by Morgan »

tshuff wrote:I'm running 6 cmd.php processes, balancing the load
Polls run right around 1:00 to 1:20
Cactid caused more problems than it solved, unfortunately. And yes, a move to 0.8.5a is in the works, probably next week or so.
(I have sections of code that I've modified in 0.8.5 to work with my site, and will need to hunt down and make the same changes in 0.8.5a.)
That's why you're having problems with one cmd.php and had to move to using multiples. With your polling cycle at 5 minutes, your single cmd.php doesn't have time to finish before the next one starts. Why are you having problems with cactid? What kind of problems? What platform?
raX
Lead Developer
Posts: 2243
Joined: Sat Oct 13, 2001 7:00 pm
Location: Carlisle, PA

Post by raX »

tshuff wrote:I knew that Cactid was opening a new SQL connection for every child process, so I figured this must be the problem I was seeing. By my estimate I was opening around 850 SQL connections per poll and over 1000 connections for SNMP; add to that normal usage, which could amount to another couple hundred every 10 minutes. It seems likely that I was hitting this connection limit. With my cmd.php modification, I'm only opening 6 SQL connections every poll. Time will tell though; I'll have to keep my eye on it throughout the day and see what happens.
Verrry interesting... thanks for tracking this down. I wonder if it would be possible to share a single MySQL connection among multiple threads. I suppose that it would, except that you would have to lock around all MySQL accesses which could cause contention issues. Have any general suggestions for possible workarounds for this issue on Windows?
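
To make the idea concrete, here is a minimal sketch of one connection shared by several threads with a lock around every access. It is written in Python rather than cactid's C, and the standard-library sqlite3 module stands in for MySQL. The LockedConnection class, the poll_all function and the placeholder query are hypothetical; none of this is cactid code.

```python
import sqlite3
import threading


class LockedConnection:
    """Hypothetical wrapper: one connection, one lock, many threads."""

    def __init__(self, conn):
        self._conn = conn
        self._lock = threading.Lock()

    def query(self, sql, params=()):
        # Only one thread at a time may touch the shared connection.
        # This is where the contention mentioned above would show up.
        with self._lock:
            return self._conn.execute(sql, params).fetchall()


def poll_all(db, item_ids, thread_count=4):
    """Run one placeholder query per item across several worker threads."""
    results = {}
    results_lock = threading.Lock()

    def worker(chunk):
        for item_id in chunk:
            rows = db.query("SELECT ? * 2", (item_id,))
            with results_lock:
                results[item_id] = rows[0][0]

    # Deal the items out round-robin, one chunk per thread.
    chunks = [item_ids[i::thread_count] for i in range(thread_count)]
    threads = [threading.Thread(target=worker, args=(chunk,)) for chunk in chunks]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    # check_same_thread=False lets sqlite3 hand one connection to many threads.
    shared = LockedConnection(sqlite3.connect(":memory:", check_same_thread=False))
    print(poll_all(shared, list(range(10))))
```

The whole run opens a single connection, so port usage stays flat no matter how many threads there are. The cost is that every query waits its turn on the lock.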

-Ian
tshuff
Posts: 21
Joined: Fri Apr 09, 2004 7:43 am

Post by tshuff »

The best one I've found is to split the load across multiple cmd.php files. I copied cmd.php multiple times and changed the line for $polling_items in each copy to something like this:

Code: Select all

$polling_items = db_fetch_assoc("select * from data_input_data_cache WHERE local_data_id < '300' AND local_data_id >= '150'");
I then created a batch file to do the following:

Code: Select all

d:
cd \htdocs\cacti\
start /B c:\php\php.exe d:\htdocs\cacti\cmd150.php
start /B c:\php\php.exe d:\htdocs\cacti\cmd300.php
start /B c:\php\php.exe d:\htdocs\cacti\cmd450.php
start /B c:\php\php.exe d:\htdocs\cacti\cmd600.php
start /B c:\php\php.exe d:\htdocs\cacti\cmd750.php
start /B c:\php\php.exe d:\htdocs\cacti\cmd900.php
start /B c:\php\php.exe d:\htdocs\cacti\cmd1050.php
start /B c:\php\php.exe d:\htdocs\cacti\cmdgreater.php
Schedule the batch file to run and you're set.

This allows me to split the load very easily. Polling time runs right around 1:15 - 1:30, acceptable.
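
If you would rather not work out each range by hand, here is a small Python sketch that generates the WHERE clauses for a set of copies like the ones above. The function name is hypothetical, and it assumes each cmdNNN.php covers the 150 ids below NNN while cmdgreater.php takes everything from 1050 up; the post only shows one of those ranges explicitly.

```python
def id_range_clauses(step, upper, column="local_data_id"):
    """Return WHERE clauses that together cover every id exactly once.

    `upper` must be a multiple of `step`, so the last fixed-size bucket
    ends exactly where the catch-all bucket begins.
    """
    if step <= 0 or upper % step != 0:
        raise ValueError("upper must be a positive multiple of step")
    clauses = []
    for low in range(0, upper, step):
        clauses.append(f"{column} >= {low} AND {column} < {low + step}")
    # Catch-all for ids past the last fixed-size bucket.
    clauses.append(f"{column} >= {upper}")
    return clauses


if __name__ == "__main__":
    for clause in id_range_clauses(150, 1050):
        print(f"select * from data_input_data_cache WHERE {clause}")
```

With a step of 150 and an upper bound of 1050 this produces eight clauses, one per line in the batch file. Generating them in one place makes it harder to leave a gap or an overlap between two copies.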

I've encountered one error since I made this change, and I believe it was a critical error in PHP (the process refused to terminate and required a restart). We are now graphing ~850 devices/ports.
Willie

cacti

Post by Willie »

This will sound like a dumb question, but how did you set up Cacti to use a "persistent connection"?

Regards,

Willie
tshuff
Posts: 21
Joined: Fri Apr 09, 2004 7:43 am

Post by tshuff »

Keep in mind, I still use Cacti .8.5 -- so this may be different in .8.6
Open cacti\lib\database.php

Find the line:

Code: Select all

	if ($cnn_id->Connect($host,$user,$pass,$db_name)) {
Change Connect to PConnect; it should now look as follows:

Code: Select all

	if ($cnn_id->PConnect($host,$user,$pass,$db_name)) {
I had some problems with using persistent connections when in administration mode (I currently have a custom front end written for cacti).
Persistent connections are used for polling, my front end, and graphing, while standard connections are used when I'm in administration (accessing the standard web interface).
Let me know if you need anything further :)
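
For a rough picture of why this helps with the port problem, here is a conceptual sketch in Python. It is not ADOdb or PHP code; the ConnectionCache class, the opener callable and the "cactiuser" name are made up for illustration. A persistent connect hands back an already-open connection for the same host, user and database instead of opening a new one each time.

```python
class ConnectionCache:
    """Hypothetical stand-in for what a persistent connect does in one process."""

    def __init__(self, opener):
        self._opener = opener  # callable that opens a real connection
        self._cache = {}
        self.opened = 0        # how many real connections were opened

    def connect(self, host, user, db_name):
        key = (host, user, db_name)
        if key not in self._cache:
            # First request for these credentials: open a real connection.
            self._cache[key] = self._opener(host, user, db_name)
            self.opened += 1
        # Every later request reuses the same connection.
        return self._cache[key]


if __name__ == "__main__":
    cache = ConnectionCache(lambda host, user, db_name: object())
    for _ in range(100):
        cache.connect("127.0.0.1", "cactiuser", "cacti")
    print(cache.opened)
```

Each reuse is one less connection opened and closed, and therefore one less port waiting to be released.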
willie

Cacti

Post by willie »

Thanks for the help. I am looking at it to see what that change will do for the standard web app. I am hitting it pretty hard with my poller: I have 470 hosts with 3100 interfaces being polled. cactid started having problems on the new server, and I think it is just that the server is too fast. I never had any problems with connections on my 800 MHz or 1.4 GHz test boxes, but when I put it up on a dual-processor 3 GHz machine I immediately started getting errors saying that it could not connect to the database.

Regards,

Willie
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

Willie,

Make sure you let me know how this goes for you.

Larry
tapufd
Cacti User
Posts: 55
Joined: Thu Aug 19, 2004 9:14 am
Location: Belgium

Post by tapufd »

Hi

Changed the Connect to PConnect but still the same problem (using the latest cactid.exe that TheWitness sent me) :
10/30/2004 12:14:57 PM - POLLER: Poller[0] Maximum runtime of 296 seconds exceeded. Exiting.
10/30/2004 12:11:22 PM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
10/30/2004 12:11:21 PM - CACTID: Poller[0] MYSQL: Connection Failed: Can't connect to MySQL server on '127.0.0.1' (112)


If I can help with this problem (testing, ...), please let me know.

/tap