spine dies with Lost connection to MySQL server during query

spyd4r
Posts: 29
Joined: Wed Sep 09, 2009 11:48 am
Location: Waterloo, Ontario

spine dies with Lost connection to MySQL server during query

Post by spyd4r »

cacti-0.8.7i-PIA-3.1 + settings_checkbox.patch
cacti-spine-0.8.7i

openSUSE 12.1 (x86_64)
mysql Ver 14.14 Distrib 5.5.16, for Linux (x86_64) using readline 6.2
PHP 5.3.8 (cli)
RRDtool 1.4.5
NET-SNMP version: 5.7.1

6 CPU
4GB RAM

Maximum Concurrent Poller Processes 6
Maximum Threads per Process 30
Number of PHP Script Servers 10
Script and Script Server Timeout Value 13
The Maximum SNMP OID's Per SNMP Get Request 45

The poller and cron are both set to 5 minutes.
During polling, the poller dies at random with the following message. It can happen at the start of a polling run or at any point during it. Occasionally the run completes without the error.

01/31/2012 06:46:34 PM - SPINE: Poller[0] FATAL: MySQL Error:'2013', Message:'Lost connection to MySQL server during query' (Spine thread)
All these settings worked properly on our old Cacti machine, which ran SLES 11.0. I tried raising MySQL max_connections to extremely high values just to rule out exhausting the MySQL connections.
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Re: spine dies with Lost connection to MySQL server during q

Post by gandalf »

Asking the dev to have a look
R.
jmax
Posts: 10
Joined: Thu Nov 06, 2008 7:12 am
Location: Guildford, United Kingdom

Re: spine dies with Lost connection to MySQL server during q

Post by jmax »

Hi,

I am having exactly the same issue. I am trying to migrate from an old server running 0.8.7e to a new server running 0.8.7i, using cacti-0.8.7i-PIA-3.1.tar.gz for the upgrade, and I am getting a lot of poller errors.

08/16/2012 01:31:01 PM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
08/16/2012 01:31:01 PM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
08/16/2012 01:31:01 PM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
08/16/2012 01:31:01 PM - SPINE: Poller[0] FATAL: MySQL Error:'2013', Message:'Lost connection to MySQL server during query' (Spine thread)
08/16/2012 01:31:01 PM - SPINE: Poller[0] WARNING: SQL Failed! Error:'2006', Message:'MySQL server has gone away', Attempting to Reconnect


I disabled all the plugins (no difference), increased max_connections on MySQL (no difference), allocated more memory to MySQL generally (no difference), and checked the database for errors (none found).
I tried moving back to cmd.php and it works fine, but I can't use that as a solution because it doesn't scale.
Reducing the number of threads and devices to poll to a very low number also seems to make the problem disappear. For example, with a small number of devices to poll, 1 process + 10 threads triggered the issue while 1 process + 4 threads didn't.
But as soon as too many threads run at the same time, the problem appears again.
I tried to debug MySQL but couldn't find anything suspicious in the logs.

Below are the details of the Cacti installation:
General Information
Date: Thu, 16 Aug 2012 15:13:15 +0000
Cacti Version: 0.8.7i
Cacti OS: unix
SNMP Version: NET-SNMP version: 5.4.3
RRDTool Version: RRDTool 1.4.x
Hosts: 1025
Graphs: 9663
Data Sources:
  Script/Command: 18
  SNMP: 1762
  SNMP Query: 7946
  Script Query: 4
  Script - Script Server (PHP): 1001
  Total: 10731
Poller Information
Interval: 60
Type: SPINE 0.8.7i Copyright 2002-2011 by The Cacti Group
Items:
  Action[0]: 48
  Action[1]: 13
  Action[2]: 8
  Total: 69
Concurrent Processes: 2
Max Threads: 4
PHP Servers: 3
Script Timeout: 25
Max OID: 10
Last Run Statistics: Time:58.8469 Method:spine Processes:1 Threads:4 Hosts:23 HostsPerProcess:23 DataSources:47 RRDsProcessed:0

Php version: 5.3.10-1ubuntu3.2
Mysql version: 5.5.24
Net-snmp version: 5.4.3
RRD-Tool: 1.4.7-1


The server has plenty of resources, so I don't think it is a hardware constraint.

Does anyone have a suggestion on how to troubleshoot it further or on a possible workaround?
jmax
Posts: 10
Joined: Thu Nov 06, 2008 7:12 am
Location: Guildford, United Kingdom

Re: spine dies with Lost connection to MySQL server during q

Post by jmax »

Hi,

I just rebuilt the server with Ubuntu 12.04 and installed Cacti 0.8.7i with spine (to avoid any problem caused by the upgrade, plugins, or old leftovers) and I ran into the same issue. When using spine with more than a few data sources, I get "SPINE: Poller[0] ERROR: SQL Failed! Error:'2013'" errors. When using cmd.php, all is fine.

This suggests that spine doesn't work well with these versions of PHP or MySQL, and I don't really know how to troubleshoot it further. Please ping me if you have any suggestions or things to try. In the meantime, I will continue using earlier versions of Ubuntu and Cacti.

Cheers,

JMax
jimmy123
Posts: 5
Joined: Tue Aug 21, 2012 5:36 am

Re: spine dies with Lost connection to MySQL server during q

Post by jimmy123 »

I'm having exactly the same problem: any "threads" value above 2 gives me the errors.
I am running Ubuntu 12.04 with Cacti 0.8.8a.

I am running it in a test setup, so the machine has no load to speak of.

Jimmy

Copy of my Cacti log:

08/21/2012 12:32:07 PM - SYSTEM STATS: Time:5.1572 Method:spine Processes:8 Threads:8 Hosts:101 HostsPerProcess:13 DataSources:33 RRDsProcessed:31
08/21/2012 12:32:03 PM - SPINE: Poller[0] WARNING: SQL Failed! Error:'2006', Message:'MySQL server has gone away', Attempting to Reconnect
08/21/2012 12:32:03 PM - SPINE: Poller[0] ERROR: SQL Failed! Error:'2013', Message:'Lost connection to MySQL server during query', SQL Fragment:'UPDATE host SET status='3', status_event_count='0', status_fail_date='0000-00-00 00:00:00', status_rec_date='2012-08-20 14:22:00', status_last_error='Host responded to SNMP, UDP: Ping timed out', min_time='1.501440', max_time='10.108000', cur_time='2.251985', avg_time='3.204383', total_polls='757', failed_polls='16', availability='97.8864' WHERE id='49''
08/21/2012 12:32:03 PM - SPINE: Poller[0] WARNING: SQL Failed! Error:'2006', Message:'MySQL server has gone away', Attempting to Reconnect
08/21/2012 12:32:03 PM - SPINE: Poller[0] ERROR: SQL Failed! Error:'2013', Message:'Lost connection to MySQL server during query', SQL Fragment:'UPDATE host SET status='1', status_event_count='877', status_fail_date='0000-00-00 00:00:00', status_rec_date='0000-00-00 00:00:00', status_last_error='Host did not respond to SNMP, ICMP: Host is Alive', min_time='9.999990', max_time='0.000000', cur_time='0.000000', avg_time='0.000000', total_polls='754', failed_polls='754', availability='0.0000' WHERE id='41''
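
For anyone comparing runs with different thread counts, it may help to count how often each MySQL error code shows up in the Cacti log. Below is a minimal sketch in Python. It assumes the log lines look like the ones posted in this thread; the names `ERROR_RE` and `count_spine_sql_errors` are made up for illustration and are not part of Cacti or spine.

```python
import re
from collections import Counter

# Matches spine log lines such as (format taken from the posts in this thread):
#   ... SPINE: Poller[0] ERROR: SQL Failed! Error:'2013', Message:'...'
#   ... SPINE: Poller[0] FATAL: MySQL Error:'2013', Message:'...'
# Group 1 is the severity (ERROR/WARNING/FATAL), group 2 the MySQL error code.
ERROR_RE = re.compile(r"SPINE: Poller\[\d+\] (\w+): .*?Error:'(\d+)'")


def count_spine_sql_errors(lines):
    """Return a Counter keyed by (severity, mysql_error_code)."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (the log path is whatever your installation uses):
    #   python count_errors.py < cacti.log
    for (severity, code), n in sorted(count_spine_sql_errors(sys.stdin).items()):
        print(f"{severity} {code}: {n}")
```

Running this against the log after each change to the thread count gives a rough before/after comparison. It only counts symptoms and says nothing about why the connection was dropped.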
jimmy123
Posts: 5
Joined: Tue Aug 21, 2012 5:36 am

Re: spine dies with Lost connection to MySQL server during q

Post by jimmy123 »

P.S. I tried a suggestion from another forum thread, and it didn't work:
"Try to set max_connection limit to sth like 500 instead of default=100. Restart MySQL and httpd afterwards
Reinhard"

I assumed this meant the max_connections setting for MySQL in /etc/mysql/my.cnf.
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Re: spine dies with Lost connection to MySQL server during q

Post by gandalf »

Yes, that is what was meant. But your MySQL server is dying for some unknown reason, and the connection errors are a consequence of that. You should find out why MySQL dies in the first place.
R.
jimmy123
Posts: 5
Joined: Tue Aug 21, 2012 5:36 am

Re: spine dies with Lost connection to MySQL server during q

Post by jimmy123 »

How can I do that?

I'm not very familiar with MySQL.
astrayel
Posts: 1
Joined: Thu Sep 20, 2012 3:26 am

Re: spine dies with Lost connection to MySQL server during q

Post by astrayel »

Hi,

I have the same problem as spyd4r and jimmy123. I upgraded MySQL, Cacti and spine at the same time. Maybe that was a mistake... For now, I can't get spine to work correctly. Here are my versions:
Date: Thu, 20 Sep 2012 10:44:16 +0200
Cacti Version: 0.8.8a
Cacti OS: unix
SNMP Version: NET-SNMP version: 5.4.3
RRDTool Version: RRDTool 1.4.x
Hosts: 245
Graphs: 5686
Data Sources:
  Script/Command: 15
  SNMP: 997
  SNMP Query: 3850
  Script Query: 1
  Script - Script Server (PHP): 234
  Script Query - Script Server: 1066
  Total: 6163
Poller Information
Interval: 60
Type: SPINE 0.8.8a Copyright 2002-2012 by The Cacti Group
Items:
  Action[0]: 9564
  Action[1]: 17
  Action[2]: 1589
  Total: 11170
Concurrent Processes: 2
Max Threads: 2
PHP Servers: 1
Script Timeout: 120
Max OID: 20
Last Run Statistics: Time:55.7137 Method:spine Processes:2 Threads:2 Hosts:239 HostsPerProcess:120 DataSources:11154 RRDsProcessed:6035
PHP Information
PHP Version: 5.3.10-1ubuntu3.4
PHP OS: Linux
PHP uname: Linux ssanetsupva1 3.2.0-30-generic #48-Ubuntu SMP Fri Aug 24 16:52:48 UTC 2012 x86_64
PHP SNMP: Installed
max_execution_time: 30
memory_limit: 1024M
And here are my errors:
SPINE: Poller[0] ERROR: SQL Failed! Error:'2013', Message:'Lost connection to MySQL server during query', SQL Fragment:'INSERT INTO poller_output (local_data_id, rrd_name, time, output) VALUES (9037,'mem_used','2012-09-20 10:42:04','109544472'),(9036,'mem_free','2012-09-20 10:42:04','122232424'),(7834,'TCP','2012-09-20 10:42:04','1
When I set the number of threads to 1, it works. It is slow (65s) but it works. When I set it to 2, it sometimes has errors but mostly works. When I set it higher, it doesn't work anymore and spine crashes every time I run it. There are no errors in the MySQL error log, and cmd.php works perfectly despite its long run time.

Any clues on how to resolve or debug this?

Thanks
Astrayel
chadd
Cacti User
Posts: 382
Joined: Thu Mar 24, 2005 3:53 pm
Location: Ocoee, Florida

Re: spine dies with Lost connection to MySQL server during q

Post by chadd »

Has anyone solved this issue? I am seeing it as well. Thanks in advance.

-chadd.
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Re: spine dies with Lost connection to MySQL server during q

Post by gandalf »

All, please note whether this occurs when running spine (C code) or on the Cacti web page (PHP code). Please add your versions as well.
R.
chadd
Cacti User
Posts: 382
Joined: Thu Mar 24, 2005 3:53 pm
Location: Ocoee, Florida

Re: spine dies with Lost connection to MySQL server during q

Post by chadd »

gandalf wrote: All, please note whether this occurs when running spine (C code) or on the Cacti web page (PHP code). Please add your versions as well.
R.

Here are my stats, and my errors are happening when Spine is running:

Interval: 300
Type: SPINE 0.8.8a Copyright 2002-2012 by The Cacti Group
Items:
  Action[0]: 317734
  Action[1]: 5
  Action[2]: 1531
  Total: 319270
Concurrent Processes: 1
Max Threads: 30
PHP Servers: 8
Script Timeout: 60
Max OID: 10
Last Run Statistics: Time:38.8263 Method:spine Processes:1 Threads:30 Hosts:972 HostsPerProcess:972 DataSources:319270 RRDsProcessed:0
PHP Information

Thanks.

-chadd.
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Re: spine dies with Lost connection to MySQL server during q

Post by gandalf »

Please try two different things, one after the other:

1. Decrease number of processes * number of threads, e.g. use 1 process and 10 threads only
2. Increase MySQL max_connections to e.g. 500 and restart mysqld/httpd afterwards

R.
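
To put the two suggestions above into numbers, here is a rough sketch in Python. It assumes (this is not confirmed anywhere in the thread) that each spine thread holds its own MySQL connection and that each spine process keeps one more for its main thread. The function names and the `reserved` headroom for the web UI and other clients are hypothetical.

```python
def peak_spine_connections(processes, threads_per_process):
    """Estimate peak MySQL connections opened by spine.

    Assumption: one connection per thread plus one per process.
    """
    return processes * (threads_per_process + 1)


def fits_max_connections(processes, threads_per_process, max_connections, reserved=10):
    """True if the estimate plus some headroom stays within max_connections.

    `reserved` is an arbitrary allowance for the web UI, PHP script
    servers, and other clients; adjust it for your own setup.
    """
    needed = peak_spine_connections(processes, threads_per_process) + reserved
    return needed <= max_connections


if __name__ == "__main__":
    # Settings from the first post: 6 processes, 30 threads each.
    print(peak_spine_connections(6, 30))
    # Suggested reduced setting: 1 process, 10 threads, max_connections 500.
    print(fits_max_connections(1, 10, 500))
```

This only helps rule connection exhaustion in or out. Earlier posts in this thread report the errors even after raising max_connections, so passing this check does not exclude other causes.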
chadd
Cacti User
Posts: 382
Joined: Thu Mar 24, 2005 3:53 pm
Location: Ocoee, Florida

Re: spine dies with Lost connection to MySQL server during q

Post by chadd »

gandalf wrote: Please try two different things, one after the other:

1. Decrease number of processes * number of threads, e.g. use 1 process and 10 threads only
2. Increase MySQL max_connections to e.g. 500 and restart mysqld/httpd afterwards

R.

mysql> show variables like "max_connections";
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 500   |
+-----------------+-------+
1 row in set (0.01 sec)

mysql>


Already at 500 max_connections, but I did decrease the threads to 10. Will let you know what happens.
chadd
Cacti User
Posts: 382
Joined: Thu Mar 24, 2005 3:53 pm
Location: Ocoee, Florida

Re: spine dies with Lost connection to MySQL server during q

Post by chadd »

Still happening. Not as often (I think), but here are the log entries:

11/02/2012 12:41:37 PM - SPINE: Poller[0] ERROR: SQL Failed! Error:'2013', Message:'Lost connection to MySQL server during query', SQL Fragment:'INSERT INTO poller_output (local_data_id, rrd_name, time, output) VALUES (449942,'cisco_memused','2012-11-02 12:41:36','48114060'),(449941,'cisco_memfree','2012-11-02 12:41:36','139577388'),(449940,'oneMin','2012-11-02 12:41:36','U'),(449940,'fiveMin','2012-11-02 12:41:36','U'),(449939,'errors_out','2012-11-02 12:41:36','0'),(449939,'discards_out','2012-11-02 12:41:36','0'),(449939,'discards_in','2012-11-02 12:41:36','0'),(449939,'errors_in','2012-11-02 12:41:36','0'),(449938,'errors_out','2012-11-02 12:41:36','0'),(449938,'discards_out','2012-11-02 12:41:36','0'),(449938,'discards_in','2012-11-02 12:41:36','0'),(449938,'errors_in','2012-11-02 12:41:36','0'),(449937,'errors_out','2012-11-02 12:41:36','0'),(449937,'discards_out','2012-11-02 12:41:36','0'),(449937,'discards_in','2012-11-02 12:41:36','0'),(449937,'errors_in','2012-11-02 12:41:36','0'),(449936,'errors_out','2012-11-02 12:41:36','0'),(449936,'discards_out','2012-11-02 12:41:36','0'),(449936,'discards_in','2012-11-02 12:41:36','0'),(449936,'errors'

11/02/2012 12:01:35 PM - SPINE: Poller[0] ERROR: SQL Failed! Error:'2013', Message:'Lost connection to MySQL server during query', SQL Fragment:'INSERT INTO poller_output (local_data_id, rrd_name, time, output) VALUES (451119,'ProcMem_Nexus','2012-11-02 12:01:35','16'),(451118,'5min_cpu_nexus','2012-11-02 12:01:35','5'),(451117,'errors_out','2012-11-02 12:01:35','0'),(451117,'discards_out','2012-11-02 12:01:35','0'),(451117,'discards_in','2012-11-02 12:01:35','0'),(451117,'errors_in','2012-11-02 12:01:35','0'),(451116,'errors_out','2012-11-02 12:01:35','0'),(451116,'discards_out','2012-11-02 12:01:35','0'),(451116,'discards_in','2012-11-02 12:01:35','0'),(451116,'errors_in','2012-11-02 12:01:35','0'),(451115,'errors_out','2012-11-02 12:01:35','0'),(451115,'discards_out','2012-11-02 12:01:35','0'),(451115,'discards_in','2012-11-02 12:01:35','0'),(451115,'errors_in','2012-11-02 12:01:35','0'),(451114,'errors_out','2012-11-02 12:01:35','0'),(451114,'discards_out','2012-11-02 12:01:35','0'),(451114,'discards_in','2012-11-02 12:01:35','0'),(451114,'errors_in','2012-11-02 12:01:35','0'),(451113,'errors_out','2012-11-02 12:01:35','0'),(451113,'discards_out''

Oh, and poll time went from about 37 sec to about 110 sec.

-chadd.