random: ERROR: ping_icmp: cannot open an ICMP socket

brylant
Posts: 36
Joined: Mon Aug 17, 2009 9:05 am

Post by brylant »

Hi,

I've been running cacti/spine for the past few weeks with no major issues, but there's one thing that makes me a bit worried. I've got these messages appearing in the log:

Code:

09/23/2009 10:55:00 AM - POLLER: Poller[0] Maximum runtime of 298 seconds exceeded. Exiting.
09/23/2009 10:50:02 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
09/23/2009 10:50:02 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
09/23/2009 10:50:02 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
09/23/2009 10:50:02 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
09/23/2009 10:50:02 AM - SPINE: Poller[0] ERROR: ping_icmp: cannot open an ICMP socket (Spine thread)

This is happening randomly (pretty much once a day). Looking at the spine source, this is caused by the ICMP socket not being opened.
I have several routers in cacti (and ~5k data sources). I plan to have at least 2-3 times more and I'm a bit worried...
Does anybody know what the problem is...?

Thanks,
adam

cacti: 0.8.7e, spine: 0.8.7e
all official patches applied to both cacti and spine...
OS: ubuntu 8.04 LTS

Code:

Cacti Version    0.8.7e
Cacti OS         unix
SNMP Version     NET-SNMP version: 5.4.1
RRDTool Version  RRDTool 1.2.x
Hosts            43
Graphs           4736
Data Sources     SNMP: 181
                 SNMP Query: 4244
                 Script - Script Server (PHP): 4
                 Script Query - Script Server: 356
                 Total: 4785

Code:

Interval              300
Type                  spine
Items                 Action[0]: 8521
                      Action[2]: 1008
                      Total: 9529
Concurrent Processes  2
Max Threads           10
PHP Servers           4
Script Timeout        30
Max OID               30
Last Run Statistics   Time:24.4114 Method:spine Processes:2 Threads:10 Hosts:43 HostsPerProcess:22 DataSources:9529 RRDsProcessed:4785

Code:

PHP Version         5.2.4-2ubuntu5.6
PHP OS              Linux
PHP uname           Linux *** 2.6.24-24-server #1 SMP Fri Jul 24 22:44:54 UTC 2009 x86_64
PHP SNMP            Installed
max_execution_time  60
memory_limit        196M
Deviloper
Cacti User
Posts: 256
Joined: Tue Jul 07, 2009 8:03 am

Post by Deviloper »

You could enable verbose spine debugging (see the spine command-line parameters) to get more information on what is going on.

* COMMAND-LINE PARAMETERS
*
* -h | --help
* -v | --version
*
* Show a brief help listing, then exit.
*
* -C | --conf=F
*
* Provide the name of the Spine configuration file, which contains
* the parameters for connecting to the database. In the absence of
* this, it looks [WHERE?]
*
* -f | --first=ID
*
* Start polling with device <ID> (else starts at the beginning)
*
* -l | --last=ID
*
* Stop polling after device <ID> (else ends with the last one)
*
* -H | --hostlist="hostid1,hostid2,hostid3,...,hostidn"
*
* Override the expected first host, last host behavior with a list of hostids.
*
* -O | --option=setting:value
*
* Override a DB-provided value from the settings table in the DB.
*
* -C | --conf=FILE
*
* Specify the location of the Spine configuration file.
*
* -R | --readonly
*
* This processing is readonly with respect to the database: it's
* meant only for developer testing.
*
* -S | --stdout
*
* All logging goes to the standard output
*
* -V | --verbosity=V
*
* Set the debug logging verbosity to <V>. Can be 1..5 or
* NONE/LOW/MEDIUM/HIGH/DEBUG (case insensitive).
*
* The First/Last device IDs are all relative to the "hosts" table in the
* Cacti database, and this mechanism allows us to split up the polling
* duties across multiple "spine" instances: each one gets a subset of
* the polling range.
*
* For compatibility with poller.php, we also accept the first and last
* device IDs as standalone parameters on the command line.

Post by brylant »

"You could enable verbose spine debugging"
That probably won't help (I know it's the system "socket" call that fails, and logging results for 5k data sources won't help my resource utilization ;-)):

Code:

        /* get ICMP socket */
        if ((icmp_socket = socket(AF_INET, SOCK_RAW, IPPROTO_ICMP)) == -1) {
                die("ERROR: ping_icmp: cannot open an ICMP socket");
        }

I was just wondering if anyone has any idea why that system call fails...
TheWitness
Developer
Posts: 16997
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

Wow! Maybe we should put that call in a loop with a few retries. Your system must have several thousand open ports. Between attempts, we should perform a usleep(10000) or so, to give the system a chance to "calm" down a bit. Then, if it still fails, we should be returning an error instead of dying. It's likely returning EAGAIN (resource temporarily unavailable, try again).

Since this is the first time I've seen this, I now realize that dying is a bad idea. However, the script server should restart as expected. Not sure what is happening there.
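
A minimal sketch of the retry idea described above, not the actual spine change: socket() is attempted a few times with a short usleep() between attempts, and the caller gets -1 back instead of the process dying. The names open_socket_with_retry and is_transient_socket_error, and the choice of which errno values are worth retrying, are assumptions made for illustration.

```c
#define _DEFAULT_SOURCE /* exposes usleep() under strict -std= modes on glibc */
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: errno values that might clear up after a short wait.
 * Which values belong here is an assumption, not something spine defines. */
int is_transient_socket_error(int err) {
    return err == EAGAIN || err == ENOBUFS || err == ENOMEM ||
           err == EMFILE || err == ENFILE;
}

/* Try socket() up to max_attempts times, sleeping delay_us microseconds
 * between failed attempts. Returns the descriptor on success, or -1 with
 * errno set to the last failure (EINVAL if max_attempts < 1). */
int open_socket_with_retry(int domain, int type, int protocol,
                           int max_attempts, unsigned int delay_us) {
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        int fd = socket(domain, type, protocol);
        if (fd != -1) {
            return fd;
        }

        int saved = errno; /* keep it: later calls may overwrite errno */
        if (!is_transient_socket_error(saved) || attempt == max_attempts) {
            errno = saved;
            return -1; /* return to the caller instead of dying */
        }
        usleep(delay_us); /* give the system a moment before retrying */
    }

    errno = EINVAL; /* only reached when max_attempts < 1 */
    return -1;
}
```

For the ICMP case the call might look like open_socket_with_retry(AF_INET, SOCK_RAW, IPPROTO_ICMP, 3, 10000), with the caller treating -1 as a failed ping for that host. Raw sockets normally require root or an equivalent privilege.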

TheWitness
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of dozens of Cacti plugins and customizations. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages


For those wondering, I'm still here, but lost in the shadows. Yearning for fewer bugs. Who wants a Cacti 1.3/2.0? Streams anyone?

Post by TheWitness »

Please file a bug report on this too: http://bugs.cacti.net

TheWitness

Post by brylant »

Well... I'm not sure I have several thousand ports open - the system is only running cacti (so just spine, mysql, apache and ntpd)! There are 10001 items in total (5033 data sources).
The box is a 2x quad-core Xeon, and it only takes ~20 secs to complete the cycle. I've got 2 processes, 7 threads/process, 4 PHP servers...
I don't really think it should be failing...
Any ideas...?

adam

BTW - issue 0001544 opened...

Post by TheWitness »

Read this. Check your ulimits...

The return value from socket is the file descriptor for the new socket, or -1 in case of error. The following errno error conditions are defined for this function:

EPROTONOSUPPORT
The protocol or style is not supported by the namespace specified.

EMFILE
The process already has too many file descriptors open.

ENFILE
The system already has too many file descriptors open.

EACCES
The process does not have privilege to create a socket of the specified style or protocol.

ENOBUFS
The system ran out of internal buffer space.

I am thinking either ENFILE or ENOBUFS. If it's ENOBUFS, then we can make it better by adding a slight delay. Check that first and get back to me.
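
The per-process limit can also be checked from inside a program with getrlimit(RLIMIT_NOFILE), which reports the same soft limit that ulimit -n shows. A small sketch follows; the helper names get_nofile_limits and check_nofile_limit are hypothetical and not part of spine.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Fetch the soft and hard limits on open file descriptors for this process.
 * Returns 0 on success, -1 on failure with errno set by getrlimit(). */
int get_nofile_limits(rlim_t *soft, rlim_t *hard) {
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        return -1;
    }
    *soft = rl.rlim_cur;
    *hard = rl.rlim_max;
    return 0;
}

/* Print the limits and warn when the soft limit is below `needed`
 * descriptors. Returns 1 if the limit is sufficient, 0 if it is too low,
 * and -1 if the limits could not be read. */
int check_nofile_limit(rlim_t needed) {
    rlim_t soft, hard;

    if (get_nofile_limits(&soft, &hard) != 0) {
        perror("getrlimit");
        return -1;
    }
    printf("open files: soft=%llu hard=%llu\n",
           (unsigned long long)soft, (unsigned long long)hard);

    if (soft != RLIM_INFINITY && soft < needed) {
        printf("soft limit is below %llu descriptors\n",
               (unsigned long long)needed);
        return 0;
    }
    return 1;
}
```

The value passed as `needed` is a guess the operator has to make; nothing in this thread states how many descriptors spine uses per thread.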

TheWitness

Post by brylant »

yep... now I feel a little bit embarrassed... ulimit -n was 1024 ;-)
I've changed it to something more appropriate for my environment (99k), and also changed sysctl fs.file-max. Hopefully that will solve the issue.
I've added some more logging to ping.c so that I'll get errno when (if) it fails again...
Interestingly, I've changed most of the ping checks to UDP ping (I can't change them all, as some boxes won't respond properly to UDP ping) and for the past 24 hours I haven't seen the problem again... I'll change it back to snmp to see if the ulimit change helped. But that's all on Monday, as I don't want those scary holes in the graphs over the weekend ;-)
Thanks for the help - I'll be back on Monday/Tuesday ;-)
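
The extra logging mentioned above is not shown in the thread. One way it might look is sketched here: the errno from socket() is saved and reported with strerror(), together with a hint for the errno values quoted earlier. The function names and hint texts are assumptions, not the patch that was actually applied to ping.c.

```c
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: map a socket() errno to a short hint on where to look. */
const char *socket_errno_hint(int err) {
    switch (err) {
    case EMFILE:
        return "per-process descriptor limit reached (check ulimit -n)";
    case ENFILE:
        return "system-wide descriptor limit reached (check fs.file-max)";
    case ENOBUFS:
    case ENOMEM:
        return "kernel out of buffer space or memory";
    case EACCES:
    case EPERM:
        return "insufficient privilege to create a raw socket";
    case EPROTONOSUPPORT:
        return "protocol not supported";
    default:
        return "no specific hint";
    }
}

/* Open a raw ICMP socket; on failure, log errno and a hint to stderr.
 * Returns the descriptor, or -1 with errno preserved for the caller. */
int open_icmp_socket_logged(void) {
    int fd = socket(AF_INET, SOCK_RAW, IPPROTO_ICMP);

    if (fd == -1) {
        int saved = errno; /* fprintf() may change errno */
        fprintf(stderr,
                "ERROR: ping_icmp: cannot open an ICMP socket: %s (errno %d, %s)\n",
                strerror(saved), saved, socket_errno_hint(saved));
        errno = saved;
    }
    return fd;
}
```

With a descriptor limit of 1024 per process, the logged value would be expected to be EMFILE if the limit was indeed the cause.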

Post by TheWitness »

If you want to upload your patches to ping.c to the bug, I will review and merge if applicable.

TheWitness
ABX
Cacti User
Posts: 68
Joined: Thu Mar 01, 2007 5:55 am

Re: random: ERROR: ping_icmp: cannot open an ICMP socket

Post by ABX »

Cacti: 0.8.7g on Linux here

I was getting the same errors, solved by increasing the open files limit with ulimit.

Edit: The problem came back after a couple of days, but a spine update solved everything. 1 month now, clear of any problem.