NIFTY POPEN timed out
Hi all,
I'm seeing an error in my cacti log:
03/31/2011 07:30:56 PM - SPINE: Poller[0] Host[1] ERROR: The NIFTY POPEN timed out
When this happens I see bogus data put into the rra for that poll. The data query is running a script which I believe is returning NaN or no value when I see this error.
I'm running Spine 0.8.7g with all the latest patches. Cacti version is also 0.8.7g.
Has anyone seen this error? Any ideas as to what to do about it?
Thanks,
Jackie
- rony
- Developer/Forum Admin
- Posts: 6022
- Joined: Mon Nov 17, 2003 6:35 pm
- Location: Michigan, USA
Re: NIFTY POPEN timed out
This is usually an indication that you have your number of polling processes set too high or do not have enough file handles available on your system.
What are your poller settings? Settings -> Poller -> Spine Specific Execution Parameters
[size=117][i][b]Tony Roman[/b][/i][/size]
[size=84][i]Experience is what causes a person to make new mistakes instead of old ones.[/i][/size]
[size=84][i]There are only 3 way to complete a project: Good, Fast or Cheap, pick two.[/i][/size]
[size=84][i]With age comes wisdom, what you choose to do with it determines whether or not you are wise.[/i][/size]
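One way to check the file handle side of this is to read the open-file limit of the account that runs the poller. The following is only a minimal sketch, assuming a Unix host with Python available; the helper names open_file_limits and describe are made up for illustration and are not part of Cacti or Spine.

```python
import resource  # Unix only


def open_file_limits():
    """Return the (soft, hard) open-file-descriptor limits of this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit):
    """Render a limit value, spelling out the 'no limit' sentinel."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


def report():
    """Build a one-line summary such as 'open files: soft=... hard=...'."""
    soft, hard = open_file_limits()
    return f"open files: soft={describe(soft)} hard={describe(hard)}"
```

Calling print(report()) as the same user that runs the poller shows the limits that user's processes start with. Whether the soft limit is actually too low depends on how many processes and threads are configured, so treat the number as a starting point for investigation rather than a verdict.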
Re: NIFTY POPEN timed out
Thanks for your response. My settings:
Maximum Concurrent Poller Processes: 2
Maximum Threads per Process: 4
Number of PHP Script Servers: 5
Script and Script Server Timeout Value: 50
Maximum SNMP OIDs per SNMP Get Request: 10
I've adjusted the Script and Script Server timeout value from 25 to 50. And I did see the error timestamp go from 28 seconds after the minute time to 56 seconds after the minute time, so I think that timeout value is coming into play. If a timeout is actually occurring, wouldn't cacti put a NaN in for the value? That's not what I'm seeing.
Thanks! Jackie
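To see how close the script comes to the Script and Script Server Timeout Value, it can be timed outside of Cacti. This is a rough sketch only: time_command is a hypothetical helper, and the command line you pass in is whatever your data query actually runs.

```python
import subprocess
import sys
import time


def time_command(argv, timeout_s):
    """Run argv once and return (elapsed_seconds, finished_in_time).

    The child is killed if it is still running after timeout_s seconds.
    """
    start = time.monotonic()
    try:
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
        finished = True
    except subprocess.TimeoutExpired:
        finished = False
    return time.monotonic() - start, finished
```

For example, time_command(["/path/to/your/script", "arg"], 50) mirrors the 50 second setting above (the path is a placeholder). Running it repeatedly around the times the error appears in cacti.log may show whether the script itself is slow or whether the delay comes from somewhere else.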
Re: NIFTY POPEN timed out
Here are a bit more details of what I'm experiencing. In the cacti.log file I see:
04/03/2011 12:20:55 AM - SPINE: Poller[0] Host[1] ERROR: The NIFTY POPEN timed out
04/03/2011 05:05:56 AM - SPINE: Poller[0] Host[1] ERROR: The NIFTY POPEN timed out
And the data I see in the rra is messed up. The readings on this data shouldn't vary much, but as you can see the exponent value is wrong when these errors occur. This is the output from rrdtool dump:
<!-- 2011-04-03 00:15:00 MDT / 1301811300 --> <row><v>6.2942200000e+01</v></row>
<!-- 2011-04-03 00:20:00 MDT / 1301811600 --> <row><v>6.2940000000e-01</v></row>
<!-- 2011-04-03 00:25:00 MDT / 1301811900 --> <row><v>6.3528300000e+01</v></row>
<!-- 2011-04-03 05:00:00 MDT / 1301828400 --> <row><v>6.6091466667e+01</v></row>
<!-- 2011-04-03 05:05:00 MDT / 1301828700 --> <row><v>6.6090000000e-01</v></row>
<!-- 2011-04-03 05:10:00 MDT / 1301829000 --> <row><v>6.5538000000e+01</v></row>
Thanks for any ideas,
Jackie
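Rows like the ones above can be picked out of a long dump automatically. The sketch below is an illustration, not an rrdtool feature: it assumes a single data source per row, positive readings, and that a reading more than 10 times away from the median is a glitch. The names ROW, parse_rows and suspicious are invented for this example, and SAMPLE is copied from the dump in this post.

```python
import re
import statistics

# Matches one dump row: comment with a time label and epoch, then one <v> value.
ROW = re.compile(r"<!--\s*(.+?)\s*/\s*(\d+)\s*-->\s*<row><v>([^<]+)</v></row>")

SAMPLE = """\
<!-- 2011-04-03 00:15:00 MDT / 1301811300 --> <row><v>6.2942200000e+01</v></row>
<!-- 2011-04-03 00:20:00 MDT / 1301811600 --> <row><v>6.2940000000e-01</v></row>
<!-- 2011-04-03 00:25:00 MDT / 1301811900 --> <row><v>6.3528300000e+01</v></row>
<!-- 2011-04-03 05:00:00 MDT / 1301828400 --> <row><v>6.6091466667e+01</v></row>
<!-- 2011-04-03 05:05:00 MDT / 1301828700 --> <row><v>6.6090000000e-01</v></row>
<!-- 2011-04-03 05:10:00 MDT / 1301829000 --> <row><v>6.5538000000e+01</v></row>
"""


def parse_rows(dump_text):
    """Return a list of (label, epoch, value) tuples found in dump_text."""
    return [(label, int(epoch), float(value))
            for label, epoch, value in ROW.findall(dump_text)]


def suspicious(rows, factor=10.0):
    """Return rows whose value is more than `factor` times off the median."""
    known = [r for r in rows if r[2] == r[2]]  # NaN never equals itself
    median = statistics.median(v for _, _, v in known)
    return [r for r in known
            if r[2] < median / factor or r[2] > median * factor]
```

With the sample above, suspicious(parse_rows(SAMPLE)) returns the 00:20:00 and 05:05:00 rows, which are the two intervals where the popen error was logged.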
- gandalf
- Developer
- Posts: 22383
- Joined: Thu Dec 02, 2004 2:46 am
- Location: Muenster, Germany
Re: NIFTY POPEN timed out
The output is indeed wrong. That's how rrdtool copes with "missing" data in specific cases. But this dump adds no information beyond "there's no data available"; it does not tell us why the script times out. Your script may be calling external commands that have no timeout at all, or an excessively long one.
R.
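If the script does shell out to external commands, each call can be given its own deadline that is shorter than the poller's script timeout. The following is a sketch under assumptions: run_with_timeout is a hypothetical helper, and the fallback "U" is the marker rrdtool update accepts for an unknown value. Whether your Cacti version passes "U" through unchanged is something to verify on your own installation.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s, fallback="U"):
    """Return the command's stripped stdout, or fallback on timeout or failure."""
    try:
        result = subprocess.run(
            argv,
            capture_output=True,
            text=True,
            timeout=timeout_s,  # the child is killed once this expires
        )
    except (subprocess.TimeoutExpired, OSError):
        return fallback
    if result.returncode != 0:
        return fallback
    return result.stdout.strip()
```

The point of the design is that the script always answers within a bounded time and with an explicit "no data" marker, instead of leaving the poller to give up on it.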
Re: NIFTY POPEN timed out
Thanks for your response. I didn't realize rrdtool would be adjusting the data in the rra. I expected to see NaNs in there.
What I find confusing is that I have another instance of cacti running which is polling the same script. It is version 0.8.7e of cacti and spine. I see some popen timeouts, but I don't see invalid data in the rra, and the graph doesn't have any glitches. Is there something that changed between 0.8.7e and 0.8.7g that would cause this? The rrdtool version is the same for both instances.
Thanks,
Jackie
- gandalf
- Developer
- Posts: 22383
- Joined: Thu Dec 02, 2004 2:46 am
- Location: Muenster, Germany
Re: NIFTY POPEN timed out
We had some fixes regarding POPEN. I'm no expert on this, to be honest. TheWitness is THE one and only ...
R.
Re: NIFTY POPEN timed out
Well, I don't understand it, but I may have found a solution. I changed the SNMP timeout value to 10,000 (10 seconds) under the device configuration menu, and I haven't seen any glitches in the graph since. I noticed this was different between my 0.8.7e and 0.8.7g versions of cacti. I'll keep watching, but that seems to have done the trick.
Thanks,
Jackie
Re: NIFTY POPEN timed out
jackie wrote:
Well, I don't understand it, but I may have found a solution. I changed the SNMP timeout value to 10,000 (10 seconds) under the device configuration menu, and I haven't seen any glitches in the graph since. I noticed this was different between my 0.8.7e and 0.8.7g versions of cacti. I'll keep watching, but that seems to have done the trick.
Thanks,
Jackie
That didn't work out for me. What annoys me is that the "ERROR: The NIFTY POPEN timed out" message appears randomly, for random hosts running different operating systems.
Also, I randomly get "POLLER: Poller[0] WARNING: There are '1' detected as overrunning a polling process, please investigate" and "POLLER: Poller[0] Maximum runtime of 298 seconds exceeded. Exiting.", but I can't figure out what is causing them.
Cacti Version: 0.8.8a
Cacti OS: unix
SNMP Version: NET-SNMP version: 5.4.2.1
RRDTool Version: RRDTool 1.3.x
Hosts: 284
Graphs: 2968
Data Sources:
  Script/Command: 476
  SNMP: 2159
  SNMP Query: 1052
  Script - Script Server (PHP): 1
  Script Query - Script Server: 12
  Total: 3700
Poller Information
Interval: 60
Type: SPINE 0.8.8a Copyright 2002-2012 by The Cacti Group
Items:
  Action[0]: 4049
  Action[1]: 474
  Action[2]: 21
  Total: 4544
Concurrent Processes: 4
Max Threads: 30
PHP Servers: 2
Script Timeout: 40
Max OID: 5
Last Run Statistics: Time:52.8706 Method:spine Processes:4 Threads:30 Hosts:283 HostsPerProcess:71 DataSources:4525 RRDsProcessed:3652
PHP Information
PHP Version: 5.3.2-1ubuntu4.18
PHP OS: Linux
PHP uname: Linux Ifra-Mngmnt 2.6.32-45-generic #102-Ubuntu SMP Wed Jan 2 22:38:04 UTC 2013 x86_64
PHP SNMP: Installed
max_execution_time: 50
memory_limit: 256M
cron at 5 min
max_connections=1000 (mysql)
All on a quad-core VM @ 2 GHz with 4 GB of RAM.
plugins: clog, reportit, aggregate, dashboard, monitor, macktrack, docs, errorimage, flowview, settings, ipsubnet, mikrotik, rrdclean, superlinks, titlechanger, thold, watermark, weathermap
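The Last Run Statistics line in the listing above can be turned into numbers to see how much slack a polling run has. This is only a sketch: parse_stats and headroom are hypothetical helpers, and the 60 used below is the Interval value reported in the same listing.

```python
STATS = ("Time:52.8706 Method:spine Processes:4 Threads:30 Hosts:283 "
         "HostsPerProcess:71 DataSources:4525 RRDsProcessed:3652")


def parse_stats(line):
    """Split 'Key:value' tokens into a dict, converting numbers where possible."""
    stats = {}
    for token in line.split():
        key, _, value = token.partition(":")
        try:
            stats[key] = float(value) if "." in value else int(value)
        except ValueError:
            stats[key] = value  # non-numeric, e.g. Method:spine
    return stats


def headroom(stats, interval_s):
    """Seconds left between the last run time and the polling interval."""
    return interval_s - stats["Time"]
```

Here headroom(parse_stats(STATS), 60) comes out at roughly 7 seconds, so a single slow script or unresponsive host could plausibly push a run past the interval. That is consistent with the occasional overrun warning, though it does not by itself identify which host or script is responsible.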