Find out on which thread spine hangs?
-
- Posts: 34
- Joined: Fri Dec 05, 2008 5:49 am
- Location: Vilnius, Lithuania
Hi,
I have a problem with spine 0.8.7. One of its threads hangs, and this causes data to be lost for the devices that come after the device on which it hangs. When I run spine with only one thread, it does not hang.
It's a rather mysterious problem and I can't find a way to debug it properly.
Any ideas?
Thanks a lot!
- TheWitness
- Developer
- Posts: 17047
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
Post upper section of Technical Support page.
TheWitness
True understanding begins only when we realize how little we truly understand...
Life is an adventure, let yours begin with Cacti!
Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages
For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?
-
- Posts: 34
- Joined: Fri Dec 05, 2008 5:49 am
- Location: Vilnius, Lithuania
Sorry for not doing this at the beginning. I tried both SPINE 0.8.7c-beta3 and SPINE 0.8.7c. Currently I mostly use SPINE 0.8.7c-beta3, because it seems to hang, and to produce the error "Timed out while processing hosts internal", less often.
My monitored network can be quite unstable because there are a lot of radio devices spread over a large area. I wonder whether there is a way to find out exactly which device causes it to hang.
Thanks
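One possible way to narrow this down from a saved debug log is to list the hosts that never reported completion. The following is only a minimal sketch: it assumes log lines carry a Host[<id>] tag and that a finished host logs a "HOST COMPLETE" message, as in the debug output quoted later in this thread. The function name and the idea of passing in the expected host IDs are illustrative, not part of spine.

```python
import re
from typing import Iterable, List

# Assumed log shape: "... Host[<id>] ... HOST COMPLETE ..." once a host is done.
HOST_RE = re.compile(r"Host\[(\d+)\]")


def hosts_without_completion(lines: Iterable[str], expected_ids: Iterable[int]) -> List[int]:
    """Return the expected host IDs that never logged a HOST COMPLETE line."""
    completed = set()
    for line in lines:
        if "HOST COMPLETE" not in line:
            continue
        match = HOST_RE.search(line)
        if match:
            completed.add(int(match.group(1)))
    # Hosts missing from the log entirely are reported too, which matters
    # when a hung thread writes nothing at all for its host.
    return sorted(set(expected_ids) - completed)
```

The expected IDs would be the range that was polled. IDs in that range that do not exist as devices will also be listed, so the result is a list of candidates rather than a verdict.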
- Attachments
-
- cacti.JPG (123.52 KiB) Viewed 3484 times
-
- Posts: 34
- Joined: Fri Dec 05, 2008 5:49 am
- Location: Vilnius, Lithuania
- TheWitness
- Developer
- Posts: 17047
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
Recompile spine using nifty popen and see if that takes care of it. I guarantee it's the scripts. I think Nifty might fix it.
If you run ./configure --help, you will find the option. Please post your findings.
TheWitness
-
- Posts: 34
- Joined: Fri Dec 05, 2008 5:49 am
- Location: Vilnius, Lithuania
Hi,
I worked on the problem a bit and recompiled spine with nifty popen.
Unfortunately this does not help.
I think it's not about scripts at all, at least according to my current findings.
I found one device while grepping the logs for the interval in which spine tends to hang. It was set to UDP availability ping, and due to the bad link to it, its availability is less than 20%. I changed its ping to ICMP, and at least for a few hours the problem seems to be gone. What I don't understand is that if I take another device with low availability and set it to UDP ping, it does not hang spine. The device does not use any external scripts or anything; it just produces two graphs of traffic usage.
Trying to poll it from the console with:
./spine -V 5 -f 2399 -l 2400 (only two hosts; 2399 is the one on which spine hangs)
output:
03/14/2009 11:43:31 PM - SPINE: Poller[0] DEBUG: UDP Ping return_code was -10, errno was 0, total_time was 1206509.8286
03/14/2009 11:43:31 PM - SPINE: Poller[0] DEBUG: UDP Ping return_code was -10, errno was 0, total_time was 12.1593
03/14/2009 11:43:31 PM - SPINE: Poller[0] DEBUG: UDP Ping return_code was -10, errno was 0, total_time was 8.1062
03/14/2009 11:43:31 PM - SPINE: Poller[0] DEBUG: UDP Ping return_code was -10, errno was 0, total_time was 10.0136
03/14/2009 11:43:31 PM - SPINE: Poller[0] Host[2399] PING Result: UDP: Ping timed out
03/14/2009 11:43:31 PM - SPINE: Poller[0] Host[2399] DEBUG: HOST COMPLETE: About to Exit Host Polling Thread Function
and spine returns to the console, so everything is fine.
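As a side note, the first total_time value in the output above is far larger than the other three. A small sketch like the following could pull those values out of a longer log so that such outliers are easy to spot. It assumes the line format shown above; the function names and the threshold are illustrative, and the unit of total_time is taken as logged, without interpretation.

```python
import re
from typing import Iterable, List, Tuple

# Matches the UDP ping debug line format shown in the output above.
PING_RE = re.compile(
    r"UDP Ping return_code was (-?\d+), errno was (-?\d+), total_time was ([0-9.]+)"
)


def parse_udp_pings(lines: Iterable[str]) -> List[Tuple[int, int, float]]:
    """Extract (return_code, errno, total_time) from UDP ping debug lines."""
    results = []
    for line in lines:
        match = PING_RE.search(line)
        if match:
            results.append(
                (int(match.group(1)), int(match.group(2)), float(match.group(3)))
            )
    return results


def suspicious_times(pings: List[Tuple[int, int, float]], limit: float) -> List[float]:
    """Return the total_time values above a caller-chosen limit."""
    return [total for _, _, total in pings if total > limit]
```

Whether such a value points to the cause of the hang or is just a logging quirk is not something the log alone can settle.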
Now if I do it like this:
./spine -V 5 -f 2399 -l 2410 (adding a few more hosts to the run)
it logs output for all the other hosts but nothing about 2399: no "Ping timed out", nothing at all, and it does not return to the console. It just hangs.
Just to be sure it is this device's fault, I try
./spine -V 5 -f 2400 -l 2410 (excluding the suspected device)
and it finishes in the blink of an eye and returns to the console.
Version:
SPINE 0.8.7c
It was set to run only two threads during these tests, but using more does not help.
It's a strange situation. Do you have any ideas what could cause it?
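The manual range tests above could be automated along these lines. This is a rough sketch under stated assumptions: it reuses the -V, -f and -l options exactly as in the commands above, treats any run that exceeds a chosen timeout as a hang, and kills it. The helper names, the default ./spine path and the timeout value are hypothetical.

```python
import subprocess
from typing import Callable, Iterable, List, Sequence, Tuple


def run_hangs(cmd: Sequence[str], timeout_s: float) -> bool:
    """Return True if cmd is still running after timeout_s seconds.

    subprocess.run kills the child process when the timeout expires.
    """
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return True
    return False


def spine_cmd(first: int, last: int, spine: str = "./spine") -> List[str]:
    """Build the same command line used in the manual tests above."""
    return [spine, "-V", "5", "-f", str(first), "-l", str(last)]


def probe_ranges(
    ranges: Iterable[Tuple[int, int]],
    timeout_s: float = 60.0,  # assumed value; choose one well above a normal run
    make_cmd: Callable[[int, int], List[str]] = spine_cmd,
) -> List[Tuple[int, int, bool]]:
    """Run each (first, last) host range and record whether it hung."""
    return [
        (first, last, run_hangs(make_cmd(first, last), timeout_s))
        for first, last in ranges
    ]
```

For example, probe_ranges([(2399, 2400), (2399, 2410), (2400, 2410)]) would repeat the three tests above. Since the hang only showed up when host 2399 was polled together with several others, probing ranges rather than single hosts seems the safer choice.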
Last edited by dainiookas on Sat Mar 14, 2009 6:06 pm, edited 1 time in total.
- TheWitness
- Developer
- Posts: 17047
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
How many threads? I'm working on spine right now. Does either host use SNMP? Also, please try again with the SVN spine. I made a few, maybe unrelated, changes. Let's start from there.
TheWitness
-
- Posts: 34
- Joined: Fri Dec 05, 2008 5:49 am
- Location: Vilnius, Lithuania
Only two threads were used during this test, but more of them give the same effect.
I will test the SVN spine tomorrow.
I'm sorry, in the previous post I wrote: "What I did is I changed it's ping to UDP and at least fo..."
What I actually meant was: "What I did is I changed it's ping to ICMP and at least fo..."
I have corrected it in the original message.
-
- Posts: 34
- Joined: Fri Dec 05, 2008 5:49 am
- Location: Vilnius, Lithuania
- TheWitness
- Developer
- Posts: 17047
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
Will you have time to look at this in a few hours, together that is?
Larry
-
- Posts: 34
- Joined: Fri Dec 05, 2008 5:49 am
- Location: Vilnius, Lithuania