Find out on which thread spine hangs?

dainiookas
Posts: 34
Joined: Fri Dec 05, 2008 5:49 am
Location: Vilnius, Lithuania

Post by dainiookas »

Hi,
I have a problem with spine 0.8.7. One of its threads hangs, and as a result data is lost for the devices that come after the one on which it hangs. When I run spine with only one thread, it does not hang.
It's rather mysterious, and I can't find a way to debug it properly.
Any ideas?

Thanks a lot!
TheWitness
Developer
Posts: 17047
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA
Contact:

Post by TheWitness »

Please post the upper section of the Technical Support page.

TheWitness
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages


For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?
dainiookas

Post by dainiookas »

Sorry for not doing this at the beginning. I tried both SPINE 0.8.7c-beta3 and SPINE 0.8.7c. Currently I mostly use SPINE 0.8.7c-beta3, because it seems to hang, and to produce the error "Timed out while processing hosts internal", less often.

My monitored network is sometimes quite unstable, because there are a lot of radio devices spread over a large area. I wonder if there is a way to find out exactly which device causes spine to hang.

Thanks
Attachment: cacti.JPG (123.52 KiB)
dainiookas

Post by dainiookas »

Just in case: the 80 processes are only a temporary setting. This is a production server, and this configuration helps keep it working.
Previously I used 6 processes with 18 threads each.
TheWitness

Post by TheWitness »

Recompile spine using nifty popen and see if that takes care of it. I guarantee it's the scripts. I think Nifty might fix it.

If you run ./configure --help, you will find the option. Please post your findings.

TheWitness
dainiookas

Post by dainiookas »

Hi,
I worked on the problem a bit and recompiled spine with nifty popen.
Unfortunately, this does not help.

I don't think it's about scripts at all, at least according to my current findings.
While grepping the logs around the times when spine usually hangs, I found one device. Its availability check was set to UDP ping and, because of the bad link to it, its availability is below 20%. I changed its ping method to ICMP and, at least for a few hours, the problem seems to be gone. What I don't understand is that if I take another device with low availability and set it to UDP ping, spine does not hang. The device does not use any external scripts or anything else; it just produces two traffic usage graphs.
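
For anyone who wants to check how such a device answers a UDP probe outside of spine, the idea can be sketched roughly like this. It is a generic illustration and not spine's own ping code: the function name `udp_ping`, the default timeout and the payload are assumptions made for the example, and the port to probe is left to the caller. On a connected UDP socket, an ICMP "port unreachable" answer is usually reported as a refused connection, which means the host itself is reachable.

```python
import socket


def udp_ping(host, port, timeout=0.5, payload=b"ping"):
    """Send one UDP probe and classify the outcome.

    Returns "reply" if a datagram came back, "refused" if the stack
    reported an ICMP error (the host is up, the port is closed), or
    "timeout" if nothing arrived within `timeout` seconds.
    All names and defaults here are hypothetical.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # A finite timeout is the point of the sketch: a probe to a
        # dead link must give up instead of blocking forever.
        sock.settimeout(timeout)
        # Connecting the UDP socket lets the OS deliver ICMP errors
        # to this socket as exceptions.
        sock.connect((host, port))
        try:
            sock.send(payload)
            sock.recv(1024)
            return "reply"
        except ConnectionRefusedError:
            return "refused"
        except socket.timeout:
            return "timeout"
```

Calling `udp_ping("192.0.2.10", 33439)` (a documentation address and an arbitrary port, both placeholders) a few times in a row should show whether the device mostly times out or answers with an ICMP error.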

Trying to poll it from the console with:
./spine -V 5 -f 2399 -l 2400 (only two hosts; 2399 is the one on which spine hangs)

output:
03/14/2009 11:43:31 PM - SPINE: Poller[0] DEBUG: UDP Ping return_code was -10, errno was 0, total_time was 1206509.8286
03/14/2009 11:43:31 PM - SPINE: Poller[0] DEBUG: UDP Ping return_code was -10, errno was 0, total_time was 12.1593
03/14/2009 11:43:31 PM - SPINE: Poller[0] DEBUG: UDP Ping return_code was -10, errno was 0, total_time was 8.1062
03/14/2009 11:43:31 PM - SPINE: Poller[0] DEBUG: UDP Ping return_code was -10, errno was 0, total_time was 10.0136
03/14/2009 11:43:31 PM - SPINE: Poller[0] Host[2399] PING Result: UDP: Ping timed out
03/14/2009 11:43:31 PM - SPINE: Poller[0] Host[2399] DEBUG: HOST COMPLETE: About to Exit Host Polling Thread Function

and spine exits back to the console. Everything is fine.

Now if I do it like this:
./spine -V 5 -f 2399 -l 2410 (add a few more hosts to the process)
it logs output for all the other hosts, but nothing about 2399: no "Ping timed out", nothing at all. It does not return to the console; it just hangs.

Just to be sure it is this device's fault, I try:
./spine -V 5 -f 2400 -l 2410 (exclude the suspected device)
It finishes in the blink of an eye and returns to the console.
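
The manual test above can be automated along these lines. This is only a sketch under a few assumptions: that spine writes its verbose log to the console as in the output shown earlier, that every host that finishes polling logs a "HOST COMPLETE" line like the one for Host[2399], and that 60 seconds is long enough to call a run hung. The helper names (`completed_hosts`, `missing_hosts`, `run_spine`) are made up for the example.

```python
import re
import subprocess

# Matches lines such as:
#   ... SPINE: Poller[0] Host[2399] DEBUG: HOST COMPLETE: About to Exit ...
HOST_COMPLETE = re.compile(r"Host\[(\d+)\].*HOST COMPLETE")


def completed_hosts(log_text):
    """Return the set of host IDs that logged a HOST COMPLETE line."""
    return {int(m.group(1)) for m in HOST_COMPLETE.finditer(log_text)}


def missing_hosts(log_text, first, last):
    """Return host IDs in [first, last] that never logged HOST COMPLETE."""
    return sorted(set(range(first, last + 1)) - completed_hosts(log_text))


def run_spine(first, last, timeout=60, spine="./spine"):
    """Run spine on a host range and return (output, hung).

    `hung` is True when spine had to be killed after `timeout` seconds
    (an assumed limit, adjust as needed).
    """
    cmd = [spine, "-V", "5", "-f", str(first), "-l", str(last)]
    try:
        done = subprocess.run(cmd, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT, timeout=timeout)
        return done.stdout.decode(errors="replace"), False
    except subprocess.TimeoutExpired as exc:
        # Output captured before the kill, if any, is kept on the exception.
        return (exc.stdout or b"").decode(errors="replace"), True
```

For example, `out, hung = run_spine(2399, 2410)` followed by `print(hung, missing_hosts(out, 2399, 2410))` would list the hosts that never finished. Note that host IDs in the range that do not exist in Cacti would also show up as missing.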

Version:
SPINE 0.8.7c
It was set to run only two threads during these tests, but increasing the number does not help.

It's a strange situation. Do you have any idea what could cause it?
Last edited by dainiookas on Sat Mar 14, 2009 6:06 pm, edited 1 time in total.
TheWitness

Post by TheWitness »

How many threads? I'm working on spine right now. Does either host do SNMP? Also, please try again with the SVN spine. I made a few, possibly unrelated, changes. Let's start from there.

TheWitness
dainiookas

Post by dainiookas »

Only two threads were used during this test, but using more gives the same result.
I will test the SVN spine tomorrow.

I'm sorry, in the previous post I wrote: "What I did is I changed it's ping to UDP and at least fo..."

What I actually meant was: "What I did is I changed it's ping to ICMP and at least fo..."

I have corrected it in the original message.
dainiookas

Post by dainiookas »

Tested: SPINE 0.8.8-alpha (didn't change the name to pre-d)

Exactly the same thing happens: as soon as I set the down device to UDP ping, the thread hangs.
TheWitness

Post by TheWitness »

Will you have time to look at this together in a few hours?
Larry
dainiookas

Post by dainiookas »

I think I will. When exactly do you want to do it?