Spine Seg fault
Hi,
I was wondering if someone could help me troubleshoot this issue.
I just added a new host to cacti. It's a VOIP box with non-standard snmp settings.
We had configured the device along with data sources etc. before realizing that the SNMP box wasn't permitting our IP (duh). However, when we added the IP to the allow list, spine immediately started segfaulting.
I ran a poller cycle in debug mode, and found this:
03/20/2009 10:55:06 AM - SPINE: Poller[0] DEBUG: Valid Thread to be Created
03/20/2009 10:55:06 AM - SPINE: Poller[0] DEBUG: The Value of Active Threads is 30
03/20/2009 10:55:06 AM - SPINE: Poller[0] DEBUG: In Poller, About to Start Polling of Host
03/20/2009 10:55:06 AM - SPINE: Poller[0] Host[157] PING: Result ICMP: Host is Alive
03/20/2009 10:55:06 AM - SPINE: Poller[0] Host[157] RECACHE: Processing 1 items in the auto reindex cache for 'IP_ADDR_OF_VOIP_BOX'
03/20/2009 10:55:06 AM - SPINE: Poller[0] FATAL: Spine Encountered a Segmentation Fault (Spine thread)
I disabled the new VOIP box in cacti and the problem went away.
OS: CentOS 5.2
Cacti version: 0.8.7d
Spine version: 0.8.7c
netsnmp version: net-snmp-5.3.1-24.el5_2.2
I couldn't try with cmd.php as we have too many hosts (takes too long)
The snmp settings I mentioned are:
snmp version 1
snmp port 162
custom OIDs .1.3.6.1.4.1.25060.1.1 and .1.3.6.1.4.1.25060.1.6
A manual snmpget seems to work fine:
# snmpget -v1 -c OUR_COMMUNITY OUR_IP:162 1.3.6.1.4.1.25060.1.6
SNMPv2-SMI::enterprises.25060.1.6 = INTEGER: 9
Is there any way of narrowing the problem further?
Thanks
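One rough way to narrow this down from the debug log is to note which Host[N] entry was logged last before each FATAL line. The sketch below is only an illustration: the function name is made up here, and with many threads running the log lines can interleave, so the last host seen is a suspect rather than proof.

```python
import re

# Matches the "Host[<id>]" token that spine writes in per-host log lines.
HOST_RE = re.compile(r"Host\[(\d+)\]")


def hosts_before_fatal(log_lines):
    """Return the host ID last seen before each segfault FATAL line.

    Hypothetical helper, not part of Cacti or spine.
    """
    suspects = []
    last_host = None
    for line in log_lines:
        match = HOST_RE.search(line)
        if match:
            last_host = int(match.group(1))
        if "FATAL" in line and "Segmentation Fault" in line:
            suspects.append(last_host)
            last_host = None
    return suspects


sample = [
    "03/20/2009 10:55:06 AM - SPINE: Poller[0] DEBUG: In Poller, About to Start Polling of Host",
    "03/20/2009 10:55:06 AM - SPINE: Poller[0] Host[157] PING: Result ICMP: Host is Alive",
    "03/20/2009 10:55:06 AM - SPINE: Poller[0] Host[157] RECACHE: Processing 1 items in the auto reindex cache for 'IP_ADDR_OF_VOIP_BOX'",
    "03/20/2009 10:55:06 AM - SPINE: Poller[0] FATAL: Spine Encountered a Segmentation Fault (Spine thread)",
]
print(hosts_before_fatal(sample))
```

On the log excerpt above this points at host 157, which matches the device that was just added.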
- TheWitness
- Developer
- Posts: 17059
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
I'm checking in a new SVN. I reviewed the code and see what might be a few corner cases. Please test tomorrow.
TheWitness
True understanding begins only when we realize how little we truly understand...
Life is an adventure, let yours begin with Cacti!
Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages
For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?
Thanks for the tip! I didn't know one could run spine manually on a single host.
With the host disabled, spine didn't segfault, though it didn't really do anything (expected behavior).
I reactivated the host, and spine segfaulted again. (This is still with 0.8.7c)
Verbosity 5 didn't seem to add any new information:
Code:
DEBUG: In Poller, About to Start Polling of Host
Host[0] DEBUG: HOST COMPLETE: About to Exit Host Polling Thread Function
DEBUG: The Value of Active Threads is 1
Host[157] PING: Result ICMP: Host is Alive
Host[157] RECACHE: Processing 1 items in the auto reindex cache for 'OURIP'
FATAL: Spine Encountered a Segmentation Fault (Spine thread)
I ran an strace on the spine process and got the following:
Code:
[pid 11697] write(6, "R\1\0\0\3UPDATE host SET status=\'3\',"..., 342) = 342
[pid 11697] read(6, "0\0\0\1\0\1\0\2\0\1\0(Rows matched: 1 Cha"..., 16384) = 52
[pid 11697] poll([{fd=6, events=POLLIN|POLLPRI}], 1, 0) = 0
[pid 11697] write(6, "[\0\0\0\3SELECT data_query_id, actio"..., 95) = 95
[pid 11697] read(6, "\1\0\0\1\5U\0\0\2\3def\tcacti_new\16poller_r"..., 16384) = 443
[pid 11697] sendmsg(7, {msg_name(16)={sa_family=AF_INET, sin_port=htons(162), sin_addr=inet_addr("OURIP")}, msg_iov(1)=[{"0,\2\1\0\4\tOURCOMMUNITY\240\34\2\4sviP\2\1\0\2\1\0000\16"..., 46}], msg_controllen=24, {cmsg_len=24, cmsg_level=SOL_IP, cmsg_type=, ...}, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 46
[pid 11697] gettimeofday({1238069175, 564821}, NULL) = 0
[pid 11697] gettimeofday({1238069175, 564905}, NULL) = 0
[pid 11697] select(8, [7], NULL, NULL, {0, 499916}) = 1 (in [7], left {0, 499916})
[pid 11697] recvmsg(7, {msg_name(16)={sa_family=AF_INET, sin_port=htons(162), sin_addr=inet_addr("OURIP")}, msg_iov(1)=[{"0\36\2\1\0\4\tOURCOMMUNITY\242\16\2\4sviP\2\1\0\2\1\0000\0"..., 65536}], msg_controllen=0, msg_flags=0}, 0) = 32
[pid 11697] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
[pid 11697] rt_sigaction(SIGSEGV, {SIG_DFL}, {0x8054a80, [SEGV], SA_RESTART}, 8) = 0
[pid 11697] write(1, "FATAL: Spine Encountered a Segme"..., 61FATAL: Spine Encountered a Segmentation Fault (Spine thread)
) = 61
So it seems to be segfaulting right after recvmsg.
I checked out the latest SVN and it's segfaulting at the exact same spot.
Also, admittedly this is the first time I've used SVN, but I had some problems compiling spine.
I had to overwrite the following files with versions from spine-0.8.7c, as they were generating errors. In all three cases the files were calling "link" with only one argument, but link takes two arguments. Am I doing something wrong?
config/config.sub
-------
configure: error: cannot run /bin/sh config/config.sub
#/bin/sh config/config.sub
link: missing operand after `/usr/share/libtool/config.sub'
config/config.guess
-------
#./configure
checking build system type... link: missing operand after `/usr/share/libtool/config.guess'
(configure finished successfully after replacing these two files)
libtool
-------
#make
........
/bin/sh ./libtool --tag=CC --mode=link gcc -I/usr/include/net-snmp -I/usr/include/net-snmp/.. -I/usr/include/mysql -g -O2 -L/usr/lib -L/usr/lib/mysql -o spine sql.o spine.o util.o snmp.o locks.o poller.o nft_popen.o php.o ping.o keywords.o error.o -lnetsnmp -lmysqlclient_r -lmysqlclient_r -lcrypto -lz -lpthread -lm
link: missing operand after `/usr/share/libtool/ltmain.sh'
After replacing these three files with versions from 0.8.7c spine compiled fine.
Thanks for looking at this, I REALLY appreciate it!
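A possible explanation for the "link: missing operand" errors, offered as an assumption rather than something confirmed in this thread: Subversion represents a versioned symbolic link as a small file whose content is "link <target>". If such a file ends up in the working copy as plain text, running it through /bin/sh invokes the link command with a single operand, which would produce exactly this message. The sketch below uses a made-up helper name and only checks whether a file looks like such a stub.

```python
import os
import tempfile

LINK_PREFIX = "link "


def svn_link_stub_target(path):
    """Return the target if `path` is a plain file holding 'link <target>', else None.

    Hypothetical helper; assumes the Subversion symlink stub format described above.
    """
    if os.path.islink(path) or not os.path.isfile(path):
        return None
    with open(path, "rb") as handle:
        head = handle.read(4096)
    text = head.decode("utf-8", errors="replace").strip()
    if text.startswith(LINK_PREFIX) and "\n" not in text:
        return text[len(LINK_PREFIX):]
    return None


# Demonstration with throwaway files instead of a real checkout.
with tempfile.TemporaryDirectory() as workdir:
    stub = os.path.join(workdir, "config.sub")
    with open(stub, "w") as handle:
        handle.write("link /usr/share/libtool/config.sub")
    script = os.path.join(workdir, "config.guess")
    with open(script, "w") as handle:
        handle.write("#! /bin/sh\n# a real script\n")
    print(svn_link_stub_target(stub))
    print(svn_link_stub_target(script))
```

If the files in the checkout turn out to be stubs like this, copying the real files over them (as was done here with the 0.8.7c versions) is a reasonable workaround.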
- TheWitness
- Developer
- Posts: 17059
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
Can you deploy 0.8.7d-pre2 and test again? It can be found in the announcement forum. Then dual post for those in both threads.
TheWitness
- TheWitness
- Developer
- Posts: 17059
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
mxtommy wrote:
By dual post I assume you mean I should update both threads, if not then sorry in advance
There's no change with pre2. It still segfaults at the same spot.
Thanks, Thomas

Can you do a gotomeeting in the am? That is around 7:00 EST (GMT-5)?
TheWitness
- TheWitness
- Developer
- Posts: 17059
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
Monday won't work. How about Sunday?
- TheWitness
- Developer
- Posts: 17059
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
Agreed, watch your email. It's dinner time so you won't see anything till sometime Saturday afternoon.
TheWitness
- TheWitness
- Developer
- Posts: 17059
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
OK, mxtommy and I resolved his problem. It was interesting: an SNMP-enabled host that responded to a sysUptime request with no SNMP errors, but with a null response (blank string).
So, I have revised snmp.c to handle this corner case. I will post here and this will be either in the release code or the pre3, whichever I decide to do.
NOTE: I STILL NEED AN SNMPV3 TESTER! One who can do an online session as was done with mxtommy today.
Thanks Thomas!!
TheWitness
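The 32-byte reply captured in the strace earlier in the thread appears consistent with this diagnosis. The sketch below decodes those bytes as an SNMPv1 message and counts the variable bindings. It is only an illustration of the wire format, not the change made to snmp.c; the 9-byte community string is a placeholder (the real one was redacted), and the parser handles short-form BER lengths only.

```python
# Bytes of the recvmsg payload from the strace, with a placeholder community.
RESPONSE = (
    bytes([0x30, 0x1E, 0x02, 0x01, 0x00, 0x04, 0x09])
    + b"community"  # placeholder: the real 9-byte community was redacted
    + bytes([0xA2, 0x0E, 0x02, 0x04])
    + b"sviP"  # request-id bytes as shown in the strace
    + bytes([0x02, 0x01, 0x00, 0x02, 0x01, 0x00, 0x30, 0x00])
)


def read_tlv(buf, pos):
    """Read one BER element with a short-form length; return (tag, value, next_pos)."""
    tag = buf[pos]
    length = buf[pos + 1]
    if length & 0x80:
        raise ValueError("long-form BER lengths are not handled in this sketch")
    start = pos + 2
    end = start + length
    if end > len(buf):
        raise ValueError("truncated element")
    return tag, buf[start:end], end


def summarize_snmp_v1(packet):
    """Return the PDU type, error-status and varbind count of an SNMPv1 message."""
    tag, message, _ = read_tlv(packet, 0)
    if tag != 0x30:
        raise ValueError("outer element is not a SEQUENCE")
    _, _version, pos = read_tlv(message, 0)
    _, _community, pos = read_tlv(message, pos)
    pdu_tag, pdu, _ = read_tlv(message, pos)
    _, _request_id, pos = read_tlv(pdu, 0)
    _, error_status, pos = read_tlv(pdu, pos)
    _, _error_index, pos = read_tlv(pdu, pos)
    _, varbind_list, _ = read_tlv(pdu, pos)
    count = 0
    vpos = 0
    while vpos < len(varbind_list):
        _, _varbind, vpos = read_tlv(varbind_list, vpos)
        count += 1
    return {
        "pdu_type": pdu_tag,
        "error_status": int.from_bytes(error_status, "big"),
        "varbind_count": count,
    }


summary = summarize_snmp_v1(RESPONSE)
print(summary)
if summary["error_status"] == 0 and summary["varbind_count"] == 0:
    print("no error reported, but there is no value to read")
```

The decoded reply is a GetResponse (tag 0xA2) with error-status 0 and an empty variable-bindings list, so a caller that assumes at least one binding is present would have nothing valid to read.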
- Attachments
- snmp.c (20.38 KiB) Downloaded 438 times
- soloslinger
- Posts: 32
- Joined: Fri Jan 19, 2007 2:11 pm
I am having a similar problem, if folks are still reading this thread I won't hafta create a new one.
I am running spine and having segfaults as well, though my cacti log doesn't show it in the way the original poster does. Some of my graphs are updated every minute, others are updated every five.
The seg faults occur every 5 minutes and it doesn't update my graphs while this happens.
Code:
This GDB was configured as "i386-marcel-freebsd".
Core was generated by `spine'.
Program terminated with signal 11, Segmentation fault.
#0 0x282c7537 in ?? ()
Code:
03/31/2009 05:49:59 AM - SYSTEM STATS: Time:118.2700 Method:spine Processes:2 Threads:1 Hosts:36 HostsPerProcess:18 DataSources:342 RRDsProcessed:156
03/31/2009 05:49:59 AM - SYSTEM STATS: Time:0.3631 Method:spine Processes:2 Threads:1 Hosts:36 HostsPerProcess:18 DataSources:342 RRDsProcessed:49
03/31/2009 05:50:03 AM - SYSTEM STATS: Time:2.6043 Method:spine Processes:2 Threads:1 Hosts:36 HostsPerProcess:18 DataSources:420 RRDsProcessed:208
03/31/2009 05:51:04 AM - SYSTEM STATS: Time:3.5402 Method:spine Processes:2 Threads:1 Hosts:36 HostsPerProcess:18 DataSources:420 RRDsProcessed:170
03/31/2009 05:52:05 AM - SYSTEM STATS: Time:4.5735 Method:spine Processes:2 Threads:1 Hosts:36 HostsPerProcess:18 DataSources:420 RRDsProcessed:175
03/31/2009 05:54:59 AM - SYSTEM STATS: Time:118.2402 Method:spine Processes:2 Threads:1 Hosts:36 HostsPerProcess:18 DataSources:420 RRDsProcessed:156
03/31/2009 05:54:59 AM - SYSTEM STATS: Time:0.3494 Method:spine Processes:2 Threads:1 Hosts:36 HostsPerProcess:18 DataSources:420 RRDsProcessed:49
03/31/2009 05:55:03 AM - SYSTEM STATS: Time:2.6377 Method:spine Processes:2 Threads:1 Hosts:36 HostsPerProcess:18 DataSources:338 RRDsProcessed:209
And when this happens, the cacti log just repeats every second:
Code:
03/31/2009 01:59:38 AM - CMDPHP: Poller[0] DEBUG: SQL Assoc: "select poller_id,end_time from poller_time where poller_id=0"
03/31/2009 01:59:38 AM - CMDPHP: Poller[0] DEBUG: SQL Assoc: "select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
Code:
03/31/2009 05:56:00 AM - POLLER: Poller[0] WARNING: Poller Output Table not Empty. Potential Data Source Issues for Data Sources: traffic_in(DS[746]), traffic_out(DS[746]), traffic_in(DS[747]), traffic_out(DS[747]), traffic_in(DS[777]), traffic_out(DS[777]), traffic_in(DS[778]), traffic_out(DS[778]), temp(DS[796]), temp(DS[801]), traffic_in(DS[816]), traffic_out(DS[816]), traffic_in(DS[843]), traffic_out(DS[843]), traffic_in(DS[853]), traffic_out(DS[853]), traffic_in(DS[866]), traffic_out(DS[866]), traffic_in(DS[871]), traffic_out(DS[871]), traffic_in(DS[875]), traffic_out(DS[875]), traffic_in(DS[876]), traffic_out(DS[876]), traffic_in(DS[877]), traffic_out(DS[877])
This is the only warning in the cacti log relating to this.
Is this a related issue? Any ideas??
soloslinger
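To see how regularly the long runs occur, the SYSTEM STATS lines can be filtered by their Time value. This is a small sketch with a made-up function name; the 60-second threshold is an arbitrary assumption chosen only to separate the roughly 118-second runs from the normal ones.

```python
import re

# Captures the Time field of a Cacti "SYSTEM STATS" log line.
STATS_RE = re.compile(r"SYSTEM STATS: Time:([\d.]+)")


def slow_runs(log_lines, threshold_seconds=60.0):
    """Return (timestamp, seconds) for poller runs slower than the threshold.

    Hypothetical helper; the default threshold is an assumption.
    """
    slow = []
    for line in log_lines:
        match = STATS_RE.search(line)
        if match and float(match.group(1)) > threshold_seconds:
            timestamp = line.split(" - ", 1)[0]
            slow.append((timestamp, float(match.group(1))))
    return slow


sample = [
    "03/31/2009 05:49:59 AM - SYSTEM STATS: Time:118.2700 Method:spine Processes:2 Threads:1 Hosts:36 HostsPerProcess:18 DataSources:342 RRDsProcessed:156",
    "03/31/2009 05:49:59 AM - SYSTEM STATS: Time:0.3631 Method:spine Processes:2 Threads:1 Hosts:36 HostsPerProcess:18 DataSources:342 RRDsProcessed:49",
    "03/31/2009 05:54:59 AM - SYSTEM STATS: Time:118.2402 Method:spine Processes:2 Threads:1 Hosts:36 HostsPerProcess:18 DataSources:420 RRDsProcessed:156",
]
for timestamp, seconds in slow_runs(sample):
    print(timestamp, seconds)
```

In the excerpt above the slow runs end at 05:49:59 and 05:54:59, five minutes apart, which lines up with the five-minute segfault pattern described in the post.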
plugin: 2.1
cacti: 0.8.7b
apache 2.0
php: 5.2.5
mysql: 5.1
os: freebsd 6.2