Spine 0.8.7d-pre2 Available for Testing

TheWitness
Developer
Posts: 17047
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Post by TheWitness »

All,

I have provided a pre-release of Spine 0.8.7d for user preview prior to release. There are multiple bug fixes in this release, related to pinging of hosts and other general performance issues.

I would like to target users of SNMPv3 in this pre-release, as well as users who periodically encounter script timeout issues. This release should correct the SNMPv3 issues that multiple users have been experiencing and also provide a workaround for script timeouts.

For users who experience script timeouts, I would be interested to know whether your child script processes exit as well after spine exits. If they stay running and the PGID becomes 1, I would also like to know. The intent is that they terminate with their spine parent. If you encounter a PGID becoming 1, please post an update below, and then rerun configure with the --enable-nifty-popen option to resolve this issue.
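One way to run the check described above, as a rough sketch: after spine exits, list any leftover script processes and flag those whose parent PID or process group ID is 1. The `ps -eo pid,ppid,pgid,args` column selection is standard; the `/usr/share/cacti/scripts/` match string and the function names are assumptions for illustration and are not part of spine.

```python
import subprocess

def find_orphans(ps_output, needle):
    """Return (pid, ppid, pgid, args) for processes whose command line
    contains `needle` and whose parent PID or process group ID is 1."""
    orphans = []
    for line in ps_output.splitlines():
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue
        pid, ppid, pgid, args = parts
        # The header row ("PID PPID PGID COMMAND") is skipped here.
        if not (pid.isdigit() and ppid.isdigit() and pgid.isdigit()):
            continue
        if needle in args and (int(ppid) == 1 or int(pgid) == 1):
            orphans.append((int(pid), int(ppid), int(pgid), args))
    return orphans

def list_processes():
    # Columns: pid, parent pid, process group id, full command line.
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,pgid,args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

def main(needle="/usr/share/cacti/scripts/"):
    # `needle` is an assumed script path; adjust it to your install.
    for pid, ppid, pgid, args in find_orphans(list_processes(), needle):
        print(f"PID {pid} PPID {ppid} PGID {pgid}: {args}")
```

Calling `main()` shortly after a polling cycle ends prints nothing when every child script exited with its spine parent.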

TheWitness
Attachments
cacti-spine-0.8.7d-pre2.tar.gz
Linux/UNIX Source
(708.22 KiB) Downloaded 3397 times
cacti-spine-0.8.7d-pre2.zip
Windows Source
(810.1 KiB) Downloaded 3288 times
Last edited by TheWitness on Thu Apr 09, 2009 7:58 pm, edited 4 times in total.
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of dozens of Cacti plugins and customizations. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages


For those wondering, I'm still here, but lost in the shadows. Yearning for fewer bugs. Who wants a Cacti 1.3/2.0? Streams anyone?
skinty
Posts: 41
Joined: Sat Aug 12, 2006 6:47 pm

My results so far

Post by skinty »

Code: Select all

03/25/2009 09:27:32 AM - AUTH LOGIN: User 'admin' Authenticated
03/25/2009 09:27:14 AM - SYSTEM STATS: Time:132.3853 Method:spine Processes:8 Threads:20 Hosts:364 HostsPerProcess:46 DataSources:19010 RRDsProcessed:9269
03/25/2009 09:27:09 AM - SPINE: Poller[0] Host[268] ERROR: The POPEN timed out
03/25/2009 09:26:59 AM - SPINE: Poller[0] Host[268] ERROR: Empty result [24.97.168.219]: '/usr/bin/php -q /usr/share/cacti/scripts/query_host_isdncalls.php 24.97.168.219 feedme 2 get calls_out 0'
03/25/2009 09:26:50 AM - SPINE: Poller[0] Host[268] ERROR: Empty result [24.97.168.219]: '/usr/bin/php -q /usr/share/cacti/scripts/query_host_isdncalls.php 24.97.168.219 feedme 2 get calls_in 0'
03/25/2009 09:26:41 AM - SPINE: Poller[0] Host[268] ERROR: The POPEN timed out
03/25/2009 09:26:33 AM - AUTH LOGIN: User 'admin' Authenticated
03/25/2009 09:26:31 AM - SPINE: Poller[0] Host[268] ERROR: The POPEN timed out
03/25/2009 09:26:21 AM - SPINE: Poller[0] Host[268] ERROR: The POPEN timed out
03/25/2009 09:26:11 AM - SPINE: Poller[0] Host[268] ERROR: The POPEN timed out
03/25/2009 09:26:01 AM - SPINE: Poller[0] Host[268] ERROR: The POPEN timed out
03/25/2009 09:25:58 AM - SPINE: Poller[0] Host[408] ERROR: The POPEN timed out
03/25/2009 09:25:51 AM - SPINE: Poller[0] Host[268] ERROR: The POPEN timed out
03/25/2009 09:25:46 AM - SPINE: Poller[0] Host[529] ERROR: The POPEN timed out
03/25/2009 09:25:41 AM - SPINE: Poller[0] Host[268] ERROR: The POPEN timed out
03/25/2009 09:25:36 AM - SPINE: Poller[0] Host[477] ERROR: The POPEN timed out
03/25/2009 09:25:36 AM - SPINE: Poller[0] Host[530] ERROR: The POPEN timed out
03/25/2009 09:25:31 AM - SPINE: Poller[0] Host[559] ERROR: The POPEN timed out
03/25/2009 09:25:30 AM - SPINE: Poller[0] Host[467] ERROR: The POPEN timed out
03/25/2009 09:25:30 AM - SPINE: Poller[0] Host[465] ERROR: The POPEN timed out
03/25/2009 09:25:29 AM - SPINE: Poller[0] Host[427] ERROR: The POPEN timed out
03/25/2009 09:25:29 AM - SPINE: Poller[0] Host[429] ERROR: The POPEN timed out
03/25/2009 09:25:28 AM - SPINE: Poller[0] Host[516] ERROR: The POPEN timed out
03/25/2009 09:25:28 AM - SPINE: Poller[0] Host[515] ERROR: The POPEN timed out
03/25/2009 09:25:28 AM - SPINE: Poller[0] Host[518] ERROR: The POPEN timed out
03/25/2009 09:25:28 AM - SPINE: Poller[0] Host[506] ERROR: The POPEN timed out
03/25/2009 09:25:27 AM - SPINE: Poller[0] Host[563] ERROR: The POPEN timed out
03/25/2009 09:25:27 AM - SPINE: Poller[0] Host[313] ERROR: The POPEN timed out
03/25/2009 09:25:27 AM - SPINE: Poller[0] Host[508] ERROR: The POPEN timed out
03/25/2009 09:25:26 AM - SPINE: Poller[0] Host[557] ERROR: The POPEN timed out
03/25/2009 09:25:26 AM - SPINE: Poller[0] Host[426] ERROR: The POPEN timed out
03/25/2009 09:25:26 AM - SPINE: Poller[0] Host[499] ERROR: The POPEN timed out
03/25/2009 09:25:26 AM - SPINE: Poller[0] Host[237] ERROR: The POPEN timed out
03/25/2009 09:25:26 AM - SPINE: Poller[0] Host[481] ERROR: The POPEN timed out
03/25/2009 09:25:23 AM - SPINE: Poller[0] Host[468] ERROR: The POPEN timed out
03/25/2009 09:25:23 AM - SPINE: Poller[0] Host[470] ERROR: The POPEN timed out
03/25/2009 09:25:20 AM - SPINE: Poller[0] WARNING: SS[0] The PHP Script Server did not respond in time and will therefore be restarted
03/25/2009 09:25:16 AM - SPINE: Poller[0] Host[217] ERROR: The POPEN timed out
03/25/2009 09:25:16 AM - SPINE: Poller[0] Host[216] ERROR: The POPEN timed out
03/25/2009 09:25:12 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[19] FAILED: No SNMP Session
03/25/2009 09:25:12 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[1] FAILED: No SNMP Session
03/25/2009 09:20:33 AM - SPINE: Poller[0] Host[398] ERROR: The POPEN timed out
03/25/2009 09:20:28 AM - SPINE: Poller[0] Host[425] ERROR: The POPEN timed out
03/25/2009 09:20:26 AM - SPINE: Poller[0] Host[382] ERROR: The POPEN timed out
03/25/2009 09:20:25 AM - SPINE: Poller[0] Host[445] ERROR: The POPEN timed out
03/25/2009 09:20:17 AM - SPINE: Poller[0] Host[532] ERROR: The POPEN timed out
03/25/2009 09:20:17 AM - SPINE: Poller[0] Host[530] ERROR: The POPEN timed out
03/25/2009 09:20:15 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[19] FAILED: No SNMP Session
03/25/2009 09:20:15 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[1] FAILED: No SNMP Session
03/25/2009 09:20:13 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:20:13 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:20:13 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:20:13 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:20:13 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:20:12 AM - SPINE: Poller[0] FATAL: Spine Encountered a Segmentation Fault (Spine thread)
03/25/2009 09:15:39 AM - SPINE: Poller[0] Host[399] ERROR: The POPEN timed out
03/25/2009 09:15:32 AM - SPINE: Poller[0] Host[313] ERROR: The POPEN timed out
03/25/2009 09:15:28 AM - AUTH LOGIN: User 'admin' Authenticated
03/25/2009 09:15:26 AM - SPINE: Poller[0] Host[389] ERROR: The POPEN timed out
03/25/2009 09:15:24 AM - SPINE: Poller[0] Host[388] ERROR: The POPEN timed out
03/25/2009 09:15:22 AM - SPINE: Poller[0] Host[313] ERROR: The POPEN timed out
03/25/2009 09:15:18 AM - SPINE: Poller[0] WARNING: SS[0] The PHP Script Server did not respond in time and will therefore be restarted
03/25/2009 09:15:10 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[19] FAILED: No SNMP Session
03/25/2009 09:15:10 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[1] FAILED: No SNMP Session
03/25/2009 09:15:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:15:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:15:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:15:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:15:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:15:06 AM - SPINE: Poller[0] FATAL: Spine Encountered a Segmentation Fault (Spine thread)
03/25/2009 09:15:00 AM - SYSTEM STATS: Time:299.1729 Method:spine Processes:8 Threads:20 Hosts:364 HostsPerProcess:46 DataSources:19010 RRDsProcessed:7145
03/25/2009 09:15:00 AM - POLLER: Poller[0] Maximum runtime of 298 seconds exceeded. Exiting.
03/25/2009 09:10:28 AM - SPINE: Poller[0] Host[399] ERROR: The POPEN timed out
03/25/2009 09:10:23 AM - SPINE: Poller[0] WARNING: SS[0] The PHP Script Server did not respond in time and will therefore be restarted
03/25/2009 09:10:10 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[19] FAILED: No SNMP Session
03/25/2009 09:10:10 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[1] FAILED: No SNMP Session
03/25/2009 09:10:08 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:08 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:08 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:08 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:08 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:08 AM - SPINE: Poller[0] FATAL: Spine Encountered a Segmentation Fault (Spine thread)
03/25/2009 09:10:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:10:06 AM - SPINE: Poller[0] FATAL: Spine Encountered a Segmentation Fault (Spine thread)
03/25/2009 09:10:00 AM - SYSTEM STATS: Time:298.8048 Method:spine Processes:8 Threads:20 Hosts:364 HostsPerProcess:46 DataSources:19010 RRDsProcessed:6890
03/25/2009 09:10:00 AM - POLLER: Poller[0] Maximum runtime of 298 seconds exceeded. Exiting.
03/25/2009 09:05:26 AM - SPINE: Poller[0] Host[564] ERROR: The POPEN timed out
03/25/2009 09:05:24 AM - SPINE: Poller[0] Host[447] ERROR: The POPEN timed out
03/25/2009 09:05:23 AM - SPINE: Poller[0] Host[313] ERROR: The POPEN timed out
03/25/2009 09:05:21 AM - SPINE: Poller[0] WARNING: SS[0] The PHP Script Server did not respond in time and will therefore be restarted
03/25/2009 09:05:17 AM - SPINE: Poller[0] Host[532] ERROR: The POPEN timed out
03/25/2009 09:05:09 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[19] FAILED: No SNMP Session
03/25/2009 09:05:09 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[1] FAILED: No SNMP Session
03/25/2009 09:05:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:06 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:06 AM - SPINE: Poller[0] FATAL: Spine Encountered a Segmentation Fault (Spine thread)
03/25/2009 09:05:05 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:05 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:05 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:05 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:05 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:05:05 AM - SPINE: Poller[0] FATAL: Spine Encountered a Segmentation Fault (Spine thread)
03/25/2009 09:03:53 AM - SYSTEM STATS: Time:230.9602 Method:spine Processes:8 Threads:20 Hosts:364 HostsPerProcess:46 DataSources:19010 RRDsProcessed:9324
The first two runs might have been skewed by not properly killing the spine / php poller process. When I killed it with the nifty-popen option enabled, I got a ton of errors, but it processed all RRDs and seemed to work properly. The run at 9:05 is the first after the change to spine 0.8.7d, and the run at 9:25 is the first after the change to popen.

My 9:30 run looks very promising though:

Code: Select all

03/25/2009 09:31:39 AM - SYSTEM STATS: Time:97.2300 Method:spine Processes:8 Threads:20 Hosts:364 HostsPerProcess:46 DataSources:19010 RRDsProcessed:9295
03/25/2009 09:30:41 AM - SPINE: Poller[0] Host[84] ERROR: The POPEN timed out
03/25/2009 09:30:35 AM - SPINE: Poller[0] Host[399] ERROR: The POPEN timed out
03/25/2009 09:30:33 AM - AUTH LOGIN: User 'admin' Authenticated
03/25/2009 09:30:25 AM - SPINE: Poller[0] Host[313] ERROR: The POPEN timed out
03/25/2009 09:30:25 AM - SPINE: Poller[0] Host[205] ERROR: The POPEN timed out
03/25/2009 09:30:24 AM - SPINE: Poller[0] Host[390] ERROR: The POPEN timed out
03/25/2009 09:30:24 AM - SPINE: Poller[0] Host[307] ERROR: The POPEN timed out
03/25/2009 09:30:24 AM - SPINE: Poller[0] Host[373] ERROR: The POPEN timed out
03/25/2009 09:30:20 AM - SPINE: Poller[0] WARNING: SS[0] The PHP Script Server did not respond in time and will therefore be restarted
03/25/2009 09:30:20 AM - SPINE: Poller[0] Host[508] ERROR: The POPEN timed out
03/25/2009 09:30:19 AM - SPINE: Poller[0] Host[5] ERROR: The POPEN timed out
03/25/2009 09:30:19 AM - SPINE: Poller[0] Host[474] ERROR: The POPEN timed out
03/25/2009 09:28:17 AM - SYSTEM MACTRACK STATS: Time:62.5497 ConcurrentProcesses:7 Devices:1
However, the 9:35 run seemed to have a lot more errors:

Code: Select all

03/25/2009 09:36:01 AM - SPINE: Poller[0] WARNING: SS[0] The PHP Script Server did not respond in time and will therefore be restarted
03/25/2009 09:35:50 AM - SPINE: Poller[0] Host[84] ERROR: The POPEN timed out
03/25/2009 09:35:44 AM - SPINE: Poller[0] Host[559] ERROR: The POPEN timed out
03/25/2009 09:35:40 AM - SPINE: Poller[0] Host[84] ERROR: The POPEN timed out
03/25/2009 09:35:37 AM - SPINE: Poller[0] WARNING: SS[1] The PHP Script Server did not respond in time and will therefore be restarted
03/25/2009 09:35:34 AM - SPINE: Poller[0] Host[559] ERROR: The POPEN timed out
03/25/2009 09:35:28 AM - SPINE: Poller[0] Host[532] ERROR: The POPEN timed out
03/25/2009 09:35:27 AM - SPINE: Poller[0] Host[530] ERROR: The POPEN timed out
03/25/2009 09:35:27 AM - SPINE: Poller[0] Host[33] ERROR: The POPEN timed out
03/25/2009 09:35:26 AM - SPINE: Poller[0] Host[313] ERROR: The POPEN timed out
03/25/2009 09:35:23 AM - SPINE: Poller[0] Host[427] ERROR: The POPEN timed out
03/25/2009 09:35:23 AM - SPINE: Poller[0] Host[419] ERROR: The POPEN timed out
03/25/2009 09:35:21 AM - SPINE: Poller[0] WARNING: SS[0] The PHP Script Server did not respond in time and will therefore be restarted
03/25/2009 09:35:21 AM - SPINE: Poller[0] Host[508] ERROR: The POPEN timed out
03/25/2009 09:35:20 AM - SPINE: Poller[0] Host[475] ERROR: The POPEN timed out
03/25/2009 09:35:18 AM - SPINE: Poller[0] Host[532] ERROR: The POPEN timed out
03/25/2009 09:35:17 AM - SPINE: Poller[0] Host[530] ERROR: The POPEN timed out
03/25/2009 09:35:17 AM - SPINE: Poller[0] WARNING: SS[0] The PHP Script Server did not respond in time and will therefore be restarted
03/25/2009 09:35:17 AM - SPINE: Poller[0] Host[33] ERROR: The POPEN timed out
03/25/2009 09:35:10 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[19] FAILED: No SNMP Session
03/25/2009 09:35:10 AM - SPINE: Poller[0] WARNING: Host[322] DataQuery[1] FAILED: No SNMP Session
03/25/2009 09:35:08 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:35:08 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:35:08 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:35:08 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:35:07 AM - PHPSVR: Poller[0] ERROR: Input Expected, Script Server Terminating
03/25/2009 09:35:07 AM - SPINE: Poller[0] FATAL: Spine Encountered a Segmentation Fault (Spine thread) 
-ryan
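To see which hosts account for most of the timeouts in logs like these, the entries can be tallied per host. This is a minimal sketch that assumes the log line layout shown in this post; the function name and the regular expression are illustrative and not part of Cacti.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... - SPINE: Poller[0] Host[268] ERROR: The POPEN timed out
POPEN_RE = re.compile(
    r"SPINE: Poller\[\d+\] Host\[(\d+)\] ERROR: The POPEN timed out"
)

def popen_timeouts_by_host(log_text):
    """Count 'The POPEN timed out' errors per host ID."""
    counts = Counter()
    for line in log_text.splitlines():
        match = POPEN_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts

def report(log_text):
    # Print hosts with the most timeouts first.
    for host_id, count in popen_timeouts_by_host(log_text).most_common():
        print(f"Host[{host_id}]: {count} POPEN timeouts")
```

Feeding the Cacti log text to `report()` gives a quick view of whether a handful of hosts (and therefore a handful of scripts) dominate the errors.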
Howie
Cacti Guru User
Posts: 5508
Joined: Thu Sep 16, 2004 5:53 am
Location: United Kingdom

Post by Howie »

I had given up on spine after it started to take 300 seconds to poll when I switched to a 64-bit CPU, apparently related to the (small) number of down devices at the time. It sounds like this release might help me with that issue... is that likely?
Weathermap 0.98a is out! & QuickTree 1.0. Superlinks is over there now (and built-in to Cacti 1.x).
Some Other Cacti tweaks, including strip-graphs, icons and snmp/netflow stuff.
(Let me know if you have UK DevOps or Network Ops opportunities, too!)

Post by TheWitness »

Ryan (skinty),

I need some details on the hosts and scripts in question. My first observation is that for the popens, I need better logging. Beyond that, here are my questions:

SNMP Timeouts:
1) For SNMP Timeouts on Hosts 529 and 477, are these snmpv3 hosts?
2) For those same hosts, if you increase the timeout, does the problem go away?
3) If you reduce the Max OIDs, does the problem go away?

For Re-Index Segmentation Faults:
1) Is HostID 322 an SNMP Host?
2) Is HostID 322 an SNMPv3 Host?
3) For HostID 322, what is the Re-Index Method for the Query?

For POPEN Timeouts:
1) For each host, are the scripts for those hosts notoriously slow?

Once you answer these, to eliminate the segmentation faults, attempt to find out what is causing the HostID 322 problem. It may be that it is an orphaned item in the poller_reindex table, or it could be that you have the reindex method set to something like Uptime Goes Backwards for a Non-SNMP host. Need those details. It's still a bug, but I need the fault domain to correct.

Regards,

Larry

Post by TheWitness »

Howie wrote:I had given up on spine after it started to take 300 seconds to poll when I switched to a 64-bit CPU, apparently related to the (small) number of down devices at the time. It sounds like this release might help me with that issue... is that likely?
That was my hope. You can run in R/O mode for a while and simply log the messages. Cron this:

Code: Select all

*/5 * * * * /usr/local/spine/bin/spine -R >> /tmp/myspine.out 2>&1
If you have 1 minute polling, you may not want to do this though.

Otherwise, this way it can run side-by-side with cmd.php and not cause interference with the main Cacti service.
Last edited by TheWitness on Wed Mar 25, 2009 10:58 am, edited 1 time in total.

Post by Howie »

TheWitness wrote:That was my hope. You can run in R/O mode for a while and simply log the messages. Cron this:

Code: Select all

*/5 * * * * /usr/local/spine/bin/spine -R >> /tmp/myspine.out 2>&1
Otherwise, this way it can run side-by-side with cmd.php and not cause interference with the main Cacti service.
Ah, cool. I didn't realise this was possible. I'll try and get it going soon, time permitting of course :roll:

Post by TheWitness »

Howie wrote:
TheWitness wrote:That was my hope. You can run in R/O mode for a while and simply log the messages. Cron this:

Code: Select all

*/5 * * * * /usr/local/spine/bin/spine -R >> /tmp/myspine.out 2>&1
Otherwise, this way it can run side-by-side with cmd.php and not cause interference with the main Cacti service.
Ah, cool. I didn't realise this was possible. I'll try and get it going soon, time permitting of course :roll:
Howie, make sure you change my bad syntax. Should be ">>" and not ">".

Larry

Post by skinty »

For 529 with the popen fix switch
03/25/2009 03:20:36 PM - SPINE: Poller[0] Host[529] DEBUG: HOST COMPLETE: About to Exit Host Polling Thread Function
... EDITED OUT BY THEWITNESS ...
03/25/2009 03:20:36 PM - SPINE: Poller[0] Host[529] DS[12657] SCRIPT: /usr/bin/php -q /usr/share/cacti/scripts/cisco_cpu_usage.php 172.16.1.212, feedme, 2, , , 161, 2000 get fiveMin 1, output: 19
03/25/2009 03:20:32 PM - SPINE: Poller[0] Host[529] DEBUG: The POPEN returned the following File Descriptor 29
03/25/2009 03:20:32 PM - SPINE: Poller[0] Host[529] DS[12657] SCRIPT: /usr/bin/php -q /usr/share/cacti/scripts/cisco_cpu_usage.php 172.16.1.212, feedme, 2, , , 161, 2000 get oneMin 1, output: 18
My ping (using SNMP) timeout and SNMP timeout are 400 and 2000 respectively. That just sounds wrong when I type it, so if it is, I can change them. Everything is SNMPv2 at this point, and most of the re-index methods are done through script servers.

322 is a different beast that is only monitored using a tcp ping and graphed with advanced ping 1.3.

I'll provide more info shortly, but I've been in the weeds today.

Post by TheWitness »

322 is a different beast that is only monitored using a tcp ping and graphed with advanced ping 1.3.
Well, that explains the segmentation faults anyway. There are two orphaned poller_reindex entries for host_id 322 that need to be dealt with. I will make a fix in spine to ignore a host that has a reindex query set and does not use SNMP. That will fix the segmentation faults, although there appears to be some database corruption.
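As an illustration of how such rows might be spotted, here is a sketch that runs against a simplified stand-in schema in an in-memory SQLite database, not against a real Cacti MySQL database. The two-column tables, the assumption that `snmp_version = 0` marks a host that does not use SNMP, and host 999 are all hypothetical; hosts 268 and 322 and data queries 1 and 19 are borrowed from the logs earlier in the thread.

```python
import sqlite3

# Reindex rows whose host is missing, or whose host does not use SNMP.
QUERY = """
SELECT pr.host_id, pr.data_query_id
FROM poller_reindex AS pr
LEFT JOIN host AS h ON h.id = pr.host_id
WHERE h.id IS NULL OR h.snmp_version = 0
ORDER BY pr.host_id, pr.data_query_id
"""

def suspicious_reindex_rows(conn):
    """Return (host_id, data_query_id) pairs that look orphaned."""
    return conn.execute(QUERY).fetchall()

def demo():
    conn = sqlite3.connect(":memory:")
    # Stand-in tables: only the columns this check needs.
    conn.executescript("""
        CREATE TABLE host (id INTEGER PRIMARY KEY, snmp_version INTEGER);
        CREATE TABLE poller_reindex (host_id INTEGER, data_query_id INTEGER);
        INSERT INTO host VALUES (268, 2), (322, 0);
        INSERT INTO poller_reindex VALUES (268, 1), (322, 1), (322, 19), (999, 1);
    """)
    return suspicious_reindex_rows(conn)
```

In this toy data set, `demo()` flags both rows for host 322 (no SNMP) and the row for host 999 (no matching host), while the row for host 268 is left alone.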

TheWitness

Post by Howie »

TheWitness wrote:You can run in R/O mode for a while and simply log the messages. Cron this:

Code: Select all

*/5 * * * * /usr/local/spine/bin/spine -R >> /tmp/myspine.out 2>&1
Otherwise, this way it can run side-by-side with cmd.php and not cause interference with the main Cacti service.
How much output should I expect when running this way? I have logging set to HIGH, and I only get:

Code: Select all

No log handling enabled - turning on stderr logging
truncating unsigned value to 32 bits (2)
truncating unsigned value to 32 bits (2)
truncating unsigned value to 32 bits (2)
(20 or so lines of the same)
truncating unsigned value to 32 bits (2)
truncating unsigned value to 32 bits (2)
It runs very quickly (< 1 minute for sure), but it's not clear that anything is really happening.

Post by TheWitness »

Add "--stdout" to the options. Sorry.

Larry

Post by TheWitness »

All,

To those monitoring: after today's feedback, I have made a few new changes to address the reindex issue, as well as a slight alteration to tcp_ping to designate a host as up if the TCP connection is refused by the host (i.e. it's up).
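The idea behind the tcp_ping change can be sketched as follows. This is not spine's C implementation, only an illustration of the rule that a refused connection still proves the host is reachable; the function names and the 0.4 second default timeout are assumptions.

```python
import socket

def classify_connect(error):
    """Map the outcome of a TCP connect attempt to up (True) or down (False)."""
    if error is None:
        return True   # handshake completed
    if isinstance(error, ConnectionRefusedError):
        return True   # the host actively refused, so it is reachable
    return False      # timeout, unreachable network, name lookup failure, ...

def tcp_ping(host, port, timeout=0.4):
    """Return True if the host answers on the port, even with a refusal."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return classify_connect(None)
    except OSError as exc:
        # Connection refusals and timeouts are both OSError subclasses.
        return classify_connect(exc)
```

With this rule, a host with nothing listening on the probed port is still reported as up, while a host that never answers within the timeout is reported as down.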

More feedback please. Still waiting on an SNMPv3 tester.

TheWitness

Post by Howie »

Code: Select all

Time: 18.5671 s, Threads: 15, Hosts: 239
This is how it should be! I'll try dropping it into the live loop in a while, and see if it holds up there too..

Post by Howie »

Howie wrote:

Code: Select all

Time: 18.5671 s, Threads: 15, Hosts: 239
This is how it should be! I'll try dropping it into the live loop in a while, and see if it holds up there too..
Hmm. Nope :(

Code: Select all

03/26/2009 10:47:33 AM - SYSTEM STATS: Time:151.3474 Method:cmd.php Processes:4 Threads:N/A Hosts:239 HostsPerProcess:60 DataSources:13143 RRDsProcessed:4936
03/26/2009 10:55:03 AM - SYSTEM STATS: Time:301.4584 Method:spine Processes:4 Threads:15 Hosts:239 HostsPerProcess:60 DataSources:13143 RRDsProcessed:2711
Am I reading that correctly as being an issue with the rrd updates rather than data collection? I thought the rrd-update process was the same for both cmd.php and spine though...
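For reference, the gap between the two runs can be quantified directly from the SYSTEM STATS lines. This is a small sketch that assumes the `Key:Value` layout shown above; the function names are illustrative.

```python
import re

def parse_system_stats(line):
    """Extract the Key:Value pairs after 'SYSTEM STATS:' as strings."""
    _, _, tail = line.partition("SYSTEM STATS:")
    return dict(re.findall(r"(\w+):(\S+)", tail))

def rrd_shortfall(baseline, candidate):
    """Fraction of RRDs processed by `baseline` that `candidate` missed."""
    base = int(baseline["RRDsProcessed"])
    cand = int(candidate["RRDsProcessed"])
    return (base - cand) / base

cmd_php = parse_system_stats(
    "03/26/2009 10:47:33 AM - SYSTEM STATS: Time:151.3474 Method:cmd.php "
    "Processes:4 Threads:N/A Hosts:239 HostsPerProcess:60 "
    "DataSources:13143 RRDsProcessed:4936"
)
spine = parse_system_stats(
    "03/26/2009 10:55:03 AM - SYSTEM STATS: Time:301.4584 Method:spine "
    "Processes:4 Threads:15 Hosts:239 HostsPerProcess:60 "
    "DataSources:13143 RRDsProcessed:2711"
)

# (4936 - 2711) / 4936 is roughly 0.45
print(f"{rrd_shortfall(cmd_php, spine):.1%}")
```

For these two lines the spine run processed roughly 45% fewer RRDs than the cmd.php run.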

Post by skinty »

Howie,

I was getting some diffs between the two as well. You're looking at almost a 50% loss in RRDs being processed; I was getting about a 20% loss.

TheWitness,
That will fix the segmentation faults, although there appears to be some database corruption.
What do you suggest for checking into possible database corruption?

-ryan