Upgrade 1.2.18 to 1.2.21 - Error: Maximum runtime of 298 seconds exceeded
- TheWitness
- Developer
- Posts: 17007
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
Re: Upgrade 1.2.18 to 1.2.21 - Error: Maximum runtime of 298 seconds exceeded
After doing that, pull a fresh copy of spine. I did find one unrelated bug, but also did some re-organizing as I'm trying to isolate the root cause, still assuming it's spine.
True understanding begins only when we realize how little we truly understand...
Life is an adventure, let yours begin with Cacti!
Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages
For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?
Re: Upgrade 1.2.18 to 1.2.21 - Error: Maximum runtime of 298 seconds exceeded
Alrighty. There's no noticeable difference in running with snmp_host_cleanup commented out. I just pulled the latest spine and have that installed now.
- TheWitness
- Developer
Re: Upgrade 1.2.18 to 1.2.21 - Error: Maximum runtime of 298 seconds exceeded
Okay, thanks for that. I don't see any leaks under valgrind (a tool that tests for memory leaks). Do we know which host it's crashing on, or is it totally random?
- TheWitness
- Developer
Re: Upgrade 1.2.18 to 1.2.21 - Error: Maximum runtime of 298 seconds exceeded
It's 8677
See if it crashes just on that.
Code:
./spine -V 6 -S --mibs -f 8677 -l 8677
Re: Upgrade 1.2.18 to 1.2.21 - Error: Maximum runtime of 298 seconds exceeded
As far as I can tell it's totally random. The number of RRDs processed is different every time, so to me that sounds like it's stopping at different spots.
I ran that command... it runs just fine.
Wait, here's something. I ran this command: /usr/local/spine/bin/spine -R -S -V 3
Found that from a how-to on installing spine. https://medium.com/make-it-easy/how-to- ... bc4c75c502
Anyway, I've run that every time I update spine and it's always fine. I did it again for the heck of it and it kicked out an error.
--------------------
FATAL: Spine Encountered a Segmentation Fault
Generating backtrace...0 line(s)...
Total[33.2246] Device[8692] SNMP Ping Unknown Error
root@oss:/usr/local/spine/bin# Unable to flush stdout: Broken pipe
Unable to flush stdout: Broken pipe
Unable to flush stdout: Broken pipe
--------------------
I grepped for ping in the cacti.log and found this:
2022/06/21 20:10:21 - SPINE: Poller[1] PID[3290061] PT[140041938515712] WARNING: Invalid Response, Device[2450] HT[1] DS[37672] SCRIPT: perl /usr/share/cacti/site/scripts/ping.pl '10.10.10.10', output: U
Shoot, are we looking for the wrong thing? That ping error repeats a few times in cacti.log and it's present each time spine crashes. BUT, it's also present during polling cycles where spine doesn't crash. It also happens with various device IDs.
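One way to check whether those ping warnings line up with the crashes is to tally them per device across the whole log. Below is a minimal Python sketch, assuming the line format shown in the cacti.log excerpt above. The function name is made up for illustration and is not part of Cacti or spine.

```python
import re
from collections import Counter

# Matches the device ID in spine log lines such as
# "... WARNING: Invalid Response, Device[2450] HT[1] DS[37672] SCRIPT: ..."
INVALID_RESPONSE = re.compile(r"WARNING: Invalid Response, Device\[(\d+)\]")


def count_invalid_responses(lines):
    """Return a Counter mapping device ID -> number of Invalid Response warnings."""
    counts = Counter()
    for line in lines:
        match = INVALID_RESPONSE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

Passing it the lines of cacti.log (any iterable of strings works, including an open file) gives a per-device count. If the same devices show up in both good and bad polling cycles, the warning is probably unrelated to the crash.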
Re: Upgrade 1.2.18 to 1.2.21 - Error: Maximum runtime of 298 seconds exceeded
Hmm that may be a red herring. I've run it a few more times and this was the latest ending:
Total[41.7185] Device[3364] HT[1] DS[38065] TT[21.20] SCRIPT: perl /usr/share/cacti/site/scripts/ping.pl 'x.x.x.x', output: 14.1
Total[41.7197] Device[1122] Checking for System Information Update
Total[41.7197] Device[1122] Updating Full System Information Table
Total[41.7215] Device[911] Checking for System Information Update
Total[41.7215] Device[911] Updating Full System Information Table
Total[41.7229] Device[3786] Checking for System Information Update
Total[41.7229] Device[3786] Updating Full System Information Table
double free or corruption (fasttop)
FATAL: Spine Interrupted by Abort Signal
Aborted (core dumped)
- TheWitness
- Developer
Re: Upgrade 1.2.18 to 1.2.21 - Error: Maximum runtime of 298 seconds exceeded
Notable. I just made another update. Download and test again.
Re: Upgrade 1.2.18 to 1.2.21 - Error: Maximum runtime of 298 seconds exceeded
It didn't like that. Weird though - the test runs go fine. But as soon as the poller runs, it fails. I had to back out of that one because I wasn't getting any good poll cycles.
Re: Upgrade 1.2.18 to 1.2.21 - Error: Maximum runtime of 298 seconds exceeded
I've gotta get offline for now. Tomorrow I might mass disable a whole bunch of hosts and start adding them back to see if a particular one is the problem. Actually, that brings up a question. Does the poller process devices in any particular order? I think I've noticed that devices added a long time ago always seem to be processed and those graphs are fine. It's newer stuff that has the gaps in the graphs when the poller fails to complete.
Edit: By older devices, I mean ones added years ago. It's not a difference of just devices added since the update to 1.2.21. It's like things added years ago are fine but say, maybe, devices added in the last 6 months are affected by the crashes. I haven't really taken time to narrow that down yet.
Re: Upgrade 1.2.18 to 1.2.21 - Error: Maximum runtime of 298 seconds exceeded
I'm cautiously optimistic that I've found one single host causing the problem. I found this thread: viewtopic.php?t=53496
The last comment talks about setting collection threads to 1. I did a query on the database and found 40 hosts that were set to 5 threads. 39 of them had been added since the upgrade (I guess that's a new default?) and 1 of them had been added years ago. I went to that host and changed the number of threads to 1.
Spine hasn't crashed for 4 hours. Previously the longest run of good polls was about an hour, maybe 90 minutes at most. I'm not calling it a win yet - I want to see it go all day or maybe 24 hours.
I also went into settings and changed device defaults from 5 threads to 1 so that we don't get any more 5's in there.
- TheWitness
- Developer
Re: Upgrade 1.2.18 to 1.2.21 - Error: Maximum runtime of 298 seconds exceeded
What is the device type? What type of metrics? How many OIDs for the device?
- TheWitness
- Developer
Re: Upgrade 1.2.18 to 1.2.21 - Error: Maximum runtime of 298 seconds exceeded
Go to Device Defaults under Settings to find the default.
Re: Upgrade 1.2.18 to 1.2.21 - Error: Maximum runtime of 298 seconds exceeded
It's a Mikrotik router, HapLite, board name RB941-2nD. We've got 2000+ of these being polled. After the change, this host looks identical to all the rest. The only thing I changed was the number of collection threads earlier today.
All we're monitoring is traffic rate on port 1.
SNMPv3, SHA, AES-128, port 161, timeout 500ms, Max OIDs per request is 5.
There's a single graph.
Associated Data Queries: SNMP - Interface Statistics
Graph created for port 1 with In/Out Bits with 95th Percentile template
There are 2 OIDs polled: 1 for traffic in, 1 for traffic out. I put it in debug mode for one cycle. Output below:
Code:
2022/06/22 12:45:02 - SPINE: Poller[1] PID[3764037] PT[139726619141888] DEBUG: Device[5355] HT[1] In Poller, About to Start Polling
2022/06/22 12:45:02 - SPINE: Poller[1] PID[3764037] PT[139726619141888] Device[5355] DEBUG: Entering SNMP Ping
2022/06/22 12:45:03 - SPINE: Poller[1] PID[3764037] PT[139726619141888] Device[5355] Updating Full System Information Table
2022/06/22 12:45:03 - SPINE: Poller[1] PID[3764037] PT[139726619141888] DEBUG: Device[5355] HT[1] RECACHE: Processing 1 items in the auto reindex cache for 'x.x.x.x'
2022/06/22 12:45:03 - SPINE: Poller[1] PID[3764037] PT[139726619141888] Device[5355] HT[1] DQ[1] Extended Uptime Result: U, Is Numeric: 0
2022/06/22 12:45:03 - SPINE: Poller[1] PID[3764037] PT[139726619141888] Device[5355] HT[1] DQ[1] RECACHE OID: .1.3.6.1.2.1.1.3.0, (assert: 157945200 < output: 157974700)
2022/06/22 12:45:03 - SPINE: Poller[1] PID[3764037] PT[139726619141888] Device[5355] HT[1] NOTE: There are '2' Polling Items for this Device
2022/06/22 12:45:03 - PCOMMAND Device[1356] NOTE: Recache Event Detected for Device
2022/06/22 12:45:03 - SPINE: Poller[1] PID[3764037] PT[139726619141888] Device[5355] HT[1] DS[32764] TT[115.14] SNMP: v3: x.x.x.x, dsname: traffic_in, oid: .1.3.6.1.2.1.2.2.1.10.2, value: 2516712657
2022/06/22 12:45:03 - SPINE: Poller[1] PID[3764037] PT[139726619141888] Device[5355] HT[1] DS[32764] TT[115.22] SNMP: v3: x.x.x.x, dsname: traffic_out, oid: .1.3.6.1.2.1.2.2.1.16.2, value: 3661114414
2022/06/22 12:45:03 - SPINE: Poller[1] PID[3764037] PT[139726619141888] Device[5355] HT[1] Total Time: 1.2 Seconds
2022/06/22 12:45:03 - SPINE: Poller[1] PID[3764037] PT[139726619141888] DEBUG: Device[5355] HT[1] DEBUG: HOST COMPLETE: About to Exit Device Polling Thread Function
Re: Upgrade 1.2.18 to 1.2.21 - Error: Maximum runtime of 298 seconds exceeded
Actually, I misspoke. I didn't change just that one; I changed all of the ones that have been added. I thought I remembered my query only hitting that one device, but it changed all 40 rows.
The statements in my previous post still hold, though. The only devices that have been added are Mikrotiks, all added the same way, with the same graph created for port 1. A handful of SNMPv1 devices have also been added, but from what I understand, the problem is focused on SNMPv3.
Code:
update host set device_threads = '1' where device_threads != '1';
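Since that statement touches every row that isn't already at 1 thread, it can help to preview the affected rows with a SELECT that uses the same WHERE clause before running the UPDATE. The Python sketch below shows the idea against an in-memory SQLite table as a stand-in. A real Cacti install uses MySQL/MariaDB, the `id` and `description` columns are assumed here, and the rows and function name are made up for illustration.

```python
import sqlite3

# Stand-in for the Cacti database: an in-memory SQLite table with only the
# columns this sketch needs. The rows are illustrative, not real devices.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE host (id INTEGER PRIMARY KEY, description TEXT, device_threads INTEGER)"
)
conn.executemany(
    "INSERT INTO host (id, description, device_threads) VALUES (?, ?, ?)",
    [(1, "old-router", 1), (2, "new-router-a", 5), (3, "new-router-b", 5)],
)


def hosts_not_single_threaded(connection):
    """List (id, description, device_threads) for rows the UPDATE would change."""
    cursor = connection.execute(
        "SELECT id, description, device_threads FROM host "
        "WHERE device_threads != 1 ORDER BY id"
    )
    return cursor.fetchall()


# Preview first, so the number of affected rows is known before changing anything.
to_change = hosts_not_single_threaded(conn)
print(len(to_change), "rows would be updated:", to_change)

# Same condition as the UPDATE in the post, applied only after the preview.
changed = conn.execute(
    "UPDATE host SET device_threads = 1 WHERE device_threads != 1"
).rowcount
```

Comparing the preview count with the row count reported by the UPDATE is a cheap sanity check that the statement hit only what was intended.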
- TheWitness
- Developer
Re: Upgrade 1.2.18 to 1.2.21 - Error: Maximum runtime of 298 seconds exceeded
I see the clue right there with the Extended Uptime OID failing. I'll be making another spine update after work tonight on that front.
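To see how many other devices log the same thing, the "Extended Uptime Result" lines can be pulled out of the debug log. This is a rough Python sketch, assuming the line format from the debug output posted above and assuming that "Is Numeric: 0" is what marks a failed lookup. The function name is hypothetical and not part of spine.

```python
import re

# Matches lines such as
# "... Device[5355] HT[1] DQ[1] Extended Uptime Result: U, Is Numeric: 0"
UPTIME_RESULT = re.compile(
    r"Device\[(\d+)\].*Extended Uptime Result: ([^,]+), Is Numeric: (\d)"
)


def devices_with_failed_uptime(lines):
    """Return sorted device IDs whose extended uptime result was not numeric."""
    failed = set()
    for line in lines:
        match = UPTIME_RESULT.search(line)
        # Assumption: "Is Numeric: 0" means the extended uptime lookup failed.
        if match and match.group(3) == "0":
            failed.add(int(match.group(1)))
    return sorted(failed)
```

Run over a debug-level cacti.log, this returns the list of device IDs to compare against the hosts that were set to 5 threads.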