(SOLVED) Spine 1.2.21 segfaults heavily on large load


stefanbrudny
Cacti User
Posts: 130
Joined: Thu Jan 19, 2012 11:52 am

(SOLVED) Spine 1.2.21 segfaults heavily on large load

Post by stefanbrudny »

This topic was going to be a discussion about feature requests for Automate, but I decided to describe the use case instead. It's been 5 years and Spine still fails. I thought that, if scalability matters, someone might get interested. This is really the root cause, and with this running fine many more feature requests could be postponed (where only mid scale is demanded) or dropped entirely. No new development, no pain; fixing things and making them work as intended improves the product more than adding new features.

Original topic, which I enjoyed a lot, as it taught me something:
viewtopic.php?t=54036&start=30

Current status:

Code: Select all

root@cacti-2022-loaded:/var/www/html/cli# time /usr/bin/spine -C /var/www/html/spine.conf --poller 1 --first 878 --last 4327 --mibs
SPINE: Using spine config file [/var/www/html/spine.conf]
Version 1.2.21 starting
2022-06-14 00:01:45 - SPINE: Poller[1] PID[2086310] PT[139845542202240] ERROR: Device[2057] HT[2] polling timed out while acquiring Available Thread Lock
2022-06-14 00:01:46 - SPINE: Poller[1] PID[2086310] PT[139845542202240] ERROR: Device[1886] HT[2] polling timed out while acquiring Available Thread Lock
2022-06-14 00:01:52 - SPINE: Poller[1] PID[2086310] PT[139845542202240] ERROR: Device[2396] HT[2] polling timed out while acquiring Available Thread Lock
2022-06-14 00:01:57 - SPINE: Poller[1] PID[2086310] PT[139845542202240] ERROR: Device[1285] HT[2] polling timed out while acquiring Available Thread Lock
2022-06-14 00:02:06 - SPINE: Poller[1] PID[2086310] PT[139845542202240] ERROR: Device[2327] HT[1] polling timed out while acquiring Available Thread Lock
2022-06-14 00:02:09 - SPINE: Poller[1] PID[2086310] PT[139845542202240] ERROR: Device[3360] HT[2] polling timed out while acquiring Available Thread Lock
2022-06-14 00:02:10 - SPINE: Poller[1] PID[2086310] PT[139845542202240] ERROR: Device[4096] HT[2] polling timed out while acquiring Available Thread Lock
FATAL: Spine Encountered a Segmentation Fault
Generating backtrace...0 line(s)...

real    1m41.842s
user    0m2.511s
sys     0m3.193s
root@cacti-2022-loaded:/var/www/html/cli#
What ingredients to add to the pot to make this Spine-killing, ahem, nectar:
* use 5-minute polling (1-minute makes it even easier)
* add approximately 3k hosts or more
* make sure they are slow to respond, e.g. the first 250 devices average around 10,000 milliseconds
* make sure about 250 of the hosts are down

The results are, uhm, well, catastrophic:
* the user cannot tell how many devices haven't been graphed
* the user cannot alter many devices in bulk (as there is no bulk management at scale, one cannot e.g. reduce the timeout and skip the devices above 3,000 milliseconds)
* there is no simple exit from this situation other than disabling the hosts that are down, and even that does not always help

Other remarks and observations:
* the configured number of Spine threads doesn't really matter; at a scale of 3k devices the Cacti system becomes unusable.
* in comparison, I maxed out the Cacti server's resources by emulating approx 30k devices on an old, empty server with 2x Xeons. Spine pulls approx 25k devices * 256 interfaces, in/out bits, every 5 minutes, EASILY (well, the NVMe disk burns at 30k IOPS at peak times). So I can easily get 20x more, but only because everything there is very artificial and DC-oriented, with no surprising delays introduced. I think I could try more; I would just need to split snmpd across several more servers, which should not be a big deal. However, that's not the point; the point is real-life use-case testing, and I'll leave that for another winter evening.

How to work around this issue:
* disable all hosts which are down; this mostly lets Spine complete its cycle. Of course this is one-way, as re-enabling some of the devices would kill the installation again
* sometimes shortening the SNMP timeout helps (I need SNMP only; not sure about UDP or ICMP, as those are mostly blocked for me). But that doesn't really help, as I genuinely need my, let's say, 5-second polling timeout.

My ideas for how to resolve this, in order of preference:
* introduce automated host disabling based on status. I'd say I need an expression: if a host is down for more than 5 poller cycles, disable the host with a DESC
** but I also need a rule: re-enable a host if it has been disabled with a DESC for longer than 10 poller cycles
* introduce a separate poller that checks hosts before they are polled. This could essentially be an almost-separate Spine process with a different configuration. Such a process could mark hosts as pollable / not pollable, so the main poller knows which hosts to skip before the run. This should be a toggleable option in the Cacti configuration; option name: "Relaxed down host processing" or similar.
* introduce overall limits and control over the maximum number of processes and threads spawned per installation (if it is not already bounded by number of processes * number of threads)
* increase the possible number of Spine threads to >100 and try to push through (how that would work, and with what side effects, I have no idea)
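The first idea above is essentially a small per-host state machine. A minimal sketch in C (hypothetical helper, not Cacti code; the thresholds 5 and 10 are the ones from the proposal):

```c
/* Per-host bookkeeping for the proposed auto-disable / auto-re-enable rule. */
typedef struct {
    int down_cycles;      /* consecutive cycles the host was down   */
    int disabled_cycles;  /* cycles spent in the disabled state     */
    int disabled;         /* 1 if the host is currently disabled    */
} host_state_t;

/* Called once per poller cycle with the host's current reachability. */
static void update_host(host_state_t *h, int is_up)
{
    if (h->disabled) {
        /* Rule 2: re-enable after 10 poller cycles in the disabled state. */
        if (++h->disabled_cycles >= 10) {
            h->disabled = 0;
            h->disabled_cycles = 0;
            h->down_cycles = 0;
        }
        return;
    }
    if (is_up) {
        h->down_cycles = 0;
    } else if (++h->down_cycles >= 5) {
        /* Rule 1: disable after more than 5 consecutive down cycles. */
        h->disabled = 1;
        h->disabled_cycles = 0;
    }
}
```

In a real implementation this would be driven from the poller and persisted in the `host` table, but the counter logic itself is this simple.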

This is why I say Cacti is not able to go beyond small scale. Prove me wrong.
Last edited by stefanbrudny on Sat Jun 25, 2022 5:26 pm, edited 1 time in total.
User avatar
TheWitness
Developer
Posts: 17061
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA
Contact:

Re: 1.2.21 segfaults heavily on large load

Post by TheWitness »

Do the same thing with -V 3 -S -R and post the results. BTW, the thread lock timeout is 25 seconds and was introduced to protect against people who run scripts that don't time out on their own. How many threads was that, BTW?
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages


For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?

Re: 1.2.21 segfaults heavily on large load

Post by TheWitness »

If you have problems with your WAN that cause unreliable responses, you should use Remote Data Collectors. I know of Cacti installs with over 40k devices that collect data from over 1M Data Sources in about 120 seconds. That's pretty good. Data collection is N-tiered like gmond, though. So all responses are local regardless of where the devices are. This is the same as what you should expect with Remote Data Collectors.

Re: 1.2.21 segfaults heavily on large load

Post by TheWitness »

Not sure where the segfault is coming from, though. I guess I should set up the same environment and see what's happening. If you run at -V 5, there will be far more output to track down the location of the segfault. If you can capture a core file, run a backtrace against it and report back.

Re: 1.2.21 segfaults heavily on large load

Post by TheWitness »

Hmm, I set up a really skanky environment and the outcome was bad. It did not segfault, but it also did not time out. Almost worse... Try using spine 1.2.5 and let me know if it works better. I have to work that out, but before you try 1.2.5, get me that -V 5 -S -R output.

Re: 1.2.21 segfaults heavily on large load

Post by TheWitness »

Pull the latest spine 1.2.x branch and test again after I see the 1.2.5 output.

Re: 1.2.21 segfaults heavily on large load

Post by TheWitness »

You should get something like this now:

Code: Select all

[root@vmhost5 bin]# ./spine -R -S
SPINE: Using spine config file [../etc/spine.conf]
Version 1.2.21 starting
Total[59.0142] ERROR: Device[2292] HT[1] polling timed out while acquiring Available Thread Lock
Total[59.1051] ERROR: Device[2408] HT[1] polling timed out while acquiring Available Thread Lock
Total[59.1960] ERROR: Device[2410] HT[1] polling timed out while acquiring Available Thread Lock
Total[59.2868] ERROR: Device[3502] HT[1] polling timed out while acquiring Available Thread Lock
Total[59.3777] ERROR: Device[3506] HT[1] polling timed out while acquiring Available Thread Lock
Total[59.4685] ERROR: Device[3507] HT[1] polling timed out while acquiring Available Thread Lock
Total[59.5593] ERROR: Device[3513] HT[1] polling timed out while acquiring Available Thread Lock
Total[59.6501] ERROR: Device[3514] HT[1] polling timed out while acquiring Available Thread Lock
Total[59.7410] ERROR: Device[3515] HT[1] polling timed out while acquiring Available Thread Lock
Total[59.8319] ERROR: Device[3518] HT[1] polling timed out while acquiring Available Thread Lock
Total[59.9228] ERROR: Device[3520] HT[1] polling timed out while acquiring Available Thread Lock
Total[60.0035] ERROR: Device[3521] HT[1] Spine Timed Out While Processing Devices External
Total[60.0036] ERROR: Device[3521] polling timed out while waiting for 20 Threads to End
Total[60.0037] WARNING: There were 4726 threads which did not run
Time: 60.0541 s, Threads: 20, Devices: 4748

Re: 1.2.21 segfaults heavily on large load

Post by stefanbrudny »

Thanks, catching up.

* 1.2.5 segfaults in 12 seconds:

Code: Select all

time /usr/bin/spine -C /var/www/html/spine.conf --poller 1 --first 621 --last 7309 --mibs  -V 5 -S -R
[....]

Updating Full System Information Table
Device[1709] SNMP Result: Device responded to SNMP
Updating Full System Information Table
FATAL: Spine Encountered a Segmentation Fault [95, Operation not supported] (Spine thread)
I know this isn't enough, I'd need to debug this, but I'll come back to it later.

* I tried various combinations of thread counts, from 32 up to 100.
* I compiled the 1.2.x branch with the latest corrections and I can confirm the behaviour is now a little better and I can poll through my devices (6438); of course, when devices are down, the results are poor:

Code: Select all

root@cacti-2022-loaded:/home/spine-1.2.x# time /usr/bin/spine -C /var/www/html/spine.conf --poller 1 --first 621 --last 7309 --mibs
SPINE: Using spine config file [/var/www/html/spine.conf]
Version 1.2.21 starting
2022-06-14 23:26:48 - SPINE: Poller[1] PID[2748080] PT[140690957810560] ERROR: Device[1471] HT[1] polling timed out while acquiring Available Thread Lock
2022-06-14 23:26:48 - SPINE: Poller[1] PID[2748080] PT[140690957810560] ERROR: Device[1471] HT[2] polling timed out while acquiring Available Thread Lock
2022-06-14 23:26:48 - SPINE: Poller[1] PID[2748080] PT[140690957810560] ERROR: Device[3956] HT[1] polling timed out while acquiring Available Thread Lock
2022-06-14 23:26:48 - SPINE: Poller[1] PID[2748080] PT[140690957810560] ERROR: Device[3956] HT[2] polling timed out while acquiring Available Thread Lock
2022-06-14 23:26:48 - SPINE: Poller[1] PID[2748080] PT[140690957810560] ERROR: Device[3957] HT[1] polling timed out while acquiring Available Thread Lock
2022-06-14 23:26:48 - SPINE: Poller[1] PID[2748080] PT[140690957810560] ERROR: Device[3957] HT[2] polling timed out while acquiring Available Thread Lock
2022-06-14 23:26:48 - SPINE: Poller[1] PID[2748080] PT[140690957810560] ERROR: Device[1388] HT[1] polling timed out while acquiring Available Thread Lock
2022-06-14 23:26:48 - SPINE: Poller[1] PID[2748080] PT[140690957810560] ERROR: Device[1784] HT[1] polling timed out while acquiring Available Thread Lock
2022-06-14 23:26:48 - SPINE: Poller[1] PID[2748080] PT[140690957810560] ERROR: Device[1784] HT[2] polling timed out while acquiring Available Thread Lock
2022-06-14 23:26:48 - SPINE: Poller[1] PID[2748080] PT[140690957810560] ERROR: Device[3103] HT[1] polling timed out while acquiring Available Thread Lock
2022-06-14 23:26:49 - SPINE: Poller[1] PID[2748080] PT[140690957810560] ERROR: Device[3103] HT[2] Spine Timed Out While Processing Devices External
2022-06-14 23:26:49 - SPINE: Poller[1] PID[2748080] PT[140690957810560] ERROR: Device[3103] polling timed out while waiting for 64 Threads to End
2022-06-14 23:26:49 - SPINE: Poller[1] PID[2748080] PT[140690957810560] WARNING: There were 6386 threads which did not run
Time: 60.0115 s, Threads: 64, Devices: 6437

real    1m0.022s
user    0m0.986s
sys     0m1.724s
root@cacti-2022-loaded:/home/spine-1.2.x#
So let's run the 5-minute poller (spoiler alert: it segfaults):

Code: Select all

root@cacti-2022-loaded:/home/spine-1.2.x# time /usr/bin/spine -C /var/www/html/spine.conf --poller 1 --first 621 --last 7309 --mibs
SPINE: Using spine config file [/var/www/html/spine.conf]
Version 1.2.21 starting
2022-06-14 23:30:13 - SPINE: Poller[1] PID[2749533] PT[140167834744576] Device[1471] HT[2] DQ[23] RECACHE ASSERT FAILED: '42 72 6F 61 64 63 6F 6D 20 42 43 4D 35 37 30 38
43 20 4E 65 74 58 74 72 65 6D 65 20 49 49 20 47
6=42 72 6F 61 64 63 6F 6D 20 42 43 4D 35 37 30 38
43 20 4E 65 74 58 74 72 65 6D 65 20 49 49 20 47
69 67 45 20 28 4E 44 49 53 20 56 42 44 20 43 6C
69 65 6E 74 29 00'2022-06-14 23:30:28 - SPINE: Poller[1] PID[2749533] PT[140167826351872] Device[1471] HT[1] DQ[23] RECACHE ASSERT FAILED: '42 72 6F 61 64 63 6F 6D 20 42 43 4D 35 37 30 38
43 20 4E 65 74 58 74 72 65 6D 65 20 49 49 20 47
6=42 72 6F 61 64 63 6F 6D 20 42 43 4D 35 37 30 38
43 20 4E 65 74 58 74 72 65 6D 65 20 49 49 20 47
69 67 45 20 28 4E 44 49 53 20 56 42 44 20 43 6C
[ that continues for +~20 devices ]

2022-06-14 23:31:55 - SPINE: Poller[1] PID[2749533] PT[140167356589824] Device[2148] Hostname[--------] ERROR: HOST EVENT: Device is DOWN Message: Device did not respond to SNMP
2022-06-14 23:31:56 - SPINE: Poller[1] PID[2749533] PT[140168992380672] Device[2597] Hostname[--------] ERROR: HOST EVENT: Device is DOWN Message: Device did not respond to SNMP
FATAL: Spine Encountered a Segmentation Fault
Generating backtrace...0 line(s)...

real    2m46.628s
user    0m2.823s
sys     0m3.671s
Look, this is coincidentally the same group of devices as it was 5 years ago. It might just be that they are difficult for Spine and I may be wasting your time. When going synthetic, and if I ever go to production with Cacti, I'll have at most 20-60 device types and everything will be easier.

I have the following proposals:
* I can pass you credentials to Cacti & a root shell; I need to clean up a little, make a copy of the container, etc.
* If I could reach you in private I could write up more details - no time wasted, real stuff. Mixed technical/business: options, the real use case revealed.

Re: 1.2.21 segfaults heavily on large load

Post by stefanbrudny »

Really, I see there is no pattern here without debugging Spine itself:

Code: Select all

root@cacti-2022-loaded:/home/spine-1.2.x# rm /usr/bin/spine && ln -s /opt/spine-1.2.21+/bin/spine /usr/bin/spine && spine -v
SPINE 1.2.21  Copyright 2004-2021 by The Cacti Group
root@cacti-2022-loaded:/home/spine-1.2.x# vim /etc/crontab ^C
root@cacti-2022-loaded:/home/spine-1.2.x# time /usr/bin/spine -C /var/www/html/spine.conf --poller 1 --first 621 --last 7309 --mibs
SPINE: Using spine config file [/var/www/html/spine.conf]
Version 1.2.21 starting
2022-06-15 00:09:25 - SPINE: Poller[1] PID[2764035] PT[139832231708416] Device[1448] HT[2] DQ[23] RECACHE ASSERT FAILED: '\"ethernetCsmacd\"=ethernetCsmacd'
2022-06-15 00:09:33 - SPINE: Poller[1] PID[2764035] PT[139832491751168] Device[1471] HT[2] DQ[23] RECACHE ASSERT FAILED: '42 72 6F 61 64 63 6F 6D 20 42 43 4D 35 37 30 38
43 20 4E 65 74 58 74 72 65 6D 65 20 49 49 20 47
6=42 72 6F 61 64 63 6F 6D 20 42 43 4D 35 37 30 38
43 20 4E 65 74 58 74 72 65 6D 65 20 49 49 20 47
69 67 45 20 28 4E 44 49 53 20 56 42 44 20 43 6C
69 65 6E 74 29 00'2022-06-15 00:09:34 - SPINE: Poller[1] PID[2764035] PT[139832089097984] Device[1137] Hostname[-----] NOTICE: HOST EVENT: Device Returned from DOWN State
FATAL: Spine Encountered a Segmentation Fault
Generating backtrace...0 line(s)...

real    0m13.304s
user    0m0.155s
sys     0m0.155s
root@cacti-2022-loaded:/home/spine-1.2.x# time /usr/bin/spine -C /var/www/html/spine.conf --poller 1 --first 621 --last 7309 --mibs
SPINE: Using spine config file [/var/www/html/spine.conf]
Version 1.2.21 starting
2022-06-15 00:09:56 - SPINE: Poller[1] PID[2764403] PT[140157416085248] Device[1448] HT[2] DQ[23] RECACHE ASSERT FAILED: '\"ethernetCsmacd\"=ethernetCsmacd'
2022-06-15 00:10:02 - SPINE: Poller[1] PID[2764403] PT[140157944682240] Device[1471] HT[1] DQ[23] RECACHE ASSERT FAILED: '42 72 6F 61 64 63 6F 6D 20 42 43 4D 35 37 30 38
43 20 4E 65 74 58 74 72 65 6D 65 20 49 49 20 47
6=42 72 6F 61 64 63 6F 6D 20 42 43 4D 35 37 30 38
43 20 4E 65 74 58 74 72 65 6D 65 20 49 49 20 47
69 67 45 20 28 4E 44 49 53 20 56 42 44 20 43 6C
69 65 6E 74 29 00'2022-06-15 00:10:04 - SPINE: Poller[1] PID[2764403] PT[140157466441472] Device[1471] HT[2] DQ[23] RECACHE ASSERT FAILED: '42 72 6F 61 64 63 6F 6D 20 42 43 4D 35 37 30 38
43 20 4E 65 74 58 74 72 65 6D 65 20 49 49 20 47
6=42 72 6F 61 64 63 6F 6D 20 42 43 4D 35 37 30 38
43 20 4E 65 74 58 74 72 65 6D 65 20 49 49 20 47
69 67 45 20 28 4E 44 49 53 20 56 42 44 20 43 6C
FATAL: Spine Encountered a Segmentation Fault
Generating backtrace...0 line(s)...
69 65 6E 74 29 00'
real    0m13.035s
user    0m0.105s
sys     0m0.223s

Re: 1.2.21 segfaults heavily on large load

Post by TheWitness »

Code: Select all

gdb ./spine
run -S -V 5
bt full

Re: 1.2.21 segfaults heavily on large load

Post by TheWitness »

Interesting reindex sort field. Totally wrong.

Re: 1.2.21 segfaults heavily on large load

Post by stefanbrudny »

bt sent in PM.

Code: Select all

Interesting reindex sort field. Totally wrong.
Hope that nails it. I have limited or no control over the devices, so I won't be able to change that. At scale, with 15-25 device models and 60+ firmware versions, I need such items/devices to be marked/ignored/etc., as I cannot completely avoid them.
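One speculative reading of the RECACHE ASSERT output earlier in the thread (a guess at the mechanism, not a confirmed diagnosis): the device returns an ifDescr octet string with a trailing NUL byte (the `00` at the end of the hex dump), so a byte-for-byte comparison against the cached value never matches even though the strings look identical. The sketch below illustrates the difference; `naive_match`/`exact_match` are hypothetical names, not Spine functions:

```c
#include <stddef.h>
#include <string.h>

/* strcmp stops at the first NUL, so a trailing NUL in the polled
 * value is invisible to it. */
static int naive_match(const char *cached, const char *polled)
{
    return strcmp(cached, polled) == 0;
}

/* A length-aware comparison sees the extra byte and reports a
 * mismatch - which would make a recache assert fire every cycle. */
static int exact_match(const char *cached, size_t cached_len,
                       const char *polled, size_t polled_len)
{
    return cached_len == polled_len &&
           memcmp(cached, polled, cached_len) == 0;
}
```

If this is what is happening, stripping trailing NULs (or any non-printable bytes) from SNMP octet strings before the comparison would make the two views agree.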

Re: 1.2.21 segfaults heavily on large load

Post by TheWitness »

I'm wondering why you are not using ifName. Maybe an old interface.xml file. Switching can be deadly. It'll switch, but it takes some time to fully reindex.

Re: 1.2.21 segfaults heavily on large load

Post by TheWitness »

Got the bt. It's some new code, likely a buffer issue; I have to review.