World Community Grid Forums
Thread Status: Active Total posts in this thread: 24
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2346 Status: Offline Project Badges:
|
sudo ls -lRa ~boinc | grep -v boinc | sort -u
gives:
drwxr-xr-x 29 root root 4096 Jul 25 10:31 ..
total 12
total 1548
total 20
total 24
total 288096
total 8

sudo ls -lRa ~boinc | grep -v ^.rw | grep -v :$ | sort -u
gives:
total 12
total 1548
total 20
total 24
total 288096
total 8

That's looking good. I don't have any idea at the moment what's going wrong.

What about:
ps -fuboinc
and
id boinc
and
ls -ldn ~boinc

I remember that some directories could sometimes be missing. Could you check that at least these directories are present with
cd ~boinc && sudo find . -type d
yielding:
./notices
./slots
./projects
./projects/www.worldcommunitygrid.org

[Edit 1 times, last edit by adriverhoef at Jul 25, 2019 4:51:43 PM]
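The directory check above could be scripted roughly as follows. This is only a minimal sketch: the function name `missing_boinc_dirs` and the `BOINC_DIR` variable are made up for illustration, and the default path assumes the data directory is /var/lib/boinc-client (the location mentioned later in this thread), which may differ on your system.

```shell
#!/bin/sh
# Print the expected BOINC subdirectories that are missing under a data
# directory. Function name and default path are illustrative assumptions.
missing_boinc_dirs() {
    data_dir=$1
    for d in notices slots projects projects/www.worldcommunitygrid.org; do
        # -d is true only if the path exists and is a directory
        [ -d "$data_dir/$d" ] || printf '%s\n' "./$d"
    done
}

# Prints nothing when all four directories are present.
missing_boinc_dirs "${BOINC_DIR:-/var/lib/boinc-client}"
```

If the data directory is not searchable by your own user, the test will report everything as missing, so run it with sufficient privileges in that case.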
||
|
|
hchc
Veteran Cruncher USA Joined: Aug 15, 2006 Post Count: 865 Status: Offline Project Badges:
|
@daka, THANK YOU. That worked perfectly. It's so weird that a production Linux kernel like 4.19.x, which ships with Debian stable, isn't compatible with BOINC 7.14.2, which has been out since 2018. But editing the GRUB file with the "emulate" attribute, re-generating GRUB, and re-booting works marvelously.

I'm going to tar/copy the /var/lib/boinc-client directory to a USB flash drive at my convenience, and then I should be good for a "clean" install of Debian 10 for OCD reasons. I could even roll the dice and see if WCG's device matching process will recognize the device now and skip restoring from backup. *crosses fingers*

Ultimately, what I'd like to see is for WCG to get their back-end act together and give users the ability to merge Device IDs and delete Device IDs with zero results so we can clean up our profiles. I'd like to see a push towards vanilla BOINC and retiring all the legacy custom stuff on the back-end.

[Edit 1 times, last edit by hchc at Jul 25, 2019 5:03:15 PM]
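For anyone repeating the GRUB change, here is a hedged sketch of how one might confirm after the reboot that the parameter took effect. It assumes the "emulate" attribute mentioned above is the kernel parameter `vsyscall=emulate`; the function name `vsyscall_param` is made up for illustration.

```shell
#!/bin/sh
# Extract the vsyscall= mode from a kernel command line string.
# Prints "unset" when the parameter is absent, in which case the kernel
# uses its compiled-in default. The body runs in a subshell so that
# "set -f" (no globbing) and the variables do not leak to the caller.
vsyscall_param() (
    set -f
    mode=unset
    for word in $1; do          # unquoted on purpose: split on whitespace
        case $word in
            vsyscall=*) mode=${word#vsyscall=} ;;
        esac
    done
    printf '%s\n' "$mode"
)

# Assumed Debian-style setup (check your own files before editing):
#   /etc/default/grub:  GRUB_CMDLINE_LINUX_DEFAULT="quiet vsyscall=emulate"
#   then: sudo update-grub, and reboot.

# Show what the running kernel was actually booted with.
vsyscall_param "$(cat /proc/cmdline 2>/dev/null)"
```

Reading /proc/cmdline rather than the GRUB file shows what the running kernel actually received, which is what matters if GRUB was not regenerated.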
||
|
|
daka
Advanced Cruncher Sweden Joined: Apr 4, 2017 Post Count: 92 Status: Offline Project Badges:
|
> @daka, THANK YOU. That worked perfectly. It's so weird that a production Linux kernel like 4.19.x, which ships with Debian stable, isn't compatible with BOINC 7.14.2, which has been out since 2018. But editing the GRUB file with the "emulate" attribute, re-generating GRUB, and re-booting works marvelously.

Great! The problem isn't with BOINC, but with the projects' individual applications. To maximize compatibility, they compile them against really, really old glibc versions, because some users run really old installations that wouldn't work with newer versions. This usually isn't a problem, but now the default has changed to disallow a feature (vsyscalls) that the older glibc (< 2.14) uses. Hopefully WCG can switch to targeting something newer, or give us an option to get a different set of project binaries when running recent kernels.
2 x i5-5300U @ 2.7GHz (2 x 2c4t, 40W power usage total)
46 x i5-5200U @ 2.5GHz (46 x 2c4t, 920W power usage total)
Not running currently: 96 x i5-7200U @ 3.1GHz (96 x 2c4t, 2400W power usage total)
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I'm running Ubuntu 19.04 (Debian-based) with a 5.0 kernel and never had to change any kernel parameters. All applications work just fine for me and always have, through several upgrades going back to 12.xx. Perhaps the in-place upgrade left some mismatches lying around. It might be worth doing the clean install and then trying again without the parameter to see if the problem recurs. I have never run Debian, so maybe that's one of the differences between Debian and Ubuntu.
|
||
|
|
daka
Advanced Cruncher Sweden Joined: Apr 4, 2017 Post Count: 92 Status: Offline Project Badges:
|
> I'm running Ubuntu 19.04 (Debian-based) with a 5.0 kernel and never had to change any kernel parameters. All applications work just fine for me and always have, through several upgrades going back to 12.xx. Perhaps the in-place upgrade left some mismatches lying around. It might be worth doing the clean install and then trying again without the parameter to see if the problem recurs. I have never run Debian, so maybe that's one of the differences between Debian and Ubuntu.

The difference is that Ubuntu compiles their kernel with CONFIG_LEGACY_VSYSCALL_EMULATE=y, while Debian uses CONFIG_LEGACY_VSYSCALL_NONE=y. Vsyscalls are deprecated because of security problems, and more and more Linux distributions disable them by default. It's not that they are vulnerable in themselves, but if something else on your computer has a vulnerability, they can make it a lot easier to exploit.
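To see which of these build options a given kernel was compiled with, something along these lines should work. It is a sketch under assumptions: the function name `vsyscall_build_default` is invented here, and it assumes the distribution installs the kernel config as /boot/config-<release>, which is common but not universal.

```shell
#!/bin/sh
# Print the compiled-in vsyscall default found in a kernel config file.
# Matches lines such as CONFIG_LEGACY_VSYSCALL_NONE=y and prints the
# lower-cased suffix ("none", "emulate", ...). Prints "unknown" if the
# file is missing or contains no such line.
vsyscall_build_default() {
    result=$(sed -n 's/^CONFIG_LEGACY_VSYSCALL_\([A-Z]*\)=y$/\1/p' "$1" 2>/dev/null |
        head -n 1 | tr 'A-Z' 'a-z')
    printf '%s\n' "${result:-unknown}"
}

# Assumed location of the running kernel's config file.
vsyscall_build_default "/boot/config-$(uname -r)"
```

A vsyscall= parameter on the kernel command line overrides this build-time default, so a Debian kernel booted with the parameter can behave like the Ubuntu one.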
||
|
|
hchc
Veteran Cruncher USA Joined: Aug 15, 2006 Post Count: 865 Status: Offline Project Badges:
|
Update.

One month later I finally pulled the trigger on a clean install of Debian 10 (Buster). Before the install, I made sure that the OS was fully updated so that the OS string would be identical when first communicating with WCG servers. After a couple of hours of reinstalling to the same 16GB USB flash drive, using the same host name, and making sure it pulled the same IP address from my router, I attached to WCG and crossed my fingers. It worked: WCG recognized the device as the same one and kept the same Device ID/HostID. (Unfortunately, I didn't have the same luck when doing a reinstall on one of my Win10 boxes, so there's a duplicate that needs to be merged.)

Thanks for the help everyone! Still outstanding is the long-standing feature request for WCG admins to let users merge multiple Device IDs, as well as delete Device IDs with zero results.

[Edit 4 times, last edit by hchc at Aug 25, 2019 7:31:14 AM]
||
|
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7846 Status: Offline Project Badges:
|
Just for a little additional info, here is my experience. I had a Dell T7400 with dual Xeon E5410s running for a long time. It was running Linux Mint 16, I think. Well, the power supply went flaky and finally died. I took the hard drive out of the T7400 and installed it in an old Core2Duo system. It took off with no problems. I was able to crunch most of the remaining work units in the queue (a few were aborted by the system for being too old, and someone else got them).

The point is that WCG recognized the machine as the T7400. It did not register it as a new installation, even though the actual machine had been changed. When the Core2Duo is finished, I will shut it down and look for another machine with 8 cores or more to put the hard drive in. Based on this behavior, I am confident that whatever new system I place that hard drive in will continue to be recognized as the original T7400.

This makes me wonder: if I clone the drive, will I be able to have 2 or more machines all recognized as the original T7400? I may try this at some point if and when I get replacement machines.

Cheers
Sgt. Joe
*Minnesota Crunchers*
||
|
|
katoda
Senior Cruncher Poland Joined: Apr 28, 2007 Post Count: 172 Status: Offline Project Badges:
|
Surely you will not be able to have two T7400s. If you make a clone, then one of them (I think the one that is launched later) will be registered as a new machine. Keyword: rpc_seqno in client_state.xml. If you launch a device where this number is smaller than the one registered on the server side, the device will be rejected and registered as a new one, with a new hostid.

BTW, I'm not surprised that after moving the HDD to a new machine all the work was kept and no new device was registered. You kept all the BOINC configuration data and rpc_seqno was not changed, so BOINC accepted it as an old-new machine, just with a different name and hardware configuration. I have done that many times with a 100% success rate, moving between devices with different processors, core counts and Windows versions. Once I even moved a Windows device to a Linux one, and it works to this day :)

[Edit 2 times, last edit by katoda at Aug 25, 2019 9:20:38 PM]
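To look at the counter mentioned above before cloning or moving a drive, a sketch like the following may help. The function names are invented for illustration, the default path assumes the /var/lib/boinc-client data directory mentioned earlier in the thread, and `host_decision` only models the behaviour as described in this post; it is not the actual server logic.

```shell
#!/bin/sh
# Print every <rpc_seqno> value found in a client_state.xml file,
# one per line. Prints nothing if the file cannot be read.
rpc_seqnos() {
    [ -r "$1" ] || return 0
    sed -n 's/.*<rpc_seqno>\([0-9][0-9]*\)<\/rpc_seqno>.*/\1/p' "$1"
}

# Hypothetical model of the rule described in the post: a client counter
# lower than the one stored on the server is treated as a new host.
host_decision() {
    client_seqno=$1
    server_seqno=$2
    if [ "$client_seqno" -lt "$server_seqno" ]; then
        echo "new hostid"
    else
        echo "same hostid"
    fi
}

# Assumed default location; override CLIENT_STATE for other setups.
rpc_seqnos "${CLIENT_STATE:-/var/lib/boinc-client/client_state.xml}"
```

Under this model, a clone started after the original has kept talking to the server would present an older counter, which is why only one of the two copies would keep the original identity.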
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Dispelling myths in this thread: a SIGSEGV (segmentation fault) has nothing to do with the kernel in this case. Most likely what happened is that the OP did not do the Debian dist-upgrade properly, leaving a mismatch between the BOINC client itself and the libraries it uses. There is nothing incompatible with the Debian 10 (or any) kernel in this regard.

It's complicated, but in general when an application is compiled against shared libraries, it expects the entry points to functions (et al.) in those libraries to reside in specific locations (memory-wise, when loaded). When the two get out of alignment, the binary attempts to access a function in a library at the wrong place, so it's a memory violation, aka segmentation fault. (This is a gross over-simplification.)

This crash was telling us that some parts of BOINC were properly upgraded but other parts were not, so the application was crashing with a memory violation. The solution would have been to do a quick package examination on the device (a little bit of apt work, not hard) and resolve the parts of the puzzle which did not upgrade correctly. The packages are tagged with the release they were meant for (stretch, buster, etc.), so you more or less just look for "stretch" packages which were never properly upgraded to "buster" ones.

(edit: spelling)
[Edit 1 times, last edit by xithryx at Aug 25, 2019 3:22:57 PM]
||
|
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1322 Status: Offline Project Badges:
|
> Dispelling myths in this thread: a SIGSEGV (segmentation fault) has nothing to do with the kernel in this case. Most likely what happened is that the OP did not do the Debian dist-upgrade properly, leaving a mismatch between the BOINC client itself and the libraries it uses. There is nothing incompatible with the Debian 10 (or any) kernel in this regard.

Actually, it had everything to do with the kernel (or, more accurately, with the specific kernel build), and the user fixed the issue by changing a boot-time parameter. Different distributions have taken different approaches to vsyscall, the cause of the SIGSEGVs in applications. To save re-iterating the explanation, I'll quote from the post by daka on Jul 26, 2019 2:42:57 AM:

> The difference is that Ubuntu compiles their kernel with CONFIG_LEGACY_VSYSCALL_EMULATE=y, while Debian uses CONFIG_LEGACY_VSYSCALL_NONE=y. Vsyscalls are deprecated because of security problems, and more and more Linux distributions disable them by default. It's not that they are vulnerable in themselves, but if something else on your computer has a vulnerability, they can make it a lot easier to exploit.

And as for shared libraries:

> It's complicated, but in general when an application is compiled against shared libraries, it expects the entry points to functions (et al.) in those libraries to reside in specific locations (memory-wise, when loaded). When the two get out of alignment, the binary attempts to access a function in a library at the wrong place, so it's a memory violation, aka segmentation fault. (This is a gross over-simplification.)

WCG BOINC non-graphics applications are built with statically linked libraries; do an ldd on one to see. As far as I am aware, only the graphics applications use shared libraries, and it wasn't a graphics application that was failing... Lots of people using Debian Buster ran up against this, not just here but with other applications as well. (And if a shared-library application was also using vsyscalls, it too would probably fail with SIGSEGV.)

For what it's worth, I made my Ubuntu do this by turning off vsyscall emulation on one of my machines when this first cropped up, to see what would happen; one or two of my non-BOINC applications also crashed when I did so!

So I don't think there were any myths to dispel! Of course, if I have misunderstood your post, my apologies.

Al.

[Edited to alter the sequence of quotes...]
[Edit 1 times, last edit by alanb1951 at Aug 25, 2019 6:36:48 PM]
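The ldd check suggested above can be wrapped in a small helper. This is a rough sketch: the function name `linkage_from_ldd_output` and the `APP` variable are invented here, `APP` is only a placeholder for the path of a project binary, and the matched messages are the ones glibc's ldd prints, so other libc implementations may word them differently.

```shell
#!/bin/sh
# Classify a binary from the text that ldd printed for it.
# glibc's ldd says "not a dynamic executable" (or "statically linked"
# for static PIE binaries) when there are no shared-library dependencies;
# dynamic binaries produce "name => path" lines.
linkage_from_ldd_output() {
    case $1 in
        *"not a dynamic executable"*|*"statically linked"*) echo static ;;
        *"=>"*) echo dynamic ;;
        *) echo unknown ;;
    esac
}

# Placeholder path: point APP at a binary under the BOINC projects directory.
APP=${APP:-/bin/sh}
linkage_from_ldd_output "$(ldd "$APP" 2>&1)"
```

A "static" result for a science application would be consistent with the point that a leftover shared library from the old release could not have caused its crash.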