| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 567
|
|
|
|
TLD
Veteran Cruncher USA Joined: Jul 22, 2005 Post Count: 856 Status: Offline Project Badges:
|
My guess is it has "something" to do with WSL, take a look at this workunit: https://www.worldcommunitygrid.org/contributi...071_9340,2,-Result%20name

The one on the top appears to be due to an installation error. I have had no MCM errors or invalids after installing it over the weekend. However, since the validator appears to match OS with OS (all the ones I've looked at have Alpine Linux matches), I could see this slowing things down due to it being less common than simply Windows. Running the TOP command, I don't see MCM actually running on WSL, which I wouldn't expect it to. At this point, I'm leaning toward uninstalling the feature.

I run Ubuntu on WSL2 on a Windows machine. I run a couple of projects that only have Linux WUs. WCG MCM WUs run fine on the system. |
||
|
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1316 Status: Offline Project Badges:
|
Regarding the possible effect of incorrect O/S information on validation and work availability...
In summary, bad O/S identification shouldn't affect validation, but will be annoying when looking at results on the web site (or via an API) and can be a real nuisance to the scheduler (as we may have been seeing recently...)

The work availability issue has already been discussed here, but it may be worth a [partial] refresher on how Homogeneous Redundancy (HR) influences task selection, using standard BOINC code as a reference point -- I can't remember when that was last discussed in detail...

If HR is used for a project (true for WCG in all cases), the scheduler runs various platform-matching checks using O/S and CPU information (the platform information in the scheduler request isn't always suitable for that as it simply specifies an executable file type but doesn't identify critical [Windows] O/S subset cases...) -- there is a core set of tests done against O/S information (mostly for Windows) and against CPU types (as per host_info held or supplied). However, as far as I can tell it always uses the platform information (e.g. x86_64-pc-linux-gnu) for application selection. (And WCG don't seem to support Anonymous Platform, so that's one difficulty out of the way!)

This should mean that anything for a given WU that manages to download o.k., run, and return with a success state should not cause major trouble for the validator (and should usually validate!)

In standard BOINC the validator wrapper (non-project-specific part) isn't interested in platform differences and I see no reason why WCG would have changed that. The wrapper doesn't care whether the project uses HR or not, and trusts the scheduler to have done the task allocation correctly at the time of the host's work request. It just passes validation-related information to the project-specific validator subroutine and deals with the success or failure return codes.

Cheers - Al. |
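The HR idea described above can be illustrated with a minimal sketch. This is not BOINC's scheduler code: the function names, the keyword matching, and the two-part class are assumptions made purely for illustration. It only shows why a Windows host that reports a Linux O/S name would end up being matched with Linux wingmen.

```python
# Illustrative only: a coarse stand-in for a Homogeneous Redundancy class.
# The real BOINC scheduler has its own rules; everything here is assumed.

def hr_class(os_name, cpu_vendor):
    """Bucket a host by the O/S name and CPU vendor it reports."""
    os_lower = os_name.lower()
    if "windows" in os_lower:
        os_family = "windows"
    elif "linux" in os_lower:
        os_family = "linux"
    elif "darwin" in os_lower or "mac" in os_lower:
        os_family = "mac"
    else:
        os_family = "unknown"

    vendor_lower = cpu_vendor.lower()
    if "intel" in vendor_lower:
        cpu_family = "intel"
    elif "amd" in vendor_lower:
        cpu_family = "amd"
    else:
        cpu_family = "other"
    return (os_family, cpu_family)


def can_share_workunit(host_a, host_b):
    """Two hosts may receive replicas of one WU only if their classes match."""
    return hr_class(*host_a) == hr_class(*host_b)


# A Windows machine that reports itself as Alpine Linux lands in the
# Linux bucket, so it would be paired with Linux hosts, not Windows ones.
misreported = ("Alpine Linux", "GenuineIntel")
real_windows = ("Microsoft Windows 11", "GenuineIntel")
real_linux = ("Linux Ubuntu", "GenuineIntel")
print(can_share_workunit(misreported, real_windows))
print(can_share_workunit(misreported, real_linux))
```

The point of the sketch is that the class is derived from what the host reports, not from the application it actually runs, which is why a wrong O/S name could narrow the pool of eligible wingmen.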
||
|
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1316 Status: Offline Project Badges:
|
TLD -- thanks for confirming successful use of a BOINC client running under WSL2.
I seem to recall seeing examples of part of some Windows-hosted scheduler requests that had a lot of supplementary tags specific to WSL and/or Docker. A request from any of my systems on an 8.x client has <platform_name>x86_64-pc-linux-gnu</platform_name> near the front of sched_request XML files and <os_name>Linux Ubuntu</os_name> at the end of the host_info section. And that's all there is that might relate to the current discussion...

It would be interesting to know what appears in those two places in a WSL2 request to (say) CPDN and in a Windows request from the same host to WCG if it's running a Windows client as well, not only as a way to see OS Type and Version information but also to find out whether there are extra items that are related to the use of WSL2. I'd also be interested in what is there when a user has installed Docker, but I don't know whether any of our regulars here have that situation nowadays...

Cheers - Al. |
||
|
|
TLD
Veteran Cruncher USA Joined: Jul 22, 2005 Post Count: 856 Status: Offline Project Badges:
|
It would be interesting to know what appears in those two places in a WSL2 request to (say) CPDN and in a Windows request from the same host to WCG if it's running a Windows client as well, not only as a way to see OS Type and Version information but also to find out whether there are extra items that are related to the use of WSL2. I'd also be interested in what is there when a user has installed Docker, but I don't know whether any of our regulars here have that situation nowadays... Cheers - Al.

I don't use the Windows BOINC client at the same time as the Ubuntu BOINC client.

edit: I have cloned the windows IP to WSL2.

Ubuntu BOINC client on WSL2

CPDN - sched_request_climateprediction.net.xml

    <platform_name>x86_64-pc-linux-gnu</platform_name>
    <alt_platform>
        <name>i686-pc-linux-gnu</name>
    </alt_platform>
    <os_name>Linux Ubuntu</os_name>
    <os_version>Ubuntu 24.04.3 LTS [6.6.87.2-microsoft-standard-WSL2|libc 2.39]</os_version>
    <n_usable_coprocs>0</n_usable_coprocs>
    <wsl_available>0</wsl_available>

WCG - sched_request_www.worldcommunitygrid.org

    <platform_name>x86_64-pc-linux-gnu</platform_name>
    <alt_platform>
        <name>i686-pc-linux-gnu</name>
    </alt_platform>
    <os_name>Linux Ubuntu</os_name>
    <os_version>Ubuntu 24.04.3 LTS [6.6.87.2-microsoft-standard-WSL2|libc 2.39]</os_version>
    <n_usable_coprocs>0</n_usable_coprocs>
    <wsl_available>0</wsl_available>

Windows BOINC client - with WSL2 on it.

CPDN - sched_request_climateprediction.net.xml

    <platform_name>windows_x86_64</platform_name>
    <alt_platform>
        <name>windows_intelx86</name>
    </alt_platform>
    <os_name>Microsoft Windows 11</os_name>
    <os_version>Core x64 Edition, (10.00.26100.00)</os_version>
    <n_usable_coprocs>1</n_usable_coprocs>
    <wsl>
    </wsl>

WCG - sched_request_www.worldcommunitygrid.org.xml

    <platform_name>windows_x86_64</platform_name>
    <alt_platform>
        <name>windows_intelx86</name>
    </alt_platform>
    <os_name>Microsoft Windows 11</os_name>
    <os_version>Core x64 Edition, (10.00.26100.00)</os_version>
    <n_usable_coprocs>1</n_usable_coprocs>
    <wsl>
        <distro>
            <distro_name>Ubuntu</distro_name>
            <os_name>Ubuntu</os_name>
            <os_version>Ubuntu 24.04.3 LTS</os_version>
            <wsl_version>2</wsl_version>
            <is_default/>
            <libc_version>2.39</libc_version>
        </distro>
    </wsl>

[Edit 1 times, last edit by TLD at Dec 2, 2025 11:19:27 PM] |
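For anyone wanting to pull the same fields out of their own request files, here is a rough sketch using Python's standard XML parser. The tag names are the ones shown in the post above. The `<request>` wrapper element, the `summarize_request` function, and the assumption that host-level `os_name`/`os_version` can be told apart from the per-distro ones by their parent element are mine, and a real sched_request file may be laid out differently.

```python
import xml.etree.ElementTree as ET

# Fragment from the Windows-client WCG request quoted above, wrapped in a
# hypothetical <request> root so that it parses as a single document.
SAMPLE = """<request>
<platform_name>windows_x86_64</platform_name>
<os_name>Microsoft Windows 11</os_name>
<os_version>Core x64 Edition, (10.00.26100.00)</os_version>
<wsl>
  <distro>
    <distro_name>Ubuntu</distro_name>
    <os_name>Ubuntu</os_name>
    <os_version>Ubuntu 24.04.3 LTS</os_version>
    <wsl_version>2</wsl_version>
    <is_default/>
    <libc_version>2.39</libc_version>
  </distro>
</wsl>
</request>"""


def summarize_request(xml_text):
    """Return the host-level platform/O/S fields plus any WSL distro entries."""
    root = ET.fromstring(xml_text)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}

    def host_level(tag):
        # Skip tags nested inside a <distro>; those describe the WSL guest.
        for element in root.iter(tag):
            parent = parent_of.get(element)
            if parent is None or parent.tag != "distro":
                return (element.text or "").strip()
        return None

    distros = [
        {field.tag: (field.text or "").strip() for field in distro}
        for distro in root.iter("distro")
    ]
    return {
        "platform_name": host_level("platform_name"),
        "os_name": host_level("os_name"),
        "os_version": host_level("os_version"),
        "wsl_distros": distros,
    }


summary = summarize_request(SAMPLE)
print(summary["platform_name"], "|", summary["os_name"])
for distro in summary["wsl_distros"]:
    print("WSL distro:", distro.get("distro_name"), "version", distro.get("wsl_version"))
```

To run it against a real file, read the file's text and pass it to `summarize_request` in place of `SAMPLE`.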
||
|
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1316 Status: Offline Project Badges:
|
Thanks, TLD -- that's very thorough, and useful as it shows that proper set-up of a WSL2 client doesn't seem to put anything in the scheduler requests that might upset the server with only one BOINC client in play, whether the requests are from the Linux client or the Windows one...
I suspect the same would be true on a system with both clients active at the same time as long as they have distinct host ids :-)

I'm reminded of some discussions that took place when users were noticing that result reports for some items would flip to and fro between Linux and Windows -- whilst the cause of the server-side behaviour was identified then, I'm not sure we managed to totally pin the blame on Docker installs (good or bad).

I also wonder how up-to-date the server needs to be to make use of anything in that wsl section of a scheduler request -- that, however, is unlikely to be of any import regarding the O/S flip-flop behaviour...

Cheers - Al.

P.S. -- over the past few days I've been seeing a fair few of my wingmen reporting in with the [WSL2] O/S details you've shown above; I've not checked to see how many different hosts might be involved, though -- I ought to do that when I have time...

[Edit 2 times, last edit by alanb1951 at Dec 3, 2025 2:07:58 AM] |
||
|
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1316 Status: Offline Project Badges:
|
Whilst checking on the latest set of Error tasks (all download errors, as expected!) I noticed that I've had quite a few Server Aborted retries early this morning (all for missed deadline tasks that clocked in late).
On examination, I found that several of them had an initial pair reporting Alpine Linux v3.21 or Linux Docker Desktop but actually running the 64-bit Windows application! When one of them got detected as No Reply at the deadline I happened to collect a Linux retry (as indicated by the O/S-derived platforms of the initial wingmen) -- they all went to a system running a 64-task cache and a maximum of 8 tasks at once, so none of them had started when the original result got reported late...

As it seemed likely that my other systems might have got some similar retries I had a brief look at some of my returned retries on other systems and found one or two that will be informative at validation time.

The first one I checked was MCM1_0242978_2585 (WU ID 781531938) -- both initial tasks went to Windows systems, but one reported it was Docker Desktop and the other that it was Alpine Linux (but ran the Windows application)! The Docker Desktop one failed (Not started by deadline, which WCG doesn't display as such), triggering the retry I received and processed with the 32-bit Linux application. As it is a retry, I suspect it might be several days before the validator has a go at it, but it'll be interesting to see whether it does validate.

However, after a bit more digging I found another one that introduced another misidentification candidate, Fedora Linux 41 (Container Image)... This was MCM1_0242951_6988 (WU ID 781267674) which had one initial task reporting as Alpine Linux and one as Fedora Linux (but both running the 64-bit Windows application). The Fedora one missed the deadline, so I got yet another 32-bit retry (which returned 90 minutes before the original task eventually reported in), so there were two Windows tasks pretending to be Linux and my genuine Linux task. All three validated...

It might be o.k. to miss out on Homogeneous Redundancy for MCM1 using the current CPU applications, but if the same happened to an ARP1 WU I don't think the outcome would be as favourable.

By the way, can someone tell me if Alpine Linux is what is reported for the original [default?] WSL?

Cheers - Al. |
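The kind of check described above (reported O/S says Linux, application actually run is the Windows one) could be automated along these lines. This is a sketch under assumptions: the record layout, the host labels, and the helper names are hypothetical, WCG does not necessarily expose results in this form, and the sample rows only restate the cases mentioned in the post rather than real exported data.

```python
# Hypothetical helpers: flag results whose reported O/S family disagrees
# with the platform of the application that was actually run.

def os_family(os_name):
    name = os_name.lower()
    if "windows" in name:
        return "windows"
    if "linux" in name:
        return "linux"
    return "unknown"


def platform_family(platform_name):
    name = platform_name.lower()
    if name.startswith("windows"):
        return "windows"
    if name.endswith("linux-gnu"):
        return "linux"
    return "unknown"


def find_mismatches(results):
    """Return the records where O/S family and application platform differ."""
    return [
        record for record in results
        if os_family(record["os"]) != platform_family(record["platform"])
    ]


# Illustrative rows modelled on the second workunit discussed above.
RESULTS = [
    {"host": "wingman A", "os": "Alpine Linux v3.21", "platform": "windows_x86_64"},
    {"host": "wingman B", "os": "Fedora Linux 41 (Container Image)", "platform": "windows_x86_64"},
    {"host": "retry host", "os": "Linux Ubuntu", "platform": "i686-pc-linux-gnu"},
]

for record in find_mismatches(RESULTS):
    print(record["host"], "reports", record["os"], "but ran", record["platform"])
```

Only the two wingmen are flagged; the genuine Linux retry passes because its O/S name and application platform agree.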
||
|
|
TLD
Veteran Cruncher USA Joined: Jul 22, 2005 Post Count: 856 Status: Offline Project Badges:
|
From the research I did before installing WSL2 you can shoehorn any Linux distro onto WSL but the standard ones (ready to install) can be installed from the Microsoft store here - https://apps.microsoft.com/search?query=WSL+distros&hl=en-US&gl=US
|
||
|
|
MJH333
Senior Cruncher England Joined: Apr 3, 2021 Post Count: 300 Status: Offline Project Badges:
|
By the way, can someone tell me if Alpine Linux is what is reported for the original [default?] WSL?

Al,

When I was having this problem, my OS was sometimes reported as Alpine Linux, sometimes as Docker Desktop and sometimes as Ubuntu (which was the flavour of Linux I had installed under WSL).

Cheers, Mark |
||
|
|
Paul Schlaffer
Senior Cruncher USA Joined: Jun 12, 2005 Post Count: 278 Status: Offline Project Badges:
|
By the way, can someone tell me if Alpine Linux is what is reported for the original [default?] WSL? Cheers - Al.

Alpine Linux is the recommended package download linked on the BOINC site, which is why I installed it. That's why you're seeing this. https://github.com/BOINC/boinc/wiki/Installing-Docker-on-Windows

As I noted before, the install went well and it was operating fine with WSL2 enabled. In the BOINC event log, it shows as this:

OS: Microsoft Windows 11: Professional x64 Edition, (10.00.26200.00)
Memory: 127.87 GB physical, 135.87 GB virtual
Disk: 32.23 GB total, 23.40 GB free
Local time is UTC -5 hours
Usable WSL distros: boinc-buda-runner (WSL 2)
OS: Alpine Linux (Alpine Linux v3.22)
Docker version 5.6.2 (podman)
BOINC WSL distro version 4

Note there was a "default" in parentheses after WSL2 on that line which I had to delete. The forum kept returning a forbidden error with that in. (See the bottom of the page on the link above.)

In the WU detail, the scheduler was sending the WU to both "Alpine Linux" machines. The WU application was a Windows WU as it should be, and I verified the WU wasn't running under WSL2. Why WCG isn't showing Windows as the OS is something I'd like to know. If it's sending a Windows WU, I'd expect the WU detail to state Windows as the OS. If it's sending a Linux WU to operate under WSL2, then I'd expect it to reflect the distro installed.

Last night, I rolled the installation back to 8.2.8 without WSL2 until this gets sorted out. While no errors were encountered and it was operating, I didn't want the scheduler to try and find a second Alpine Linux machine when sending a WU.

Al, also thank you for all the responses.
“Where an excess of power prevails, property of no sort is duly respected. No man is safe in his opinions, his person, his faculties, or his possessions.” – James Madison (1792)
[Edit 5 times, last edit by Paul Schlaffer at Dec 3, 2025 3:12:18 PM] |
||
|
|
flensr
Cruncher Joined: Oct 31, 2018 Post Count: 25 Status: Offline Project Badges:
|
It looks like I'm getting a bunch of WUs almost every day, but validation is still falling farther and farther behind. At a glance it looks like about 10-20% of my returned WUs are being validated every day; the rest, going back to August 27, are still waiting.
|
||
|
|
|