| World Community Grid Forums
Thread Status: Active Total posts in this thread: 7
buscher
Cruncher Joined: Oct 3, 2011 Post Count: 5 Status: Offline
Disclaimer: Maybe this was already asked before and I just cannot find the thread; feel free to point me to the answer.

Hello,

I am a Linux (Gentoo) user, and while looking at the "Pending Validation" or "Valid" Work Units (WUs), I noticed that the other system the WU is sent to is always a Linux system too. Why do I "never" see a Windows system there? Is this on purpose? Is there a technical reason? Can WUs calculated by a Linux system not be compared to WUs calculated by a Windows system? I am not saying that this is bad in any way (or that it must be changed); I am just curious why it is like this :)

Thanks in advance for the answer!

[Edit 1 times, last edit by buscher at Jan 22, 2023 5:17:12 PM]
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1317 Status: Offline
There's a facility in BOINC called homogeneous redundancy that allows work management based on various aspects of host systems, including the operating system (as observed here) and/or specific aspects of the actual processor in use.

The theory is that if results are only validated against results from similar hardware on similar operating systems, there should be fewer validation errors resulting from [subtle] differences in calculation outcomes caused by differences in hardware, libraries or generated code.

In some cases it's essential. ARP1 is an example, because bit-wise identity of returned results is required, and the best chance of that is to have both a single O/S and "bit-ness" for all tasks for a given work unit (32-bit and 64-bit binaries use different floating point instruction sets...). In other cases it might be less necessary, depending on how validation is done...

As far as I know, WCG always does this for CPU-based projects.

Hope that helps.

Cheers - Al.
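As a rough illustration of the idea described above, here is a minimal sketch of how a work unit might be tied to one class of hosts. It is not BOINC's actual scheduler code: the `Host`, `WorkUnit`, `hr_class`, `try_assign` and `bitwise_match` names are hypothetical, and the class is assumed to be just "OS family plus bit-ness".

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Host:
    name: str
    os_family: str  # e.g. "linux" or "windows" (assumed labels)
    bits: int       # 32 or 64


def hr_class(host: Host) -> Tuple[str, int]:
    """Hypothetical equivalence class: same OS family and same bit-ness."""
    return (host.os_family, host.bits)


@dataclass
class WorkUnit:
    name: str
    committed: Optional[Tuple[str, int]] = None  # class of the first host served

    def try_assign(self, host: Host) -> bool:
        """Hand out a task only if the host matches the committed class."""
        if self.committed is None:
            self.committed = hr_class(host)
            return True
        return self.committed == hr_class(host)


def bitwise_match(result_a: bytes, result_b: bytes) -> bool:
    """Strict validation: results must be byte-for-byte identical."""
    return result_a == result_b


wu = WorkUnit("example_wu")
gentoo = Host("gentoo-box", "linux", 64)
ubuntu = Host("ubuntu-box", "linux", 64)
win11 = Host("win-box", "windows", 64)

first = wu.try_assign(gentoo)   # commits the work unit to ("linux", 64)
second = wu.try_assign(win11)   # refused: different class
third = wu.try_assign(ubuntu)   # accepted: same class
```

In this simplified model a Linux host's wingman is always another Linux host, which matches what the original question observed.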
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2346 Status: Offline
Thanks for your insight, Al.

You wrote: "As far as I know, WCG always does this for CPU-based projects."

The original question was asked with Quorum=2 in mind, and you answered that properly. It may be redundant to mention, but I would like to add that with Quorum=1 there is a chance that a workunit can span more than one operating system ("OS type"). Here is an example from the past:

Project Name: Smash Childhood Cancer

You may wonder how this came about. Well, here is part 2 of the show, same workunit (from when task data didn't fit on one line):

Result Name / Status / Sent Time / Time Due / Return Time / Time / Claim/Grant

The task that errored out was reported as: "process got signal 7". (This looks like SIGBUS to me.)

Adri

[Edit 1 times, last edit by adriverhoef at Jan 23, 2023 1:11:01 AM]
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1317 Status: Offline
Adri,

I obviously didn't go into as much detail as I could have done about how homogeneous redundancy works :-) [1]

I sometimes get MCM1 retries for work units where the first two tasks both went out to Windows systems, but they returned results with a non-success state close enough together that the generated retries became available to any platform. As you noted, the forced platform selection does not apply if no tasks are still out in the field (and adaptive replication [as per your example] is probably the most common reason for platform switches...). Under normal circumstances I tend to see these when the Windows tasks go to Windows systems that fail with some variant of "can't start process", but it became somewhat more common when the download problems were at their height :-)

This sort of stuff really belongs in a set of new [or reorganized] FAQs. Sekerob and others did good work 12 or more years ago, but much has changed since then, and some weeding out and re-writing of that material might be a future project(?), if only we could be sure that people would even look there in the first place! :-)

Cheers - Al.

P.S. In my earlier post I could/should also have noted that homogeneous redundancy is the reason we often see messages that tasks are committed to other platforms when doing work requests :-)

[1] Quoting large sections of the BOINC Wiki entry didn't really seem like a good use of space for a single post :-)

[Edit 1 times, last edit by alanb1951 at Jan 23, 2023 3:25:25 PM]
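The retry behaviour described in this post can be sketched roughly as follows. This is a simplified model under stated assumptions, not the real BOINC or WCG logic: the `Task` and `WorkUnit` classes, the state names and the `send` method are all hypothetical. The only rule modelled is that a work unit stays tied to a platform while at least one of its tasks is still out in the field.

```python
from dataclasses import dataclass, field
from typing import List, Optional

IN_PROGRESS = "in_progress"  # assumed state names, for illustration only
ERROR = "error"


@dataclass
class Task:
    platform: str
    state: str = IN_PROGRESS


@dataclass
class WorkUnit:
    tasks: List[Task] = field(default_factory=list)

    def locked_platform(self) -> Optional[str]:
        """Platform of any task still in the field, or None if nothing is out."""
        for task in self.tasks:
            if task.state == IN_PROGRESS:
                return task.platform
        return None

    def send(self, platform: str) -> bool:
        """Send a new task only if it does not conflict with the current lock."""
        lock = self.locked_platform()
        if lock is not None and lock != platform:
            return False
        self.tasks.append(Task(platform))
        return True


wu = WorkUnit()
wu.send("windows")
wu.send("windows")
blocked = wu.send("linux")   # refused: Windows tasks are still out

# Both Windows tasks come back with a non-success state.
for task in wu.tasks:
    task.state = ERROR

retry = wu.send("linux")     # accepted: nothing is in the field any more
```

In this model the retry can land on Linux only because both Windows tasks had already failed before the retry was generated.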
Chief Gonzo
Cruncher Joined: Jul 3, 2008 Post Count: 15 Status: Offline
I'm getting a lot of errors on MCM, but only on one of the three computers that I am currently running, and I have no idea why or how to investigate. One thing I notice is that this computer is apparently incorrectly considered to be running Windows 10 when in fact it was upgraded to Windows 11 some months ago. Might that cause minor differences in the results, which are then flagged as errors?
Bryn Mawr
Senior Cruncher Joined: Dec 26, 2018 Post Count: 384 Status: Offline
"I'm getting a lot of errors on MCM, but only on one of the three computers that I am currently running, and I have no idea why or how to investigate. One thing I notice is that this computer is apparently incorrectly considered to be running Windows 10 when in fact it was upgraded to Windows 11 some months ago. Might that cause minor differences in the results, which are then flagged as errors?"

What type of errors? Could you link to some of the error tasks?

Have you checked for hardware errors, e.g. run memtest etc.?
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 2173 Status: Offline
"I'm getting a lot of errors on MCM, but only on one of the three computers that I am currently running, and I have no idea why or how to investigate. One thing I notice is that this computer is apparently incorrectly considered to be running Windows 10 when in fact it was upgraded to Windows 11 some months ago. Might that cause minor differences in the results, which are then flagged as errors?"

That is very thin on information about what kind of errors you are getting.

As for the OS misidentification, that is an occasional BOINC client problem. I have a host that was upgraded to Windows 10 about 3 years ago, but to this day it shows up as Windows 8.1 (which it originally had when new). Not that I really care, as that is inconsequential for actual operation...

Ralf