World Community Grid Forums
Category: Active Research Forum: Africa Rainfall Project Thread: invalid |
Thread Status: Active Total posts in this thread: 15
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 859 Status: Offline Project Badges: |
I have a new machine, and haven't run ARP on this machine before. I'm watching it closely.
I have one valid result and one invalid result so far. This is the invalid one: https://www.worldcommunitygrid.org/contribution/workunit/650804920
I think it is because the other two systems have matching client versions and mine is different. I just wanted to see if anyone had any different ideas. I don't think I'm sending back bad data; I just think things don't match (in minor ways) across different OSes and/or client versions. That isn't great, but it's nothing to worry about. Ugh, I hope that doesn't make my system untrustworthy and prevent me from getting ARP.
----------------------------------------
[Edit 1 times, last edit by Unixchick at Jan 21, 2025 6:25:08 PM] |
||
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 859 Status: Offline Project Badges: |
I'm getting ARP WUs again. I'm still coming up with an invalid one here and there, and I can't see a reason why. I'm getting credit even for my invalid ones. It is strange.
|
||
|
Bryn Mawr
Senior Cruncher Joined: Dec 26, 2018 Post Count: 337 Status: Offline Project Badges: |
Quote (Unixchick): I'm getting ARP WUs again. I'm still coming up with an invalid one here and there, and I can't see a reason why. I'm getting credit even for my invalid ones. It is strange.
What does the Stderr say? (The link in your first post is invalid for me.) |
||
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 859 Status: Offline Project Badges: |
https://www.worldcommunitygrid.org/contribution/workunit/652259399
Here is a link to the most recent invalid. It doesn't give me any info, as the ones that are valid and invalid, look the same. I haven't found any clue to when it is invalid. |
||
|
William Albert
Cruncher Joined: Apr 5, 2020 Post Count: 36 Status: Offline Project Badges: |
Quote (Unixchick): https://www.worldcommunitygrid.org/contribution/workunit/652259399 Here is a link to the most recent invalid. It doesn't give me any info, as the ones that are valid and invalid, look the same. I haven't found any clue to when it is invalid.
I wouldn't expect any useful log information from an Invalid WU, since if something went wrong during the processing in a way that actually produced a log entry, the WU would likely be logged as "Error" instead.
"Invalid" is BOINC's way of saying, "You've finished processing the work, but the results you reported don't match with the results that others reported."
While this could potentially be caused by a software issue that leads an application to produce different results when run on different platforms/hardware (and I don't think WCG makes computer information available in the way that other projects do, so that would be difficult for users to troubleshoot), the more likely explanation is that your hardware is not entirely stable. |
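To make the "Invalid" outcome concrete, here is a minimal Python sketch of quorum-style checking. It is not BOINC's or WCG's actual validator. The `classify` function, the `min_quorum` parameter, the host names, and the exact-equality comparison are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence

Result = Sequence[float]


def classify(
    results: Dict[str, Result],
    agree: Callable[[Result, Result], bool],
    min_quorum: int = 2,
) -> Dict[str, str]:
    """Label each host's result 'valid', 'invalid', or 'inconclusive'.

    Hypothetical logic: find the largest group of hosts whose results
    agree with one candidate result; if that group reaches the quorum,
    its members are valid and everyone else is invalid.
    """
    best_group: List[str] = []
    for candidate in results.values():
        group = [host for host, r in results.items() if agree(candidate, r)]
        if len(group) > len(best_group):
            best_group = group
    if len(best_group) < min_quorum:
        return {host: "inconclusive" for host in results}
    return {
        host: ("valid" if host in best_group else "invalid")
        for host in results
    }


def exact(a: Result, b: Result) -> bool:
    """Assumed comparison: every reported value must be identical."""
    return list(a) == list(b)


# Made-up reports: host_c finished cleanly but differs in one value.
reports = {
    "host_a": [1.0, 2.0, 3.0],
    "host_b": [1.0, 2.0, 3.0],
    "host_c": [1.0, 2.0, 3.0000001],
}
labels = classify(reports, exact)
print(labels)
```

In this sketch `host_c` ran to completion without any error, yet it is marked invalid purely because it is outvoted, which is why its log would look the same as a valid one.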
||
|
cliviafreak
Cruncher Joined: Jan 13, 2025 Post Count: 16 Status: Recently Active Project Badges: |
With floating-point computation different hardware always returns a slightly different result. It may be that your hardware is reporting just outside a specified range of tolerance (like x plus or minus 0.001% or something).
I work with scientific computing software, and when validating results we always have to do so within a tolerance. We are just guessing until new hardware comes along and goes outside that tolerance. At that point, our engineers have to go and recheck everything by hand (or use intuition) just to be certain that we didn't make a mistake the first time and that the new hardware isn't crazy.
Of course, that only applies if the project is using floating point and not exact math (which is possible, but exact math is quite slow). Basically, to "match" is a fuzzy concept.
----------------------------------------
[Edit 2 times, last edit by cliviafreak at Jan 24, 2025 5:38:52 PM] |
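As a rough illustration of matching "within tolerance", here is a short Python sketch. The 0.001% figure comes from the guess above and is not ARP's real threshold. The `within_tolerance` name and the sample values are assumptions.

```python
import math
from typing import Sequence


def within_tolerance(
    a: Sequence[float],
    b: Sequence[float],
    rel_tol: float = 1e-5,  # 0.001% expressed as a fraction (assumed value)
) -> bool:
    """Return True if two result vectors agree element by element.

    Two values agree when their difference is at most rel_tol times
    the larger of their magnitudes (this is what math.isclose checks).
    """
    if len(a) != len(b):
        return False
    return all(math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b))


# Made-up numbers standing in for values reported by three machines.
reference = [100.0, 250.0]
slightly_off = [100.0005, 250.001]  # relative differences below 0.001%
too_far_off = [100.002, 250.0]      # first value differs by about 0.002%

close_enough = within_tolerance(reference, slightly_off)
rejected = within_tolerance(reference, too_far_off)
print(close_enough, rejected)
```

Under these assumptions, a machine whose results land just outside the chosen band is rejected even though nothing on it crashed or logged an error.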
||
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 859 Status: Offline Project Badges: |
I think cliviafreak is right because sometimes I get a valid, and sometimes I don't. It depends on who I'm matched with. I hoped in posting this, that the techs would be aware of the issue and let me know if I should stop doing ARP or if it is ok.
|
||
|
William Albert
Cruncher Joined: Apr 5, 2020 Post Count: 36 Status: Offline Project Badges: |
Quote (cliviafreak): With floating-point computation different hardware always returns a slightly different result.
All processors that ARP would run on have floating-point hardware that complies with IEEE 754. I would not expect different results to computations -- even floating-point -- unless the computation depends on a random-number generator at some point. |
||
|
cliviafreak
Cruncher Joined: Jan 13, 2025 Post Count: 16 Status: Recently Active Project Badges: |
I admit, yes, if the hardware fully complies and strict mode is used at all levels from compiler to hardware, then the results shouldn't change. To turn on that strictness severely limits the performance of software.
The reality, though, is that strictness is almost never fully adhered to and different hardware tries its best to comply. It never fully does. Further, compiler writers make mistakes. There are too many variables on the way from code->machine->execution to ensure floating-point computation is the same across all hardware and compilers.
And people want the performance! Even scientists running important simulations. If it's so important that a few decimal places matter, they use arbitrary precision numbers.
I'm not dismissing the possibility there is a fault in Unixchick's hardware, just that it is not the most likely explanation.
----------------------------------------
[Edit 2 times, last edit by cliviafreak at Jan 24, 2025 10:08:17 PM] |
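One way results can differ even on fully IEEE 754-compliant hardware is a change in the order of operations, because floating-point addition is not associative. The Python sketch below shows this with three values. Whether ARP's builds actually reorder arithmetic across platforms is an assumption, not something established in this thread. The `sum_in_order` helper is hypothetical.

```python
import math
from typing import Iterable


def sum_in_order(values: Iterable[float]) -> float:
    """Add the values strictly left to right, one rounding per addition."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [0.1, 0.2, 0.3]

forward = sum_in_order(values)             # (0.1 + 0.2) + 0.3
backward = sum_in_order(reversed(values))  # (0.3 + 0.2) + 0.1

print(forward, backward, forward == backward)
# The two sums are extremely close, but not bit-for-bit identical.
print(math.isclose(forward, backward, rel_tol=1e-12))
```

Each individual addition here is correctly rounded, so both machines in such a scenario would be "compliant". The discrepancy comes only from the grouping, which a compiler optimization or a different vector width can change.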
||
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 859 Status: Offline Project Badges: |
My hardware is a brand-new Mac mini.
|
||
|
|