Thread Status: Active
Total posts in this thread: 15
Posts: 15   Pages: 2
This topic has been viewed 356 times and has 14 replies
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 859
Status: Offline
invalid

I have a new machine, and haven't run ARP on this machine before. I'm watching it closely.

I have one valid result and one invalid result so far.
This is the invalid one: https://www.worldcommunitygrid.org/contribution/workunit/650804920

I think it is because the other two systems have matching client versions and mine is different. Just wanted to see if anyone had any different ideas.

I don't think I'm sending back bad data; I just think the results don't match (in minor ways) across different OS and/or client versions. That isn't great, but it's nothing to worry about.

Ugh, I hope that doesn't make my system untrustworthy and prevent me from getting ARP.
----------------------------------------
[Edit 1 times, last edit by Unixchick at Jan 21, 2025 6:25:08 PM]
[Jan 21, 2025 6:22:42 PM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 859
Status: Offline
Re: invalid

I'm getting ARP WUs again. I'm still coming up with an invalid one here and there, and I can't see a reason why. I'm getting credit even for my invalid ones. It is strange.
[Jan 24, 2025 5:43:25 AM]
Bryn Mawr
Senior Cruncher
Joined: Dec 26, 2018
Post Count: 337
Status: Offline
Re: invalid

I'm getting ARP WUs again. I'm still coming up with an invalid one here and there, and I can't see a reason why. I'm getting credit even for my invalid ones. It is strange.


What does the Stderr say? (The link in your first post is invalid for me).
[Jan 24, 2025 11:44:47 AM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 859
Status: Offline
Re: invalid

https://www.worldcommunitygrid.org/contribution/workunit/652259399

Here is a link to the most recent invalid. It doesn't give me any info, as the ones that are valid and the ones that are invalid look the same. I haven't found any clue as to when a result comes back invalid.
[Jan 24, 2025 4:45:12 PM]
William Albert
Cruncher
Joined: Apr 5, 2020
Post Count: 36
Status: Offline
Re: invalid

https://www.worldcommunitygrid.org/contribution/workunit/652259399

Here is a link to the most recent invalid. It doesn't give me any info, as the ones that are valid and invalid, look the same. I haven't found any clue to when it is invalid.


I wouldn't expect any useful log information from an Invalid WU, since if something went wrong during the processing in a way that actually produced a log entry, the WU would likely be logged as "Error" instead.

"Invalid" is BOINC's way of saying, "You've finished processing the work, but the results you reported don't match with the results that others reported."

While this could potentially be caused by a software issue that leads an application to produce different results when run on different platforms/hardware (and I don't think WCG makes computer information available in the way that other projects do, so that would be difficult for users to troubleshoot), the more likely explanation is that your hardware is not entirely stable.
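To make the "doesn't match what others reported" idea concrete, here is a minimal sketch of majority-style validation. It is only an illustration: the host names, the result strings, and the `validate_quorum` function are all made up, and the validator WCG actually runs for ARP may work quite differently.

```python
from collections import Counter

def validate_quorum(results, min_quorum=2):
    """Label each host's result 'valid' or 'invalid' by majority agreement.

    `results` maps a (hypothetical) host name to the value it reported.
    Returns None when no single value was reported by at least
    `min_quorum` hosts, i.e. there is no consensus yet.
    """
    counts = Counter(results.values())
    value, votes = counts.most_common(1)[0]
    if votes < min_quorum:
        return None  # no consensus; a project would typically issue more copies
    return {
        host: ("valid" if reported == value else "invalid")
        for host, reported in results.items()
    }

# Two hosts agree, the third reports something different.
outcome = validate_quorum({"host_a": "3f9c", "host_b": "3f9c", "host_c": "3f9d"})
print(outcome)
```

Note that nothing in this scheme says the odd one out was wrong in any absolute sense, only that it disagreed with the majority.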
[Jan 24, 2025 5:15:57 PM]
cliviafreak
Cruncher
Joined: Jan 13, 2025
Post Count: 16
Status: Recently Active
Re: invalid

With floating-point computation different hardware always returns a slightly different result. It may be that your hardware is reporting just outside a specified range of tolerance (like x plus or minus 0.001% or something).

I work with scientific computing software, and when validating results we always have to do so within a tolerance. We are just guessing at that tolerance until new hardware comes along and goes outside it. At that point, our engineers have to go and recheck everything by hand (or use intuition) just to be certain that we didn't make a mistake the first time and that the new hardware isn't crazy.

Of course, that only applies if the project is using floating point and not exact math (which is possible, but exact math is quite slow).

Basically, to "match" is a fuzzy concept.
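A rough sketch of what matching within a tolerance can look like, using Python's standard `math.isclose`. The numbers, the `results_match` helper, and the 0.001% relative tolerance (`rel_tol=1e-5`) are invented for illustration; I don't know what comparison or threshold the ARP validator actually uses.

```python
import math

def results_match(a, b, rel_tol=1e-5):
    """True when two result vectors agree element-wise within rel_tol.

    rel_tol=1e-5 corresponds to the 0.001% figure used as an example above;
    it is an assumption, not a known project setting.
    """
    if len(a) != len(b):
        return False
    return all(math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(a, b))

reference = [101.325, 288.15, 0.0123]
close_enough = [101.3251, 288.1501, 0.01230001]  # tiny differences everywhere
too_far = [101.325, 288.15, 0.0125]              # last value is off by over 1%

print(results_match(reference, close_enough))
print(results_match(reference, too_far))
```

With a check like this, a result that sits near the edge of the tolerance could pass against one partner and fail against another.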
----------------------------------------
[Edit 2 times, last edit by cliviafreak at Jan 24, 2025 5:38:52 PM]
[Jan 24, 2025 5:36:54 PM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 859
Status: Offline
Re: invalid

I think cliviafreak is right, because sometimes I get a valid and sometimes I don't. It depends on who I'm matched with. I hoped that by posting this, the techs would become aware of the issue and let me know whether I should stop doing ARP or whether it is OK.
[Jan 24, 2025 9:26:06 PM]
William Albert
Cruncher
Joined: Apr 5, 2020
Post Count: 36
Status: Offline
Re: invalid

With floating-point computation different hardware always returns a slightly different result.


All processors that ARP would run on have floating-point hardware that complies with IEEE 754.

I would not expect different results to computations -- even floating-point -- unless the computation depends on a random-number generator at some point.
[Jan 24, 2025 9:38:49 PM]
cliviafreak
Cruncher
Joined: Jan 13, 2025
Post Count: 16
Status: Recently Active
Re: invalid

I admit, yes, if the hardware fully complies and strict mode is used at all levels from compiler to hardware, then the results shouldn't change. But turning on that strictness severely limits the performance of software.

The reality, though, is that strictness is almost never fully adhered to and different hardware tries its best to comply. It never fully does. Further, compiler writers make mistakes. There are too many variables on the way from code->machine->execution to ensure floating-point computation is the same across all hardware and compilers.

And people want the performance! Even scientists running important simulations. If it's so important that a few decimal places matter, they use arbitrary precision numbers.

I'm not dismissing the possibility there is a fault in Unixchick's hardware, just that it is not the most likely explanation.
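A small illustration of why evaluation order matters even when every individual operation is IEEE 754 compliant: floating-point addition is not associative, so a compiler or math library that regroups a sum can change the last bits of the answer. This is only a toy in Python, not a claim about what the ARP application or its compiler actually does; the exact-arithmetic half uses the standard `fractions.Fraction` type as a stand-in for "exact math".

```python
import math
from fractions import Fraction

# Same three numbers, two groupings, plain 64-bit floats.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right == right_to_left)   # False: the last bit differs
print(math.isclose(left_to_right, right_to_left, rel_tol=1e-9))  # True

# With exact rational arithmetic the grouping no longer matters,
# at the cost of much slower computation on real workloads.
a, b, c = Fraction(1, 10), Fraction(2, 10), Fraction(3, 10)
exact_left = (a + b) + c
exact_right = a + (b + c)
print(exact_left == exact_right)        # True
```

A bit-for-bit comparison would call the two float results a mismatch, while a tolerance-based comparison would accept them.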
----------------------------------------
[Edit 2 times, last edit by cliviafreak at Jan 24, 2025 10:08:17 PM]
[Jan 24, 2025 10:04:18 PM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 859
Status: Offline
Re: invalid

My hardware is a brand new Mac mini.
[Jan 24, 2025 11:46:03 PM]