Thread Status: Active
Total posts in this thread: 17
This topic has been viewed 4115 times and has 16 replies
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 2173
Status: Offline
Re: WU TOO LATE when the only one returned without error?

And now that WU dropped off the list, without being accounted for in any way, besides being a waste of resources... sad
Ralf
----------------------------------------
[Edit 1 times, last edit by TPCBF at Sep 15, 2018 1:14:56 AM]
[Sep 15, 2018 1:14:36 AM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2346
Status: Offline
Re: WU TOO LATE when the only one returned without error?

And now that WU dropped off the list, without being accounted for in any way,

Ralf,
I would suggest that you contact WCG about this. I'm pretty sure I'm not the only one who wants to know what happened there. It seems unfair to me if you don't get credited.
[Sep 15, 2018 1:28:56 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 2173
Status: Offline
Re: WU TOO LATE when the only one returned without error?

And now that WU dropped off the list, without being accounted for in any way,

Ralf,
I would suggest that you contact WCG about this. I'm pretty sure I'm not the only one who wants to know what happened there. It seems unfair to me if you don't get credited.

Well, it's still the weekend over here, and that includes the WCG techs; maybe they'll see it when Monday morning comes along.
It's only a single WU, but frustrating nevertheless...

Ralf
[Sep 16, 2018 6:53:15 PM]
hchc
Veteran Cruncher
USA
Joined: Aug 15, 2006
Post Count: 865
Status: Offline
Re: WU TOO LATE when the only one returned without error?

Apis said:
Couldn't this simply be solved with a _5 work unit to meet quorum?

Sure, and if that fails too? A _6? And a _7?

You've got to stop somewhere and try to sort out what's going wrong. It's a balancing act between wasting participants' resources on duff WUs, and wasting tech time looking at problems that aren't really problems. The way it is seems reasonable to me -- and I bet that the techs discuss if it should be changed every once in a while.

I get that and agree with it, but I think the logic is oversimplified. For example, in this case, we have a "Detached" result returned. I don't think "Detached" dispositions should count towards the overall limit, and neither should "User Aborted." Perhaps only bona fide "Error" dispositions should count.

So in the example being discussed in this thread, a _5 seems appropriate since the "Detached" disposition had nothing to do with the Work Unit itself and everything to do with somebody detaching from the project.
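
As a rough sketch of that idea (this is not WCG's or BOINC's actual code; the disposition labels are the ones used in this thread, while the function name, the sets, and the limit of 4 are made up for illustration), the two counting policies could be compared like this:

```python
# Hypothetical sketch: which dispositions count towards the failure limit.
# Labels come from this thread; the sets, names and limit are assumptions.
COUNTED_NOW = {"Error", "Detached", "User Aborted"}  # assumed current policy
COUNTED_PROPOSED = {"Error"}                         # only bona fide errors


def may_send_another_copy(dispositions, counted, max_failures):
    """Return True while the counted failures are still below the limit."""
    failures = sum(1 for d in dispositions if d in counted)
    return failures < max_failures


# Made-up history for one work unit, not the real one discussed here.
history = ["Error", "Error", "Detached", "Error"]

print(may_send_another_copy(history, COUNTED_NOW, max_failures=4))
print(may_send_another_copy(history, COUNTED_PROPOSED, max_failures=4))
```

Under the first policy the "Detached" result uses up the last slot and the work unit is retired; under the proposed policy one more copy would still go out.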
----------------------------------------
  • i5-7500 (Kaby Lake, 4C/4T) @ 3.4 GHz
  • i5-4590 (Haswell, 4C/4T) @ 3.3 GHz
  • i5-3570 (Ivy Bridge, 4C/4T) @ 3.4 GHz

[Sep 17, 2018 3:51:30 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: WU TOO LATE when the only one returned without error?

I think the logic is oversimplified. For example, in this case, we have a "Detached" result returned. I don't think "Detached" dispositions should count towards the overall limit, and neither should "User Aborted." Perhaps only bona fide "Error" dispositions should count.

I agree with you, but there may be an element of "How much logic do we want in the instruction path that's executed so often, and for how much real benefit?" But, either way, it's not my shout.
[Sep 17, 2018 10:04:18 AM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: WU TOO LATE when the only one returned without error?

The definition of TOO_LATE is that a result was returned to the server but due to the state of the workunit there was nothing the server could do with the result.

In this case, the result had reached its threshold for errors and the server had marked the job as failed. When this result was returned, there was nothing for the server to compare it to and so it was marked as TOO_LATE.
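
To make the sequence of events concrete, here is a minimal sketch of the behavior described above. It is an illustration under stated assumptions, not the BOINC server code: the class, its fields, and the "PENDING_VALIDATION" label are hypothetical, and only the TOO_LATE outcome is taken from this post.

```python
from dataclasses import dataclass, field


@dataclass
class WorkUnit:
    """Toy model of a workunit with an error threshold (names are hypothetical)."""
    max_errors: int                  # assumed threshold for errored results
    error_count: int = 0
    failed: bool = False
    pending: list = field(default_factory=list)  # results awaiting comparison

    def record_error(self):
        """Count one errored result; mark the job failed at the threshold."""
        self.error_count += 1
        if self.error_count >= self.max_errors:
            self.failed = True

    def receive_result(self, result_id):
        """Classify a successfully computed result when it is returned."""
        if self.failed:
            # The job is already closed, so there is nothing to compare against.
            return "TOO_LATE"
        self.pending.append(result_id)
        return "PENDING_VALIDATION"


wu = WorkUnit(max_errors=2)
wu.record_error()
first = wu.receive_result("result_a")   # job still open
wu.record_error()                       # threshold reached, job marked failed
second = wu.receive_result("result_b")  # returned after the job was closed
```

In this toy run the first result is queued for comparison, while the second arrives after the threshold was reached and is classified as TOO_LATE even though it computed correctly.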

The default behavior of the BOINC software is that only results that can be checked earn credit. It was implemented this way to prevent cheating (for example, you can easily cause all of your results to be TOO_LATE by simply holding onto them until about 24 hours after the workunit they belong to has been validated). In such a situation, it would be easy for a person to earn credit for work they didn't do.

This unfortunately means that sometimes someone is part of a "hard luck" workunit like yours, where multiple results failed for various reasons before yours was returned successfully. However, these situations are quite uncommon, so the vast majority of the time the logic works to create the correct incentives.

As others have noted, the threshold for errors is set to balance between rare situations like this where a likely good workunit encounters a variety of unusual errors vs allowing a workunit to continue to send out new copies when the workunit has a serious issue.
[Sep 17, 2018 1:27:58 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2346
Status: Offline
Re: WU TOO LATE when the only one returned without error?

The definition of TOO_LATE is that a result was returned to the server but due to the state of the workunit there was nothing the server could do with the result.

In this case, the result had reached its threshold for errors and the server had marked the job as failed. When this result was returned, there was nothing for the server to compare it to and so it was marked as TOO_LATE.

The default behavior of the BOINC software is that only results that can be checked earn credit. It was implemented this way to prevent cheating (for example, you can easily cause all of your results to be TOO_LATE by simply holding onto them until about 24 hours after the workunit they belong to has been validated).

Right, if you hold onto WUs for, let's say, over 24 hours, then they will be marked TOO_LATE. Fair enough.

In such a situation, it would be easy for a person to earn credit for work they didn't do.

Understood.
But ... let's return to the case where we are not cheating, because this wasn't about WUs that are returned 24 hours after the deadline, and more importantly, the BOINC server knows which WUs have been sent to which machine IDs, doesn't it?

I'm still missing an explanation of what happens to a result that completes successfully within 24 hours but still runs into a TOO_LATE situation (the desired quorum isn't met, so the BOINC server can't compare the result to any other).

This unfortunately means that sometimes someone is part of a "hard luck" workunit like yours where multiple results failed for various reasons before yours was returned successfully.

So, within the 24-hour deadline, the BOINC server still knows which WUs have been sent to which machine IDs. I would say that results that were returned successfully, but got marked TOO_LATE within 24 hours, deserve credit in the end, provided these WUs are sent to a take-out list and re-computed, and the machine IDs that were involved are remembered.

I mean, the WUs that are marked TOO_LATE within 24 hours are generally sent to a take-out list to get re-computed afterwards, aren't they? So the BOINC server knows to which machine-ID they belong. If you store this information and re-compute the WUs, then you can give credit where it is due after the re-computation from the take-out list is done.
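
What I have in mind could be sketched roughly as follows. This is purely a hypothetical illustration of the proposal, not something I know BOINC or WCG to do; the function name, the host IDs, the outputs, and the credit value are all invented.

```python
def settle_deferred_credit(too_late_results, recomputed_output, credit):
    """Grant credit to hosts whose TOO_LATE output matches the re-computed one.

    too_late_results maps a remembered machine ID to the output it returned.
    Only matching outputs earn credit, so unverified results still earn nothing.
    """
    return {
        host_id: credit
        for host_id, output in too_late_results.items()
        if output == recomputed_output
    }


# Invented example data: two remembered machines, one of which matches.
remembered = {"host-A": "abc123", "host-B": "zzz999"}
granted = settle_deferred_credit(remembered, recomputed_output="abc123", credit=10.0)
```

Because credit is only granted after the re-computed output has been compared, this would not reopen the cheating loophole: a result that cannot be confirmed still gets nothing.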

I really would like to know what happens there.
[Sep 19, 2018 12:44:11 PM]