| World Community Grid Forums
Thread Status: Active Total posts in this thread: 3596
|
|
|
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12594 Status: Offline Project Badges:
|
33793_102 has received the required 2 results so you are not waiting on a third.
41 hours is not surprising when the timestep has been reduced. 36 secs is the normal timestep. 24 secs could take up to 50% longer and 18 secs could be up to double the normal time.

I suspect that adri's explanation may not cover the situation where 3 copies are sent out with a quorum of 2. It may be that a third copy would time out on the deadline if the other two have been returned within the deadline.

When 2 copies are sent out with a quorum of 2, which is the standard, if one copy is returned before the deadline and the other not, then a third copy is sent out. The second copy would then have until the third, or later, copy is returned. That might be before the 24 hours or might be later, depending on how fast the machine is. I am not sure whether a delayed validation might help on the deadline.

If the second copy is returned before the third copy reaches its first checkpoint, then that third copy would be aborted by the server; otherwise it would be allowed up to its deadline to finish and count. The credits, if any, for the third to finish would be based on the first two results returned.

Mike
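A minimal sketch of the timestep arithmetic in the post above, assuming run time grows in proportion to the number of steps (so halving the timestep at most doubles the run time). The function name `max_slowdown` is made up for illustration; only the 36, 24 and 18 second figures come from the post.

```python
NORMAL_TIMESTEP_S = 36  # normal timestep quoted in the post, in seconds


def max_slowdown(timestep_s: float, normal_s: float = NORMAL_TIMESTEP_S) -> float:
    """Upper-bound run-time factor, assuming run time scales with step count."""
    if timestep_s <= 0:
        raise ValueError("timestep must be positive")
    return normal_s / timestep_s


for ts in (36, 24, 18):
    print(f"{ts} s timestep: up to {max_slowdown(ts):.2f}x normal run time")
```

Under that assumption, 24 secs gives a factor of 1.5 (up to 50% longer) and 18 secs gives a factor of 2 (up to double), which matches the figures quoted.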
||
|
|
Hans Sveen
Veteran Cruncher Norge Joined: Feb 18, 2008 Post Count: 1018 Status: Offline Project Badges:
|
Hi!
What about this one: ARP1_0033870_119? The _4 is mine.

Project name: Africa Rainfall Project
Created: May. 7, 2023 - 0:04 UTC
Name: ARP1_0033870_119
Minimum Quorum: 2
Replication: 3

| Result name | OS type | OS version | Status | Sent time | Time due / Return time | Cpu time / Elapsed time | Claimed credit / Granted credit |
|---|---|---|---|---|---|---|---|
| ARP1_0033870_119_0 | Microsoft Windows 10 | Core x64 Edition, (10.00.19045.00) | No Reply | 2023-05-07 0:15:33 UTC | 2023-05-08 12:15:33 UTC | | |
| ARP1_0033870_119_1 | Microsoft Windows 10 | Core x64 Edition, (10.00.19045.00) | Pending Validation | 2023-05-07 0:15:35 UTC | 2023-05-08 10:35:53 UTC | 13.05 / 13.92 | 550.7 / 0 |
| ARP1_0033870_119_2 | Microsoft Windows Server 2016 | Datacenter x64 Edition, (10.00.14393.00) | Pending Validation | 2023-05-07 0:15:57 UTC | 2023-05-08 23:17:26 UTC | 19.08 / 19.08 | 656.1 / 0 |
| ARP1_0033870_119_3 | Microsoft Windows 11 | Education x64 Edition, (10.00.22000.00) | Pending Validation | 2023-05-08 12:15:40 UTC | 2023-05-09 08:04:52 UTC | 13.86 / 13.86 | 119.2 / 0 |
| ARP1_0033870_119_4 | Microsoft Windows 10 | Professional x64 Edition, (10.00.19045.00) | Pending Validation | 2023-05-08 12:16:05 UTC | 2023-05-09 03:11:36 UTC | 14.03 / 14.32 | 524.2 / 0 |

Thanks for any explanation!

Hans S.
||
|
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12594 Status: Offline Project Badges:
|
Hans
My post related to what IBM used to do. Krembil don't seem to have read that bit. Copies 3 & 4 were not needed as the quorum had already been returned before they were sent out. Mike |
||
|
|
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2355 Status: Recently Active Project Badges:
|
Hans:
"Result name OS type Status Sent time Due / Return time CPUtime/Elapsed Claimed/Granted"
"Thanks for any explanation!"

Let me guess: you are wondering why they don't get validated. Probable cause: it seems that validation for ARP1 is

UPDATE! One ARP1-task just got validated.

Adri

PS - If you've got some spare time, you may want to spend a little of your time on a quick quiz here.
PPS - Thanks for your addition of that Extreme workunit!

[Edit 3 times, last edit by adriverhoef at May 9, 2023 11:07:43 PM]
||
|
|
Hans Sveen
Veteran Cruncher Norge Joined: Feb 18, 2008 Post Count: 1018 Status: Offline Project Badges:
|
Thank you Mike and Adri for your answers ☺

Sorry for not being more precise in my question!

Hans

PS: I've just read and learned even more in the thread https://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,45300 !!
PPS: @Mike You picked a nice example to explain the problem(s) 😁

[Edit 2 times, last edit by Hans Sveen at May 10, 2023 8:18:29 AM]
||
|
|
Feenix3k
Cruncher Joined: Feb 8, 2006 Post Count: 2 Status: Offline Project Badges:
|
I have not gotten any work in the past few days. Is there a problem?
|
||
|
|
siu77
Cruncher Russia Joined: Mar 12, 2012 Post Count: 23 Status: Offline Project Badges:
|
Check this out! A lot of wingmen.

| Result name | OS type | OS version | Status | Sent time | Time due / Return time | Cpu time / Elapsed time | Claimed credit / Granted credit |
|---|---|---|---|---|---|---|---|
| ARP1_0034587_019_0 | Microsoft Windows 10 | Professional x64 Edition, (10.00.19045.00) | Pending Validation | 2023-05-06 15:26:42 UTC | 2023-05-09 02:58:16 UTC | 12.83 / 12.83 | 348.6 / 0 |
| ARP1_0034587_019_1 | Microsoft Windows 10 | Professional x64 Edition, (10.00.17763.00) | Pending Validation | 2023-05-06 15:26:36 UTC | 2023-05-08 23:47:51 UTC | 30.73 / 33.71 | 737 / 0 |
| ARP1_0034587_019_2 | Microsoft Windows 10 | Enterprise x64 Edition, (10.00.19042.00) | No Reply | 2023-05-06 15:27:39 UTC | 2023-05-08 03:27:39 UTC | | |
| ARP1_0034587_019_3 | Microsoft Windows 11 | Education x64 Edition, (10.00.22000.00) | Pending Validation | 2023-05-08 03:26:59 UTC | 2023-05-09 07:42:08 UTC | 11.78 / 11.82 | 512.4 / 0 |
| ARP1_0034587_019_4 | Microsoft Windows 8.1 | Professional x64 Edition, (06.03.9600.00) | Pending Validation | 2023-05-08 03:27:17 UTC | 2023-05-09 03:36:45 UTC | 17.97 / 19.07 | 598.6 / 0 |
| ARP1_0034587_019_5 | Microsoft Windows 10 | Professional x64 Edition, (10.00.19045.00) | Pending Validation | 2023-05-08 03:27:49 UTC | 2023-05-09 04:42:55 UTC | 14.51 / 15.06 | 842 / 0 |

https://www.worldcommunitygrid.org/contribution/workunit/301547113

Apparently, the server that sends tasks to users does not suspect that the tasks are already being checked.
||
|
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 1312 Status: Offline Project Badges:
|
Feenix3k - ARP comes in feast or famine. There haven't been many ARP WUs going out for a couple of days. I'm guessing we will hit a big blob of resends soon.
|
||
|
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1332 Status: Recently Active Project Badges:
|
siu77
"Apparently, the server that sends tasks to users does not suspect that the tasks are already being checked."

Not so! :-)

This work unit is for a grid cell that is so far behind in the generation race that it is initially sent out to three clients instead of two (and it is supposed to only send them to systems that respond quickly!), and the tasks are given a shorter deadline (1.5 days)[1]; under normal circumstances two out of three will probably return within that deadline but sometimes that doesn't happen, as here...

Unfortunately, the simple work-unit view doesn't show the due time once the result is returned. However, in this case, all three of the initial tasks failed to make deadline (problems with downloading/uploading may have been a factor here); one missed by 20 hours, another by nearly 24 hours and the last one by well over a day!

The central BOINC system doesn't know that a result has been returned until it has been reported by the client; if it sees tasks that have passed deadline without reporting in it flags them as No Reply and sends a retry if the quorum hasn't yet been met. That would have happened for all three of these tasks :-)

This topic comes up occasionally but it can be quite hard to find in such a large forum thread - I hope this helps clear it up in this case!

Cheers - Al.

[1] ARP1 work units are divided into three categories, Extreme, Accelerated and Normal, based on how far behind the most advanced generation they are -- this classification is used to decide how many initial tasks to send out and what initial deadline to assign. At present:
- Extreme units get 3 initial tasks and a deadline of 1.5 days
- Accelerated units get 2 initial tasks and a deadline of 3 days
- Normal units get 2 tasks and a deadline of 6 days

[Edited to add the footnote...]

[Edit 2 times, last edit by alanb1951 at May 10, 2023 8:08:38 PM]
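The deadline arithmetic in Al's explanation can be checked with a short sketch. It assumes the deadline is counted from each task's sent time (which agrees with the due time still shown for the No Reply task in siu77's listing) and takes the category figures from the footnote. The `CATEGORIES` table and the `lateness` helper are hypothetical names for illustration, not anything taken from the WCG or BOINC servers.

```python
from datetime import datetime, timedelta

# Initial task count and deadline per category, as listed in the footnote.
CATEGORIES = {
    "Extreme": {"initial_tasks": 3, "deadline": timedelta(days=1.5)},
    "Accelerated": {"initial_tasks": 2, "deadline": timedelta(days=3)},
    "Normal": {"initial_tasks": 2, "deadline": timedelta(days=6)},
}

FMT = "%Y-%m-%d %H:%M:%S"


def lateness(sent: str, returned: str, category: str = "Extreme") -> timedelta:
    """How long after its deadline a result came back (negative if on time).

    Assumption: the deadline runs from the sent time.
    """
    due = datetime.strptime(sent, FMT) + CATEGORIES[category]["deadline"]
    return datetime.strptime(returned, FMT) - due


# Sent and return times (UTC) copied from siu77's listing for ARP1_0034587_019.
late_0 = lateness("2023-05-06 15:26:42", "2023-05-09 02:58:16")
late_1 = lateness("2023-05-06 15:26:36", "2023-05-08 23:47:51")
print("_0 late by", late_0)
print("_1 late by", late_1)
```

With those inputs, task _0 comes out about 23.5 hours past its deadline and task _1 about 20.4 hours past, in line with the "nearly 24 hours" and "20 hours" mentioned above.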
||
|
|
siu77
Cruncher Russia Joined: Mar 12, 2012 Post Count: 23 Status: Offline Project Badges:
|
Thanks for the clarifications, Al. Now it's clear. Mostly.

Except for the fact that validation takes so long. Maybe because it takes a bit longer to compare 6 units than 2 units.
||
|
|
|