World Community Grid Forums
Thread Status: Active Total posts in this thread: 18
Seoulpowergrid
Veteran Cruncher Joined: Apr 12, 2013 Post Count: 817 Status: Offline |
I'm finding that multiple UGM WUs are being sent to 3 computers within 45 minutes, which ends in a race for whoever finishes first. A quick search for "aborted" shows me three WUs that I never started, since two other people were able to finish them first:
ugm1_ugm1_20518_0033
ugm1_ugm1_20527_0360
ugm1_ugm1_20298_1778
It looks like I didn't waste any CPU time, but bandwidth at 5 megs per WU will have people complaining, and the servers at WCG are wasting cycles if WUs get sent to three people right off the bat. Anyone else finding this?
Edit: 2 Linux boxes, 1 Windows box. I checked a bunch of my valids and am not seeing those WUs being sent to more than 2 people unless someone detached, errored, or didn't reply.
----------------------------------------
[Edit 1 times, last edit by Seoulpowergrid at Dec 3, 2015 1:48:47 PM] |
||
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
Post copies of the
----------------------------------------[Edit 1 times, last edit by SekeRob* at Dec 3, 2015 3:27:32 PM] |
||
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7693 Status: Offline |
I posted on this earlier in another thread: https://secure.worldcommunitygrid.org/forums/...ead,38555_offset,0#505428. Never really heard back on the issue. I have seen it more than once.
Cheers
Sgt. Joe
----------------------------------------
*Minnesota Crunchers* [Edit 1 times, last edit by Sgt.Joe at Dec 3, 2015 3:36:14 PM] |
||
|
Seoulpowergrid
Veteran Cruncher Joined: Apr 12, 2013 Post Count: 817 Status: Offline |
@Rob
So for one of the instances I was the first to get the WU (WU ends in 0), and for another I was the 3rd to get the WU. The last instance I can't check, as WCG deleted it.
@Sgt. Joe
I was pretty sure I saw this before but never checked again, as it looked like a one-off. Today, as I saw three more aborted WUs, I decided to post this. |
||
|
PMH_UK
Veteran Cruncher UK Joined: Apr 26, 2007 Post Count: 772 Status: Offline |
Found these two where I am _2:

| Result Name | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time / Elapsed Time (hours) | Claimed / Granted BOINC Credit |
|---|---|---|---|---|---|---|
| ugm1_ugm1_20546_0679_2-- | - | In Progress | 03/12/15 09:24:17 | 10/12/15 09:24:17 | 0.00 | 0.0 / 0.0 |
| ugm1_ugm1_20546_0679_1-- | 728 | Pending Validation | 03/12/15 09:18:39 | 03/12/15 11:39:00 | 2.33 | 71.9 / 0.0 |
| ugm1_ugm1_20546_0679_0-- | - | In Progress | 03/12/15 09:18:17 | 10/12/15 09:18:17 | 0.00 | 0.0 / 0.0 |
| ugm1_ugm1_20550_2115_2-- | - | In Progress | 03/12/15 12:26:59 | 10/12/15 12:26:59 | 0.00 | 0.0 / 0.0 |
| ugm1_ugm1_20550_2115_1-- | - | In Progress | 03/12/15 12:07:04 | 10/12/15 12:07:04 | 0.00 | 0.0 / 0.0 |
| ugm1_ugm1_20550_2115_0-- | 728 | Server Aborted | 03/12/15 12:06:40 | 03/12/15 12:43:11 | 0.00 | 0.0 / 0.0 |

Edit: added 2nd.
Paul.
----------------------------------------
[Edit 1 times, last edit by PMH_UK at Dec 3, 2015 5:10:19 PM] |
||
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
It's a pity they were not caught at the point of distribution, so we could see the original deadlines. Those results are rather current, so it's not a case of troppo tardi (too late) with techs in a rush to complete a quorum and finish off an old lingering batch. "So happens" [quoting Bill], I have two 8-core machines running a combined 13 UGM and 3 CEP2, which they've been doing for over a month now. I'll take out the flea comb and see if any fresh ones show this, i.e. ones where none have reported yet. Maybe something surfaces from this.
ttyl ... so 10 minutes later, after opening up 45 IPs, only one showed 1 returned result, and 1 had a 3rd copy because an original went legs up. In amongst the IPs were copies from batches 20546 and 20550. Just wondering [Bill's dartgun] if there are joysticks attached to the servers and shoot-'em-up [read: wind-'em-up] games are being played; kidding of course. There's no raison d'être for this to happen. Maybe one or the other host [wingman] is spitting intermittent errors soon after, and therefore some anticipatory action is taken [extra copy] to make sure the server lifetime of a batch does not run long. Summation: without the techs fessing up, there's nothing to understand, so let it rest [shut up and crunch ;o] (I admit, now that I have a solid Hunting Tool v2.0q (one with now 76000+ results having passed through and the other 12000), I hardly ever visit the RS pages... what I do not know cannot concern me ;P) |
||
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
Wellll, seeing the added post by PMH_UK, # _2 has a normal 10 day deadline... more puzzle.
|
||
|
keithhenry
Ace Cruncher Senile old farts of the world ....uh.....uh..... nevermind Joined: Nov 18, 2004 Post Count: 18665 Status: Offline |
Seeing this as well:

Workunit Status
Project Name: Uncovering Genome Mysteries
Created: 12/01/2015 00:39:02
Name: ugm1_ugm1_20524_1200
Minimum Quorum: 2
Replication: 2

| Result Name | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time / Elapsed Time (hours) | Claimed / Granted BOINC Credit |
|---|---|---|---|---|---|---|
| ugm1_ugm1_20524_1200_2-- | 728 | Valid | 12/2/15 14:02:57 | 12/3/15 08:46:10 | 2.90 | 72.6 / 64.5 |
| ugm1_ugm1_20524_1200_0-- | 728 | Valid | 12/2/15 13:58:53 | 12/2/15 16:00:41 | 2.00 | 56.4 / 64.5 |
| ugm1_ugm1_20524_1200_1-- | 728 | Server Aborted | 12/2/15 13:58:43 | 12/3/15 11:00:39 | 0.00 | 0.0 / 0.0 |

I'm the _1 that got the WU before the _0. The _2 went out about FOUR minutes later. Either we have an anomaly in the spacetime continuum or BOINC is psychic! |
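The timing in the table above can be checked with a short Python sketch. The row data is copied from that table; the helper names `parse_sent`, `send_gaps`, and `extra_copies` are hypothetical, and the date format is assumed to be month/day/year as it appears in this post. It measures how long after the first copy each later copy went out, and counts copies beyond the replication target that no error accounts for.

```python
from datetime import datetime

# Values copied from the workunit status table above.
REPLICATION = 2  # "Replication: 2" in the workunit header
rows = [
    # (result suffix, status, sent time)
    ("_2", "Valid", "12/2/15 14:02:57"),
    ("_0", "Valid", "12/2/15 13:58:53"),
    ("_1", "Server Aborted", "12/2/15 13:58:43"),
]

def parse_sent(text):
    # Assumption: month/day/two-digit-year ordering, as in this particular table.
    return datetime.strptime(text, "%m/%d/%y %H:%M:%S")

def send_gaps(rows):
    """Return (suffix, seconds after the first send), ordered by send time."""
    sent = sorted((parse_sent(t), suffix) for suffix, _status, t in rows)
    first = sent[0][0]
    return [(suffix, int((when - first).total_seconds())) for when, suffix in sent]

def extra_copies(rows, replication):
    """Count copies beyond the replication target that no 'Error' result explains."""
    errors = sum(1 for _suffix, status, _t in rows if status == "Error")
    return max(0, len(rows) - replication - errors)

gaps = send_gaps(rows)
print(gaps)
print(extra_copies(rows, REPLICATION))
```

For these rows the gaps come out to 0, 10, and 254 seconds, so the _2 copy left a little over four minutes after the first, with one copy more than the replication target and no error to explain it.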
||
|
PMH_UK
Veteran Cruncher UK Joined: Apr 26, 2007 Post Count: 772 Status: Offline |
More oddity:
Re-sent as _2 after error on _1 but also to me as _3 with _0 in progress.

| Result Name | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time / Elapsed Time (hours) | Claimed / Granted BOINC Credit |
|---|---|---|---|---|---|---|
| ugm1_ugm1_20660_1733_3-- | - | In Progress | 07/12/15 03:24:04 | 14/12/15 03:24:04 | 0.00 | 0.0 / 0.0 |
| ugm1_ugm1_20660_1733_2-- | 728 | Pending Validation | 07/12/15 03:21:25 | 07/12/15 06:38:04 | 3.25 | 102.6 / 0.0 |
| ugm1_ugm1_20660_1733_1-- | 728 | Error | 07/12/15 03:19:04 | 07/12/15 03:21:07 | 0.00 | 84.8 / 0.0 |
| ugm1_ugm1_20660_1733_0-- | - | In Progress | 07/12/15 03:18:40 | 14/12/15 03:18:40 | 0.00 | 0.0 / 0.0 |

Paul.
|
||
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
Don't know why any of the
Is 7 days the standard deadline these days? If so, those extras also go out with seven [a film I will never ever watch again]. Spooky. |
||