| World Community Grid Forums
Thread Status: Active Total posts in this thread: 24
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I have a number of WUs sent out late 15th/early 16th June which are sitting in PV land, with the wingman's WU waiting to be sent. Are these because of changes along the way in the machine groupings? And what will happen to them?
e.g.
CMD2_ 0009-CUL1A.clustersOccur-MYH6.clustersOccur_ 105_ 1-- 614 Pending Validation 16/06/09 03:14:00 16/06/09 10:41:23 4.04 20.8 / 0.0
CMD2_ 0009-CUL1A.clustersOccur-MYH6.clustersOccur_ 105_ 0-- - Waiting to be sent — — 0.00 0.0 / 0.0
[edited for more descriptive title] [Edit 1 times, last edit by Former Member at Jun 22, 2009 1:27:14 PM] |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Now there's a title that declares the load, for had I not opened it in sheer boredom, would anyone have thought it was the 999th case of a PV jail report.
Don't know, but venture to think that similar to validation being pushed for complete quorums, the routine starts pushing out copies to force the issue. The first round is when the original due date has been reached for your task. Since this is a parent, that would have been 14 days. Child results have a 10-day deadline. Let's see what the techs say. You're 8 hours ahead of me and I'm 7 hours ahead of them.
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
There's another big bunch of them 21st/22nd.
e.g.
CMD2_ 0014-CCNB1A.clustersOccur-MOESA.clustersOccur_ 12_ 1-- 614 Pending Validation 22/06/09 06:35:57 22/06/09 14:19:58 4.00 18.6 / 0.0
CMD2_ 0014-CCNB1A.clustersOccur-MOESA.clustersOccur_ 12_ 0-- - Waiting to be sent — — 0.00 0.0 / 0.0 |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"The first round is when the original due date has been reached for your task. Since this is a parent, that would have been 14 days."
Pretty much. The first one has been sent 13 days 8.5 hours after mine was, and with a 14 day deadline. |
||
|
|
Sid2
Senior Cruncher USA Joined: Jun 12, 2007 Post Count: 259 Status: Offline
|
"Now there's a title that declares the load, for had I not opened it in sheer boredom, would anyone have thought it was the 999th case of a PV jail report."
Could this be a slick and easy solution to the PV limbo? Award a few, say 5, points to WUs that are returned within 2 days. With 2 humble computers, I usually have 3 pages of PVs, sometimes 4. I believe this would keep the caches down and marshal the WUs faster. |
||
|
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline
|
This is very similar to the situation that occurs with Rice.
For HCMD2 we create an initial set of workunits for each batch based on the 'estimate' of how long they will take (I use the term estimate very lightly). For each successive generation of workunits, we increase the priority of those workunits. This helps complete batches quickly because higher priority results will be distributed before lower priority results. This is why you may periodically see a delay before the second result in a quorum gets distributed. This should rarely last longer than 6 hours and very rarely more than 12 hours.
Gen 1: 0 priority
Gen 2: 4 priority
Gen 3: 8 priority
---------------------
Gen 4: 12 priority
etc.
Once a result is set to a priority of 10 or higher, it will only be assigned to a 'reliable' computer. The way we have things set up, 4th generation and higher will be sent to reliable computers. This also helps to complete the batch quickly. Hope this answers some of the questions about what is going on. |
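The generation and priority rule in knreed's post can be written down as a rough sketch. This is not WCG or BOINC server code: the function names are hypothetical, and the step of 4 per generation is an assumption extrapolated from the four values listed (0, 4, 8, 12) and the trailing "etc". Only the threshold of 10 for 'reliable' computers comes directly from the post.

```python
# Illustrative sketch only; names are hypothetical, not taken from the WCG scheduler.
PRIORITY_STEP = 4        # assumed from the listed values: Gen 1 = 0, Gen 2 = 4, Gen 3 = 8, Gen 4 = 12
RELIABLE_THRESHOLD = 10  # from the post: priority of 10 or higher goes to 'reliable' computers only


def priority_for_generation(generation: int) -> int:
    """Return the priority assumed for a workunit generation (numbered from 1)."""
    if generation < 1:
        raise ValueError("generations are numbered from 1")
    return PRIORITY_STEP * (generation - 1)


def requires_reliable_host(generation: int) -> bool:
    """True when the generation's priority reaches the 'reliable' threshold."""
    return priority_for_generation(generation) >= RELIABLE_THRESHOLD


def dispatch_order(results):
    """Sort (name, generation) pairs so higher-priority results come first.

    sorted() is stable, so results of equal priority keep their original order.
    """
    return sorted(results, key=lambda r: priority_for_generation(r[1]), reverse=True)
```

Under these assumptions Gen 3 (priority 8) can still go to any computer, while Gen 4 (priority 12) is the first generation restricted to reliable ones, which matches the dashed line in the list. A queue that always hands out the highest priority first also shows why a low-priority Gen 1 result can wait while later generations are being distributed.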
||
|
|
Mysteron347
Senior Cruncher Australia Joined: Apr 28, 2007 Post Count: 179 Status: Offline
|
Seems like a sensible scheme, aimed at minimising the number of work units simultaneously in progress.
But unfortunately it doesn't appear to address the original question. The second task in the quorum still had not been sent on June 22nd (date of original post) - six full days after the first task in the quorum had been returned. That's not 6 hours or even 12 - it's 6 DAYS or 144 hours. Why this (ahem) gross delay? Shouldn't the second task for the work units have been sent out around the same time as the first? Certainly, the higher priority jobs should be sent out before the lower - but the quorum for any particular WU should be sent before any further WUs are commenced. Does this indicate that there is a problem that has caused the number of WUs in progress to be greater than necessary? Or, to put it into the generation/priority model, once a WU is commenced, shouldn't the tasks remaining in its quorum be raised to priority 2 so that they get despatched AFTER gen-1 but BEFORE gen-0? |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"This is why you may periodically see a delay before the second result in a quorum gets distributed. This should rarely last longer than 6 hours and very rarely more than 12 hours."
However, as Sekerob correctly guessed, these ones are being delayed for about 14 days. Given the 14 day deadline on the second WU, if that second machine fails to respond, it could easily be a full month before there is a result validated. |
||
|
|
JmBoullier
Former Community Advisor Normandy - France Joined: Jan 26, 2007 Post Count: 3716 Status: Offline
|
"Given the 14 day deadline on the second WU, if that second machine fails to respond, it could easily be a full month before there is a result validated."
If that second machine fails to respond it will not be after 14 days but after 5.6 days (40 %), and since this second machine is supposed to be a "fast returner" it is more likely that it will return its result within one day. Also note that parent units seem to be back to a 10-day deadline like the others now.
Cheers. Jean. |
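Jean's 5.6-day figure is 40% of the 14-day deadline. A minimal sketch of that arithmetic, assuming the shortened deadline is simply a fixed fraction of the normal one; the function name and the `fraction` parameter are illustrative and not taken from the WCG scheduler:

```python
from datetime import timedelta


def shortened_deadline(full_deadline_days: float, fraction: float = 0.4) -> timedelta:
    """Return the reduced deadline as a fraction of the normal deadline.

    The 0.4 default mirrors the "40 %" quoted in the post; how the server
    actually applies it is an assumption here.
    """
    return timedelta(days=full_deadline_days * fraction)


# 14-day parent deadline -> about 5.6 days; 10-day deadline -> 4 days
parent = shortened_deadline(14)
child = shortened_deadline(10)
```

With the 10-day deadline Jean mentions for current units, the same fraction would give 4 days.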
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"If that second machine fails to respond it will not be after 14 days but after 5.6 days (40 %)"
What part of "The first one has been sent 13 days 8.5 hours after mine was, and with a 14 day deadline" didn't you understand? These are wingmen sent out very late, not make-up jobs.
e.g.
CMD2_ 0009-1CEE_ B.clustersOccur-3C0G_ A.clustersOccur_ 2_ 1-- - In Progress 29/06/09 19:32:19 13/07/09 19:32:19 0.00 0.0 / 0.0
CMD2_ 0009-1CEE_ B.clustersOccur-3C0G_ A.clustersOccur_ 2_ 0-- 614 Pending Validation 15/06/09 23:56:19 16/06/09 07:18:41 4.00 36.8 / 0.0 |
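The gap between the two copies in the example can be checked from the sent times in the status rows. This is a small sketch that assumes the timestamps are in dd/mm/yy format (consistent with the June 2009 dates of the thread) and share one time zone; the helper name is hypothetical:

```python
from datetime import datetime, timedelta

# Assumed layout of the timestamps in the result rows: day/month/2-digit year, 24-hour time.
TIMESTAMP_FORMAT = "%d/%m/%y %H:%M:%S"


def dispatch_gap(first_sent: str, wingman_sent: str) -> timedelta:
    """Return how long after the first copy the wingman's copy was sent."""
    first = datetime.strptime(first_sent, TIMESTAMP_FORMAT)
    wingman = datetime.strptime(wingman_sent, TIMESTAMP_FORMAT)
    return wingman - first


# Sent times copied from the two rows of the example:
gap = dispatch_gap("15/06/09 23:56:19", "29/06/09 19:32:19")
print(gap)  # 13 days, 19:36:00
```

For this workunit the second copy went out almost 14 days after the first, and its due date (13/07/09 19:32:19) is a further 14 days on, so this is a full-deadline wingman and not a shortened make-up task.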
||
|
|
|