World Community Grid Forums
Category: Completed Research Forum: Help Cure Muscular Dystrophy - Phase 2 Forum Thread: Monster WU on the loose... |
Thread Status: Active Total posts in this thread: 98
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
My wu finally finished after 39 hours and was then marked invalid.
Project Name: Help Cure Muscular Dystrophy - Phase 2
Created: 26-5-09
Name: CMD2_0002-RADIA.clustersOccur-RADIA.clustersOccur_1250
Minimum Quorum: 2
Replication: 2
Result Name | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit
CMD2_0002-RADIA.clustersOccur-RADIA.clustersOccur_1250_2-- | 614 | Invalid | 10-6-09 18:48:07 | 12-6-09 16:00:44 | 39.21 | 592.6 / 34.7 <=
CMD2_0002-RADIA.clustersOccur-RADIA.clustersOccur_1250_1-- | 613 | Valid | 27-5-09 18:40:01 | 11-6-09 11:08:14 | 8.59 | 64.6 / 69.4
CMD2_0002-RADIA.clustersOccur-RADIA.clustersOccur_1250_0-- | 613 | Valid | 27-5-09 18:39:49 | 28-5-09 05:55:14 | 8.64 | 74.3 / 69.4
But it's the amount of credit granted that is really hurting, thank you WCG. Next time I encounter such a wu from h3ll, I will abort, that's for sure. |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Sorry about that Baron, it's likely because 6.13 and 6.14 are not compatible. What I think is awful is that you got the repair job and, while you were working on it, the "No Reply" still came in, 17 hours late. Had it not come in to form a quorum with the first, a 4th copy would have been sent, on 6.14, and yours would very probably have validated.
Of little solace: the full crunch time is credited to your project stats for 'invalid' jobs. Unfortunately BOINC has never been able to see science versions, so it cannot send repair jobs out with the same version as the originals. It always takes the newest. Me myself (the client, that is) had one of the rare mini monsters... just not 6 hours under the 60% rule.
[Edit 1 times, last edit by Sekerob at Jun 13, 2009 3:07:18 PM] |
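The quorum reasoning in the post above can be illustrated with a toy model. This is only a sketch under loud assumptions: it treats "same app version" as a stand-in for "results agree", which is how the thread describes the 6.13 / 6.14 incompatibility, and the `validate` function and its status strings are hypothetical names made up for illustration, not the actual BOINC or WCG validator.

```python
from collections import defaultdict

MIN_QUORUM = 2  # "Minimum Quorum: 2" from the work unit status page


def validate(results, min_quorum=MIN_QUORUM):
    """Toy quorum check (hypothetical, not the real validator).

    results: list of (result_name, app_version) for results that came back.
    Assumption: two results agree only if they used the same app version.
    """
    groups = defaultdict(list)
    for name, version in results:
        groups[version].append(name)

    # App versions with enough agreeing results to form a quorum,
    # in the order they were first seen.
    winners = [v for v, names in groups.items() if len(names) >= min_quorum]
    if not winners:
        return {name: "Inconclusive" for name, _ in results}

    canonical = winners[0]
    return {
        name: ("Valid" if version == canonical else "Invalid")
        for name, version in results
    }


# Baron's work unit: the late 6.13 result formed a quorum with the first one.
baron = validate([("_1250_0", 613), ("_1250_1", 613), ("_1250_2", 614)])
print(baron)
```

In this model, one 6.13 result plus one 6.14 result stays inconclusive until another copy returns, and a second 6.14 result would have validated the long-running one instead.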
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
My long repair unit just came in, and now the pending unit and mine are marked inconclusive and a second repair unit is waiting to be sent.
Project Name: Help Cure Muscular Dystrophy - Phase 2
Created: 5/27/09
Name: CMD2_0002-RADIA.clustersOccur-TPM1A.clustersOccur_2770
Minimum Quorum: 2
Replication: 3
Result Name | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit
CMD2_0002-RADIA.clustersOccur-TPM1A.clustersOccur_2770_2-- | 614 | Inconclusive | 6/11/09 17:36:48 | 6/13/09 15:53:10 | 36.27 | 691.7 / 0.0
CMD2_0002-RADIA.clustersOccur-TPM1A.clustersOccur_2770_0-- | - | No Reply | 5/28/09 17:25:59 | 6/11/09 17:25:59 | 0.00 | 0.0 / 0.0
CMD2_0002-RADIA.clustersOccur-TPM1A.clustersOccur_2770_1-- | 613 | Inconclusive | 5/28/09 17:25:55 | 5/31/09 12:34:15 | 3.65 | 55.6 / 0.0
CMD2_0002-RADIA.clustersOccur-TPM1A.clustersOccur_2770_3-- | - | Waiting to be sent | - | - | 0.00 | 0.0 / 0.0
Question: I think the first work unit came in as a partial unit, that there are now child units out there crunching, and that more than likely my work unit is the whole completed job. If the second repair unit comes in short like the first one, what happens to my work unit? Conversely, if the second repair unit comes in LONG like mine, what happens to the first one? It seems that the way the work units are set up, allowing partial completion and child work units, could have many crunchers spending time on units that may not give them any valid results, as happened to Baron and possibly to me, depending on the outcome of the second repair unit. Also, what happens if the No Reply comes in now? |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
I would have a hard time believing the child was already generated before validation, so assume they have not been. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Thanks Sek. And good analysis! It is exactly so. Ah well, all for science, one would say. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The last two monsters ended up as "CPU time exceeded".
They ran 350.11 hours each, so 700 hours (29 days!!!) gone. They were 6.13 WUs, however. The positive thing is: I don't have any monsters now, so let's crunch again! |
||
|
Van Fanel
Cruncher Joined: Dec 27, 2006 Post Count: 42 Status: Offline Project Badges: |
My last behemoth just died on me... LONG LIVE THE BEHEMOTHS!
Here go the specs:
Result Name | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit
CMD2_0001-1HCI_A.clustersOccur-1HCI_A.clustersOccur_113955_4-- | 613 | Error | 08/06/09 13:52:08 | 15/06/09 21:32:22 | 130.89 | 3,847.6 / 0.0
Meanwhile, this particular WU has already been validated by two fortunate users running app version 6.14. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I'm running work unit CMD2_0002-RADIA.clustersOccur-RADIA.clustersOccur_321_3 using hcmd2 version 614. It is a repair unit with a return date of 6/19. Two people errored on it, also using 6.14, with the message "No state to restore. Start from the beginning. called boinc_finish". There are now two of us working on it again. It's 19.6% done, but the completion time (10:03) is increasing at more or less the same rate as the CPU time (9:39). I'm running 6.2.28 on Windows XP. I'm willing to let it run if it's got any chance to complete (I've had very few errors, so I'm due some wasted cycles). Should I let it run? Joe |
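For a rough sense of scale, the figures in the post above can be extrapolated linearly. This is a back-of-the-envelope sketch, assuming the quoted times are hours:minutes and that progress is linear in CPU time, which may not hold for these work units since the client's own estimate keeps rising. The helper names are hypothetical.

```python
def parse_hhmm(text):
    """Convert an 'h:mm' string (assumed format) to fractional hours."""
    hours, minutes = text.split(":")
    return int(hours) + int(minutes) / 60.0


def estimate_total_hours(cpu_hours, fraction_done):
    """Linear extrapolation: total = elapsed CPU time / fraction completed."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return cpu_hours / fraction_done


cpu = parse_hhmm("9:39")                  # CPU time so far, from the post
total = estimate_total_hours(cpu, 0.196)  # 19.6% done, from the post
remaining = total - cpu

print(f"estimated total: {total:.1f} h, remaining: {remaining:.1f} h")
```

Under those assumptions the unit would need roughly 49 hours in total, about 40 more than already spent, which is well above the 10:03 the client reports.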
||
|
|