World Community Grid Forums
Thread Status: Active Total posts in this thread: 36
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Sek - I can report a high number of DDDT errors after completion on ALL my machines since yesterday, in both the 0850 and 0860 series, probably some 300 in total so far.
I see that no DDDT work is coming through as of now; work from other projects is being sent instead, so the Techs are obviously aware of the problem. I for one have many DDDT units waiting in the queue, so can I suggest the Techs send out an automatic abort script so folks don't waste time crunching these and lose credit? Maybe also a Tech announcement somewhere about this problem and how they want us to deal with WUs already downloaded, particularly regarding machines being 'penalised' by these errors and receiving less work? I am also wondering whether WCG might consider giving credit for these errored units so we don't lose out. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I believe the scientists have been playing around with the size of the work units recently. They tried some smaller work units, but these unbalanced the scheduler to the point where WCG had to dial DDDT down to almost nothing. So WCG asked the scientists to increase the size of the work units.
I think for AutoDock this would be done by packaging two or more docking attempts into one work unit. Changing the docking parameters will also change work unit size, but that's the cause of the problem, not the solution. The new docking parameters lead to small work units. Looks like there is a problem with the new packaged work unit. But this is mostly speculation based on the information in this thread - I haven't deconstructed a work unit to check. WCG usually give retrospective credit for bad batches. Remind them again if it doesn't appear. |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
The abort and resend may already be in hand: looking at Chris' post, the 2 new copies were sent within the minute and not immediately upon return (gaps of 8 and 17 hours respectively).
Yes, we know, the techs don't talk much and leave us to construe one thing or the other.
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Thanks D - appreciate the info.
Sek - I don't think auto abort is working at my end; all my machines are just crunching away through them in the queue. I would really appreciate some guidance on whether to just manually abort them all, or wait. |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Well, as always it's night out there, so I can only suggest you sample the Result Status page and crunch on with those that show a 'pending validation' copy already returned. I used BOINCview to suspend everything in the queue except the dddt jobs that had PVs on them, so I will know in an hour or two.
Of course, activate other projects and bump up the buffered days to get jobs from other projects. I like the all-project mix also because it allows for a quick switch, with work on hand, when something is developing. ttyl
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Forgot it's still night over there - even Techs need to sleep.
I will put them all in suspend and await any update... |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
This is very strange. The 864 one from my earlier post, which already had 1 Pending Validation on it, had both copies turn into errors.... odd. All the 865s returned and validated, with a few stuck in PV absent a quorum.
dddt0201o0864_ ZINC00229793-0000_ 00_ 1-- Error 12/11/2007 20:15:24 12/12/2007 12:15:08 1.11 16.9 / 0.0
dddt0201o0864_ ZINC00229793-0000_ 00_ 0-- Error 12/11/2007 20:13:36 12/12/2007 05:37:51 2.33 18.1 / 0.0
dddt0201o0864_ ZINC00229793-0000_ 00_ 2-- Waiting to be sent - - 0.00 0.0 / 0.0
dddt0201o0864_ ZINC00229793-0000_ 00_ 3-- Waiting to be sent - - 0.00 0.0 / 0.0
dddt0201o0865_ ZINC04363233-0000_ 00_ 1-- Valid 12/11/2007 22:06:41 12/12/2007 12:15:08 1.10 16.7 / 15.8
dddt0201o0865_ ZINC04363233-0000_ 00_ 0-- Valid 12/11/2007 22:01:58 12/12/2007 03:23:23 0.98 14.9 / 15.8
The other strangeness is that, same as yesterday, the midday stats have not started yet. This minute I can still get into the Results pages. ttyl.
PS: Now it's my turn.... "I want my credit"
PPS: Not sure, but I vaguely remember we had a similar case, and on revalidation the jobs passed the test.
WCG
Please help to make the Forums an enjoyable experience for All!
[Edit 2 times, last edit by Sekerob at Dec 12, 2007 12:29:12 PM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I had two error status results today, one of which is shown below. The other was similar; task id was dddt0201o0864_ ZINC02219598-0000_ 00_ 0-- .
dddt0201o0861_ ZINC07662347-0000_ 00_ 0-- Error 12/11/2007 18:27:21 12/12/2007 06:20:11 0.79 12.3 / 0.0
dddt0201o0861_ ZINC07662347-0000_ 00_ 1-- Error 12/11/2007 18:24:47 12/12/2007 02:42:44 0.82 12.7 / 0.0
dddt0201o0861_ ZINC07662347-0000_ 00_ 2-- Waiting to be sent - - 0.00 0.0 / 0.0
dddt0201o0861_ ZINC07662347-0000_ 00_ 3-- Waiting to be sent - - 0.00 0.0 / 0.0 |
||
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline |
We are looking at this now. I'll post something when I have some information for you.
|
||
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline |
It looks like the validator for Dengue Fever decided to mark a bunch of results as errors. We are looking into why that happened so that we can prevent it in the future. However, we will be able to re-validate those workunits, so everyone should be able to get their credit and have the work used correctly. I'm working on that now.
|
||