Thread Status: Active
Total posts in this thread: 234   Pages: 24
This topic has been viewed 24572 times and has 233 replies.
seippel
Former World Community Grid Tech
Joined: Apr 16, 2009
Post Count: 392
Status: Offline
Re: Have very slow WU # FAHV_x3VQ7_IN_LEDGFa_rig_0220728_0005

I've noticed that all the really long-runtime WUs have the letters "rig" in the file name. I just received another one of these work units and immediately hit the Abort button. My average of 113 pts/hr over 42 hours of runtime should put me in the ballpark of 4,700 points, not the measly 66.2 points awarded.


The ones with "rig" in the name are typically work units that use flexible docking, which makes them more complex (and so longer-running). Some non-"rig" work units did hit the problem too, but the code capped both types at a maximum of 140 Vina jobs even when the estimator failed. For the more complicated flexible-docking work units, that cap meant a much higher runtime than for non-flexible-docking work units.
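The cap described above can be sketched as follows. This is an illustrative reconstruction, not the actual WCG server code; the function and constant names are invented.

```python
from typing import Optional

MAX_VINA_JOBS = 140  # hard cap applied to every work unit

def is_flexible_docking(wu_name: str) -> bool:
    """Work units with 'rig' in the name use flexible docking."""
    return "_rig_" in wu_name

def jobs_for_workunit(estimated_jobs: Optional[int]) -> int:
    # When the runtime estimator fails (no estimate), the batching code
    # falls back to the cap itself. Flexible-docking jobs cost far more
    # per Vina job, so the same 140-job cap means a much longer runtime.
    if estimated_jobs is None:
        return MAX_VINA_JOBS
    return min(estimated_jobs, MAX_VINA_JOBS)

print(is_flexible_docking("FAHV_x3VQ7_IN_LEDGFa_rig_0220728_0005"))  # True
print(jobs_for_workunit(None))  # 140
```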

Also, while we cleared out any batches of very large work units that hadn't been sent out yet, there isn't a good way to do that for individual work units short of redoing the whole batch (which would also waste all the time already spent on successfully completed work units). We realize this has been a major inconvenience and we are sorry for that. Sometimes there isn't a perfect solution, so please continue to bear with us while we work through these remaining long-running work units.

Seippel
[Sep 28, 2014 4:57:26 AM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7662
Status: Offline
Re: Have very slow WU # FAHV_x3VQ7_IN_LEDGFa_rig_0220728_0005

Thanks for the further explanation. Hopefully we are nearing the end of these problematic units; their length alone severely messes with the scheduler. Thanks for trying to get the situation under control. Perhaps a beta test would have been in order for a change like this. (Hindsight is always 20-20.)
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Sep 28, 2014 6:35:36 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Have very slow WU # FAHV_x3VQ7_IN_LEDGFa_rig_0220728_0005

Sigh...

FAHV_x3ZSO_B_IN_Y3b_rig_0227625_0009_0 -- HOME04  Pending Validation  14/09/22 21:36:47  14/09/28 04:14:51  57.93 / 64.78  87.8 / 0.0
FAHV_x3ZSO_B_IN_Y3b_rig_0227573_0061_0 -- HOME04  Pending Validation  14/09/22 21:36:47  14/09/28 06:23:07  64.78 / 72.85  87.8 / 0.0

I still have a couple of x3ZSOs in my queue.
[Sep 28, 2014 6:45:04 AM]
totoshi
Cruncher
Germany
Joined: May 18, 2007
Post Count: 1
Status: Offline
Re: Have very slow WU # FAHV_x3VQ7_IN_LEDGFa_rig_0220728_0005

I have a long WU as well:

FAHV_x3ZSO_B_IN_Y3a_rig_0227425_0038_0 -- -  In Progress  22.09.14 01:50:52  02.10.14 01:50:52  0.00  0.0 / 0.0

My Android crunched for about 54 hrs with a result of ~25.7%. I will never finish it by Thursday. Should I crunch it to the end, or should I abort this long WU?
[Sep 28, 2014 7:07:40 AM]
Eurwin
Cruncher
Joined: Apr 28, 2007
Post Count: 17
Status: Offline
Re: Have very slow WU # FAHV_x3VQ7_IN_LEDGFa_rig_0220728_0005

Thanks seippel for the update.
So if I understand correctly, ALL resends sent out after the fix shouldn't error out with the -131 error? In that case I'll let them run. It would be nice, though, if the time limit were more than 4 days for these repair jobs...

Greetings
[Sep 28, 2014 7:27:38 AM]
seippel
Former World Community Grid Tech
Joined: Apr 16, 2009
Post Count: 392
Status: Offline
Re: Have very slow WU # FAHV_x3VQ7_IN_LEDGFa_rig_0220728_0005

Thanks seippel for the update.
So if I understand correctly, ALL resends sent out after the fix shouldn't error out with the -131 error? In that case I'll let them run. It would be nice, though, if the time limit were more than 4 days for these repair jobs...

Greetings


Yes, but only if the resend was sent out after Sept 21, 23:15 UTC. If it was sent out before that time, it could potentially still run into the problem (though not all work units hit error -131 even before the fix).
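For anyone checking their own results, the cutoff above is a plain UTC timestamp comparison. This is a hypothetical helper, assuming you read the result's "sent" time from the results status page:

```python
from datetime import datetime, timezone

# Resends issued after this moment carry the fix (per the post above).
FIX_DEPLOYED = datetime(2014, 9, 21, 23, 15, tzinfo=timezone.utc)

def resend_has_fix(sent_time_utc: datetime) -> bool:
    """True if a resend was issued after the server-side fix went live."""
    return sent_time_utc > FIX_DEPLOYED

# Example: an x3ZSO resend sent 2014-09-22 21:36:47 UTC is past the cutoff.
print(resend_has_fix(datetime(2014, 9, 22, 21, 36, 47, tzinfo=timezone.utc)))  # True
```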

Seippel
[Sep 28, 2014 3:47:04 PM]
KWSN - A Shrubbery
Master Cruncher
Joined: Jan 8, 2006
Post Count: 1585
Status: Offline
Re: Have very slow WU # FAHV_x3VQ7_IN_LEDGFa_rig_0220728_0005

Guess they're not a -131 error, but the resends are still bombing out.

Now it's "Maximum elapsed time exceeded" and they are not being given credit. Two of those so far on my account.
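For context: in BOINC, a task is aborted with "Maximum elapsed time exceeded" once its elapsed time passes the work unit's FLOP bound divided by the host's benchmarked speed, so an underestimated bound kills long-running tasks without credit. A rough sketch with made-up numbers:

```python
def max_elapsed_seconds(rsc_fpops_bound: float, effective_flops: float) -> float:
    """BOINC-style elapsed-time limit: FLOP bound over benchmarked speed."""
    return rsc_fpops_bound / effective_flops

# Made-up example: a 5.4e14 FLOP bound on a host benchmarked at 3 GFLOPS
# yields a 50-hour limit; anything slower is killed mid-run.
limit_hours = max_elapsed_seconds(5.4e14, 3e9) / 3600
print(round(limit_hours, 1))  # 50.0
```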
----------------------------------------

Distributed computing volunteer since September 27, 2000
[Sep 29, 2014 3:22:15 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Have very slow WU # FAHV_x3VQ7_IN_LEDGFa_rig_0220728_0005

It seems to me a little unfair to be sending out bundles of units with 3-4 hour estimates that are taking more than 50 hours each to complete. (Win 8.1, Core i7/4770). If I were a teensy bit more cynical, I might think you were gaming the system.
[Sep 29, 2014 12:16:24 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Have very slow WU # FAHV_x3VQ7_IN_LEDGFa_rig_0220728_0005

Problem after problem, 'made in WCG', in the backroom. To get more time, screw down the fpops/mips values in client_state.xml while the client is stopped, then set cc_config to skip the benchmark whenever the regular routine would invoke one. The points are completely potty anyhow, so at least you get to compute the long tasks to the end, which is far more important than returning duds after 133 hours. No amount of running scripts to give credit for badly ended results would satisfy me.
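The workaround described above amounts to something like the following cc_config.xml fragment. It is shown only to make the post concrete, not as a recommended practice; <skip_cpu_benchmarks> is a real BOINC client option, and client_state.xml should only ever be edited while the client is stopped.

```xml
<!-- cc_config.xml: stop the client from re-running CPU benchmarks, so
     hand-lowered p_fpops/p_iops values in client_state.xml survive and
     the derived elapsed-time limit becomes correspondingly longer. -->
<cc_config>
  <options>
    <skip_cpu_benchmarks>1</skip_cpu_benchmarks>
  </options>
</cc_config>
```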
[Sep 29, 2014 3:44:21 PM]
Teglen
Cruncher
Joined: Nov 26, 2011
Post Count: 7
Status: Offline
Re: Have very slow WU # FAHV_x3VQ7_IN_LEDGFa_rig_0220728_0005

I have a long WU as well:

FAHV_x3ZSO_B_IN_Y3a_rig_0227425_0038_0 -- -  In Progress  22.09.14 01:50:52  02.10.14 01:50:52  0.00  0.0 / 0.0

My Android crunched for about 54 hrs with a result of ~25.7%. I will never finish it by Thursday. Should I crunch it to the end, or should I abort this long WU?

I have the same problem. After crunching for 45 h, it looks like the WU will expire just five minutes before completion.
45 hours of work down the drain. I couldn't care less about the points I'm losing, but that World Community Grid will lose 45+ hours of work just because the expiration time is too short is driving me crazy.
[Sep 29, 2014 5:49:40 PM]