| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 140
|
|
| Author |
|
|
JmBoullier
Former Community Advisor Normandy - France Joined: Jan 26, 2007 Post Count: 3716 Status: Offline Project Badges:
|
> I've got one of the monsters as well. BETA_CMD2_0001-PP1BA.clustersOccur-TPM1A.clustersOccur_31 Currently been running for 12hrs and at 9.4% complete! This is running on a Core i7 Extreme (4GHz), hyperthreading disabled, 100% usage, 12 GB RAM.

Yes sprigo, I have referenced your post in the opening post of this thread. And it seems that your numbers are close to mine, or even slightly worse as far as the computed runtime is concerned (~130 hours). What I cannot see, because your previous numbers were probably rounded, is whether this computed total runtime is still increasing as time goes on, i.e. whether the computing is getting slower and slower. Thank you for reporting. Jean. |
||
|
|
JmBoullier
Former Community Advisor Normandy - France Joined: Jan 26, 2007 Post Count: 3716 Status: Offline Project Badges:
|
GIBA,
You are giving lots of information which confirms that your WU is running fine apart from its slowness. Unfortunately you don't say anything about runtime and % done at various stages, which is the only way to see whether your WU falls in the same category as the others mentioned in this thread. The only visibly abnormal aspect of these WUs is that they are running slower and slower, not just slowly, and thus that they should normally never reach 100%. For checking this point, BOINC's time to completion is totally useless, as I have explained in previous posts. Cheers. Jean. |
||
|
|
GIBA
Ace Cruncher Joined: Apr 25, 2005 Post Count: 5374 Status: Offline |
> GIBA, You are giving lots of information which confirms that your WU is running fine apart from its slowness. Unfortunately you don't say anything about runtime and % done at various stages, which is the only way to see whether your WU falls in the same category as the others mentioned in this thread. The only visibly abnormal aspect of these WUs is that they are running slower and slower, not just slowly, and thus that they should normally never reach 100%. For checking this point, BOINC's time to completion is totally useless, as I have explained in previous posts. Cheers. Jean.

Jean, it was my fault for not mentioning them, since I reported as soon as I saw the WU. Here are my notes:

| Run time | % done | Time to completion |
|---|---|---|
| 26:03 hours | 30.01% | 34:05 hours |
| 27:53 hours | 32.53% | 34:14 hours |

I will monitor and report how it evolves. Thanks.

Cheers! GIB@
Join the BRASIL - BRAZIL@GRID team and be very happy! http://www.worldcommunitygrid.org/team/viewTeamInfo.do?teamId=DF99KT5DN1 |
||
|
|
smeyer55
Senior Cruncher Joined: Feb 15, 2009 Post Count: 303 Status: Offline Project Badges:
|
I'm running BETA_CMD2_0001-PP1BA.clustersOccur-SMAD4A.clustersOccur_11 on an i7 and it is currently showing 13.872% done after 12:08:33.
It's estimating 16 hours to completion, which is unlikely at the current rate. It does seem to be speeding up somewhat, since this morning it had used 6 hours to get to 4%. Using linear math, it should take around 87 hours total to complete, which will be past the deadline. This looks like a slightly different series from the one listed in the thread topic, so there may be more than one "monster" series. On this same machine an HCC WU takes 4.5-5 hours to complete. Three other beta CMD2 WUs on this machine took 8.88, 9.54, and 10.46 hours to complete. Steve |
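For anyone who wants to reproduce the linear estimate above, here is a minimal Python sketch. The helper names `parse_hms` and `projected_total_hours` are made up for illustration, and the calculation assumes progress stays proportional to run time, which is exactly what the slowing-down WUs in this thread do not do.

```python
def parse_hms(text):
    """Convert 'H:MM:SS' (or 'H:MM') into a number of hours."""
    parts = [int(p) for p in text.split(":")]
    while len(parts) < 3:          # allow 'H:MM' by padding seconds with 0
        parts.append(0)
    hours, minutes, seconds = parts
    return hours + minutes / 60 + seconds / 3600


def projected_total_hours(elapsed_hours, percent_done):
    """Linear projection: total = elapsed / fraction done."""
    if percent_done <= 0:
        raise ValueError("percent_done must be positive")
    return elapsed_hours * 100.0 / percent_done


# Figures quoted in the post: 13.872% done after 12:08:33.
elapsed = parse_hms("12:08:33")
total = projected_total_hours(elapsed, 13.872)
print(f"projected total: {total:.1f} h, remaining: {total - elapsed:.1f} h")
```

With the figures from the post this comes out between 87 and 88 hours in total, in line with the "around 87 hours" quoted.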
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I think you got someone's attention.
I had 2 Betas waiting to crunch that were just reported as cancelled by the project and disappeared. I can't even confirm whether they were part of the monster batch, as I hadn't gotten that far yet. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
What should we do with such a large WU when the deadline for returning the result passes?
I mean, if I have to compute for, say, 100 hours and the deadline is 70, would I waste those 70 hours? |
||
|
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline Project Badges:
|
Thank you for reporting. I have extended the deadline for these workunits on the server side to 7 days. The client will not be notified, but you will be able to see it on the website.
We do need the data on how long these workunits are running in order to improve our estimation model - however the workunits will start to exit for hitting the max flops limit at around 67 hours for an average size computer. If you see a workunit that is projected to go beyond that limit, then you need to decide whether to abort or not. We have extended the max flops limit for the workunits, but again currently assigned workunits will not see it. The new limit is around 670 hours for an average size computer. Thus if you cancel the workunit, the next recipient will get longer time to compute it. Unfortunately - I cannot push a change in the limit down to you. |
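As a rough aid for the abort-or-not decision, the Python sketch below compares a projected total runtime with the two limits quoted in this post. It assumes the limit in hours scales inversely with host speed, which is only one reading of "for an average size computer". The `relative_speed` parameter, the function names, and the 100-hour projection are hypothetical, and none of this is the actual BOINC client logic.

```python
OLD_LIMIT_HOURS = 67.0    # limit seen by currently assigned workunits (from the post)
NEW_LIMIT_HOURS = 670.0   # extended limit for newly assigned workunits (from the post)


def limit_for_host(limit_hours_avg, relative_speed=1.0):
    """Hours before the limit is hit on a host that is `relative_speed`
    times as fast as an average computer (assumed inverse scaling)."""
    if relative_speed <= 0:
        raise ValueError("relative_speed must be positive")
    return limit_hours_avg / relative_speed


def will_hit_limit(projected_total_hours, limit_hours_avg, relative_speed=1.0):
    """True if the projected runtime exceeds the limit for this host."""
    return projected_total_hours > limit_for_host(limit_hours_avg, relative_speed)


# Hypothetical WU projected to need 100 hours on an average machine.
print(will_hit_limit(100.0, OLD_LIMIT_HOURS))   # would exit before finishing
print(will_hit_limit(100.0, NEW_LIMIT_HOURS))   # would fit under the new limit
```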
||
|
|
JmBoullier
Former Community Advisor Normandy - France Joined: Jan 26, 2007 Post Count: 3716 Status: Offline Project Badges:
|
Kevin,
Thanks for the news, but I am afraid your changes will not solve the problem for those particular slowing-down WUs, unless you have reasons to expect that at some stage they will resume computing at a reasonable speed. With the current behavior there will come a time when the computing speed is virtually zero and there is no longer any visible progress at all. I have left mine running while waiting for news from you, and also to see whether there would be a stage where the computed total runtime would stop increasing, but no. Three more hours have passed since my report at 10%, and during these three hours the computed total runtime has increased by 13 hours, now reaching 132 hours vs 119 previously, and the % done is only 11.285%. As long as this trend does not change there is absolutely no hope of reaching a normal completion. I had just suspended this WU and made a copy of its slot directory just before seeing your post. I will keep this backup in case it can help you, but I shall abort the WU now: there is no hope with its current limits, and probably none either with your new ones. Sorry for this dark description of the situation, but that's beta life, no real harm done as far as I am concerned. Jean. |
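The check described here, whether the computed total runtime keeps growing between two readings, can be sketched in Python as follows. The function names are illustrative, and the 11.9 h elapsed time at 10% is back-calculated from the 119 h projection, not a logged value.

```python
def projected_total(elapsed_hours, percent_done):
    """Linear projection of total runtime from one (runtime, % done) reading."""
    return elapsed_hours * 100.0 / percent_done


def recent_rate(prev, curr):
    """Progress in percentage points per hour between two readings."""
    return (curr[1] - prev[1]) / (curr[0] - prev[0])


def is_slowing(prev, curr):
    """A WU is slowing down if its projected total grows between readings."""
    return projected_total(*curr) > projected_total(*prev)


# Readings as (elapsed hours, % done).
earlier = (11.9, 10.0)      # assumed: 119 h projected at 10% done
later = (14.9, 11.285)      # three hours later, from the post

print(f"projected total: {projected_total(*earlier):.0f} h -> {projected_total(*later):.0f} h")
print(f"average rate so far: {later[1] / later[0]:.2f} %/h")
print(f"rate over last 3 h:  {recent_rate(earlier, later):.2f} %/h")
print("slowing down" if is_slowing(earlier, later) else "steady or speeding up")
```

With these figures the WU advanced only 1.285 percentage points in three hours, roughly half its earlier average pace, which is why the projected total keeps climbing. Two readings taken close together can be noisy, so several readings spread over a few hours give a more reliable picture.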
||
|
|
p3nguin53
Advanced Cruncher USA Joined: Dec 8, 2008 Post Count: 95 Status: Offline Project Badges:
|
> I have extended the deadline for these workunits on the server side to 7 days. The client will not be notified, but you will be able to see it on the website.

Did your deadline change affect all beta WUs or just the ones with names similar to the one in the thread heading? My beta WU originally had a three-day deadline. Now it has a 7 hr deadline. What caused the deadline change for this WU? Should I abort it? Based on its current time spent (17:31 h and 27.53% done), it will take 63.6 total hours to complete. It's running on a P4 HT against a Rice WU.

BETA_CMD2_0001-DHRS3.clustersOccur-MYH2A.clustersOccur_84_2 -- In Progress 4/24/09 20:33:29 4/25/09 03:33:29 0.00 0.0 / 0.0

BETA_CMD2_0001-DHRS3.clustersOccur-MYH2A.clustersOccur_84_1 -- In Progress 4/24/09 20:33:23 4/25/09 03:33:23 0.00 0.0 / 0.0

BETA_CMD2_0001-DHRS3.clustersOccur-MYH2A.clustersOccur_84_0 -- In Progress 4/24/09 20:33:01 4/25/09 03:33:01 0.00 0.0 / 0.0 |
||
|
|
sprigo
Cruncher England Joined: Apr 30, 2007 Post Count: 37 Status: Offline Project Badges:
|
> I've got one of the monsters as well. BETA_CMD2_0001-PP1BA.clustersOccur-TPM1A.clustersOccur_31 Currently been running for 12hrs and at 9.4% complete! This is running on a Core i7 Extreme (4GHz), hyperthreading disabled, 100% usage, 12 GB RAM.

It looks as if the projected time for this unit is pretty constant now at 127hrs! |
||
|
|
|