| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 22
|
|
| Author |
|
|
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline Project Badges:
|
I've noticed that the average runtime of CEP2 WUs has been gradually increasing, and Sekerob's Average Run Time by Day & Project chart confirms this. There also seems to have been a particular recent increase in WU times that is not yet reflected in the chart.

My devices normally complete all 16 sub-jobs in the WUs, but some are now stopping after 12 sub-jobs. For the first time, my one device that uses HT (longer WU times) has had 3 WUs bump into the 12h cutoff, and these events happened in the last 2 days. They all stopped near the end of Job #15 after spending 3h21m, 3h08m and 4h25m respectively on their aborted final sub-job. I assume that this CPU time will be discarded.

In all 3 cases, Job #15 was much longer than any of Jobs #0 - #14. Jobs #14 took 1h43m, 1h45m and 1h43m respectively. On the first 12h WU, the wingman completed all 16 jobs in 11h39m, after spending 3h10m on Job #15, so I was close to finishing too. The other 2 WUs are still in PV status.

I guess that this is "inevitable" wastage of the current WCG CEP2 system. As the average CEP2 task time increases, more WUs will be cut at 12h and the wastage will increase. It might be good if the system could be tweaked somehow to make the cut-off limit more flexible, e.g. to stop WUs at the end of a sub-job if their projected run-time would go too near the 12h limit, and perhaps to allow an unexpectedly long sub-job to run a little past 12h.

I haven't studied the distribution of sub-job times within CEP2 WUs, particularly as to whether Job #15 is usually much longer than the others. If this has only just started to happen, perhaps it should be investigated.

More Details:
========
1st 12h WU name: E203058_ 040_ C.28.C21H12N4OSSi.00140500.4.set1d06_1
------
2nd 12h WU name: E203061_ 097_ C.27.C23H14OS2Si.00097882.4.set1d06_2
------
3rd 12h WU name: E203062_ 869_ C.28.C22H12N2O2SSi.00011746.0.set1d06_1
End of its result log:
[17:42:54] Starting job 14,CPU time has been restored to 21229.984375.
[19:25:43] Finished Job #14
[19:25:43] Starting job 15,CPU time has been restored to 27379.656250.
Killing job because cpu time has been exceeded.
Subjob start time = 0, Subjob current time = 1088077034
[23:50:33] Finished Job #15
23:50:42 (6604): called boinc_finish
[Edit 1 times, last edit by Rickjb at Aug 30, 2011 6:34:23 PM] |
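The flexible cut-off suggested in the post above (do not start a sub-job that is projected to overrun, and let an unexpectedly long one run a little past 12h) can be sketched roughly as follows. This is only an illustration: the function names, the safety factor, and the grace period are all hypothetical, and nothing here reflects how the CEP2 application actually decides anything. The only real figures are the CPU times from the result log quoted in the post.

```python
HARD_LIMIT_S = 12 * 3600  # the 12 h cut-off discussed in this thread


def projected_finish(elapsed_s, finished_subjob_s, safety_factor=1.5):
    """Guess the elapsed CPU time after one more sub-job.

    Assumption: the next sub-job takes safety_factor times the longest
    sub-job finished so far. This is a made-up heuristic.
    """
    if not finished_subjob_s:
        return elapsed_s
    return elapsed_s + safety_factor * max(finished_subjob_s)


def should_start_next(elapsed_s, finished_subjob_s,
                      limit_s=HARD_LIMIT_S, safety_factor=1.5):
    """Start another sub-job only if the projection stays within the limit."""
    return projected_finish(elapsed_s, finished_subjob_s, safety_factor) <= limit_s


def should_kill(elapsed_s, limit_s=HARD_LIMIT_S, grace_s=0):
    """Kill a running sub-job only once the limit plus a grace period is passed."""
    return elapsed_s > limit_s + grace_s


# Figures from the quoted log: Job #15 started at 27379.66 s of CPU time,
# and Job #14 used about 27379.66 - 21229.98 = 6149.67 s. Only Job #14's
# duration is known from the post, so the list holds that single value.
elapsed_at_job15 = 27379.656250
job14_s = 27379.656250 - 21229.984375

start_job15 = should_start_next(elapsed_at_job15, [job14_s])
kill_without_grace = should_kill(HARD_LIMIT_S + 60)
kill_with_grace = should_kill(HARD_LIMIT_S + 60, grace_s=1800)
```

With these particular assumptions the projection would still have started Job #15, because it ran far longer than Job #14. That suggests the second half of the proposal (a grace period past 12h) would matter more for WUs like these than the projection alone.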
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Rickjb, please edit the link to the chart to use capitals. bit.ly is case sensitive, so it would be http://bit.ly/WCGART
thx. -- SekeRob
P.S. Don't know why, but I did upload a fresh chart this morning... need to investigate.
edit: No, I did not, as the DDDT2 A-Type with their high mean was blotting out all other curves... a new one is imminent. :D
edit2: It's been uploaded. I see no particular acute hike through the Aug. 30 noon stats inclusive, which compute at 8.95 hours.
[Edit 2 times, last edit by Former Member at Aug 30, 2011 4:22:06 PM] |
||
|
|
gb009761
Master Cruncher Scotland Joined: Apr 6, 2005 Post Count: 3010 Status: Offline Project Badges:
|
This has already been discussed in the "Is the 12 Hr cut off the best limit?" thread, and I do believe that a solution has already been thought through.
As to where the testing/implementation of that solution sits on the WCG techs' 'To do' list, that's not something I know (although it was said a few months back that there were 2-3 other things of a higher priority, so hopefully something should be happening about it soon...). |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The last post to "Is the 12 Hr cut off the best limit?" was by SekeRob on July 4. I was waiting until October before I posted to that thread again to find out what is going on. It appears that the increase in run time should be more important with regard to being able to complete the entire WU.
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
As long as you are doing the first 4 or so sub-jobs, you are turning in useful data. Think of a line of slowly changing molecular shapes. We are plotting a curve along that line. The additional sub-jobs give us finer details, but we can estimate the missing details when the last few sub-jobs are missing for some jobs, as long as we do not have a long streak. And even 4 sub-jobs can validate a wing man's more detailed result.
Let the Clean Energy Project scientists worry about this. Lawrence |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
It is not so much about returning useful data; we understand that anything after job 2 or 3 is useful. It is about wasting time. I also see the 12 hr cut as wasteful: my boxes normally run the full 12 hrs, with the last 1-4 hrs tossed because I did not complete the job I was on.
Instead of a 12 hr hard stop, can we get a stop after the next checkpoint, so that all time spent is useful, maybe at 10 hrs instead of 12? |
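To make the difference between a hard stop and a stop at the next sub-job boundary concrete, here is a small simulation. It is a sketch under stated assumptions: the sub-job durations are invented for illustration, the 10 hr soft limit is the figure floated in the post above, and the function names are hypothetical. It treats each sub-job as the unit that is either kept or discarded, as described earlier in the thread.

```python
def hard_stop(durations_s, limit_s):
    """Run sub-jobs until one would cross limit_s; that partial one is discarded.

    Returns (completed_subjobs, wasted_seconds).
    """
    elapsed = 0.0
    done = 0
    for d in durations_s:
        if elapsed + d > limit_s:
            # The sub-job is killed at the limit and its CPU time is lost.
            return done, limit_s - elapsed
        elapsed += d
        done += 1
    return done, 0.0


def soft_stop(durations_s, soft_limit_s, hard_limit_s):
    """Stop cleanly at the first sub-job boundary at or past soft_limit_s.

    The hard limit still applies if a single sub-job runs very long.
    """
    elapsed = 0.0
    done = 0
    for d in durations_s:
        if elapsed >= soft_limit_s:
            return done, 0.0  # stopped at a boundary, nothing discarded
        if elapsed + d > hard_limit_s:
            return done, hard_limit_s - elapsed
        elapsed += d
        done += 1
    return done, 0.0


# Hypothetical WU: seven sub-jobs of 1.5 h each, then one 3 h sub-job.
durations = [5400] * 7 + [10800]

hard_result = hard_stop(durations, 12 * 3600)
soft_result = soft_stop(durations, 10 * 3600, 12 * 3600)
```

In this made-up case both policies return seven sub-jobs, but the hard stop burns 1.5 hours on a sub-job that is thrown away, while the soft stop ends at 10.5 hours with nothing discarded.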
||
|
|
anhhai
Veteran Cruncher Joined: Mar 22, 2005 Post Count: 839 Status: Offline Project Badges:
|
Honestly, the most important thing they can do is change how they verify CEP2. The scientists have already agreed to make it zero redundancy, instead of the current method, which requires 2 crunchers to crunch the same WU. If this is done, it will result in almost a doubling of our output.
But to address the current problem that most of you are complaining about, there were 2 things that the scientists wanted to implement. One was an option to crunch until the WU is done; the second was to make sure the WU doesn't start another step if it has only so much time left until the time limit is up. |
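The "almost a doubling" figure follows from simple arithmetic on how many copies of each WU get crunched. A minimal sketch, assuming that zero redundancy still re-runs some fraction of WUs as spot checks (that fraction and the function name are hypothetical, not taken from WCG):

```python
def throughput_gain(copies_now=2.0, spot_check_fraction=0.0):
    """Ratio of distinct WUs finished after vs. before dropping redundancy.

    copies_now: copies crunched per WU today (2 in the post above).
    spot_check_fraction: assumed share of WUs still crunched a second time.
    """
    copies_after = 1.0 + spot_check_fraction
    return copies_now / copies_after


ideal_gain = throughput_gain()                             # no spot checks at all
hedged_gain = throughput_gain(spot_check_fraction=0.1)     # 10% re-run, an assumed figure
```

With no spot checks the gain is exactly 2; any remaining verification work pulls it a little below that, which matches the "almost" in the post.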
||
|
|
Jack007
Master Cruncher CANADA Joined: Feb 25, 2005 Post Count: 1604 Status: Offline Project Badges:
|
Or,
match up the slow comps with fast comps, and have the slow comp only do the first 4 steps and the fast comp complete it. Then the slow comp will verify the fast comp's results, and we have all the info and minimized time spent on the WU. (Yeah, I'm sure that's impractical, but hey, since we're dreaming here.) For example, my 5 year old laptop (1.8 GHz single core) used to get some of the CEP2 WUs done; now they are always cut off at 12 hours (I'm cool with that, it's doing its thing). The i7 2600K does them in 4 to 6 hours mostly. If the laptop could stop at step 4, it would probably verify 3 times as many (guessing here). |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"As long as you are doing the first 4 or so sub-jobs, you are turning in useful data. Think of a line of slowly changing molecular shapes. We are plotting a curve along that line. The additional sub-jobs give us finer details, but we can estimate the missing details when the last few sub-jobs are missing for some jobs, as long as we do not have a long streak. And even 4 sub-jobs can validate a wing man's more detailed result. Let the Clean Energy Project scientists worry about this. Lawrence"

If the most useful data is returned by the first four jobs, why bother with the other 12 jobs? There must be useful information obtained by running a larger set of jobs for CEP2 or it would not have been designed that way. Until I can see an option to go beyond 12 hours of run time, I am off CEP2 and devoting all of my resources to Drug Search for Leishmaniasis since it is expected to be completed in about one year. |
||
|
|
gb009761
Master Cruncher Scotland Joined: Apr 6, 2005 Post Count: 3010 Status: Offline Project Badges:
|
"Until I can see an option to go beyond 12 hours of run time, I am off CEP2 and devoting all of my resources to Drug Search for Leishmaniasis since it is expected to be completed in about one year."

dkt, I'm with you on that. Not only would I like to get my Emerald with DSfL before it's finished, I'd also like to get my second Sapphire badge with HFCC, and as that's due to finish first, that's where my primary focus is right at the moment. Then there are always my C4CW & HPF2 Emerald badges to get before I return to CEP2, which, hopefully, by then will allow processing beyond the 12hr time limit. |
||
|
|
|