Thread Status: Active
Total posts in this thread: 107
Posts: 107   Pages: 11   [ Previous Page | 2 3 4 5 6 7 8 9 10 11 | Next Page ]
This topic has been viewed 1446546 times and has 106 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Is the 12 Hr cut off the best limit?

Dear Aaron and nasher,
You are absolutely right: eventually we want to complete wus on the grid, and more importantly we also want to use the output from the current wus as input for different kinds of wus ('set2', 'set3', etc). There are a few technical issues like bookkeeping and minimizing network traffic that need to be figured out. The incomplete wus essentially have to be refactored. We have a basic version of this which allows us to restart jobs in-house, but haven't finished the implementation for the grid. But that's ok, there are still more important things to sort out at this point. It is actually not necessary to have ALL results for ALL molecules, i.e., a 'complete' data set. This study has a strongly hierarchical structure anyway, and the more interesting a molecule is, the more calcs we do on it.

Dear jeffop and others,
Our perspective on this matter is really very simple: as long as CEP2 runs on your computer and ideally gets through the first 3 jobs, you are helping us. It's as easy as that. Our wus are designed to be useful whether you get through all jobs or only through part of them. So if you ask us, we'd suggest not worrying too much about it. BTW: Credit is given for the run time, independent of whether a wu was finished or not.

Best wishes from

Your Harvard CEP team

P.S. The user option for the 12h time-out extension is currently under development by our friends from IBM.
[Feb 10, 2011 4:10:26 AM]
gb009761
Master Cruncher
Scotland
Joined: Apr 6, 2005
Post Count: 3010
Status: Offline
Re: Is the 12 Hr cut off the best limit?

P.S. The user option for the 12h time-out extension is currently under development by our friends from IBM.

That's excellent news - thanks to everyone concerned.
----------------------------------------

[Feb 10, 2011 4:21:42 AM]
Jack007
Master Cruncher
CANADA
Joined: Feb 25, 2005
Post Count: 1604
Status: Offline
Re: Is the 12 Hr cut off the best limit?

My old laptop (Win XP, 1.8 GHz, 2 GB RAM) is not going to finish a CEP2 WU. I get that, it totally makes sense; it will run CEP2 until the project finishes or it dies.
My i7 930 (OC to 3.2 GHz, 6 GB RAM at 1600 MHz, Win 7 64-bit) has on occasion hit the 12-hour cutoff. I find that rather bizarre: it's not a slow computer (yes, it's HT for 8 cores), but I would still think that it would finish a WU... I'm going to go look at the results and see how many that has happened to.

EDIT: OK, that's only one that I can see; a couple of others were 10 to 11+ hours.
----------------------------------------

----------------------------------------
[Edit 1 times, last edit by Jack007 at Feb 13, 2011 12:04:39 PM]
[Feb 13, 2011 12:02:00 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Is the 12 Hr cut off the best limit?

P.S. The user option for the 12h time-out extension is currently under development by our friends from IBM.

That's excellent news - thanks to everyone concerned.

I wonder what the approach will be for those who opt in to run in full, with only the normal cutoff of 10x the current average FPOPS header runtime:

1. ..., wingman assignment will have to do too?
2. ..., pair with wingman that does at least the first 3 critical jobs to get a reasonable validation?
3. ..., stroke of luck wingman pairing continues.

--//--
----------------------------------------
[Edit 2 times, last edit by Former Member at Feb 13, 2011 12:11:45 PM]
[Feb 13, 2011 12:09:37 PM]
Ingleside
Veteran Cruncher
Norway
Joined: Nov 19, 2005
Post Count: 974
Status: Offline
Re: Is the 12 Hr cut off the best limit?

I wonder what the approach will be for those who opt in to run in full, with only the normal cutoff of 10x the current average FPOPS header runtime:

The current cutoff is 20x for CEP2.
Also, while the estimated run-time includes DCF, the DCF is not used when it comes to the cutoff.

So if, for example, the DCF is 2 and the CEP2 estimate is 12 hours, the cutoff will be 12 hours / 2 * 20 = 120 hours or 5 days.
If the DCF is 1.5 and the CEP2 estimate is 12 hours, the cutoff will be 12 hours / 1.5 * 20 = 160 hours or 6.67 days.
If the DCF is 1.0 and the CEP2 estimate is 12 hours, the cutoff will be 240 hours or 10 days.

So the cutoff shouldn't cause any problems. The even larger variations in DCF that are likely for users choosing to run past the 12-hour limit can cause some problems with scheduling.
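
For anyone who wants to check the arithmetic above, here is a minimal Python sketch. The function name `cutoff_hours` and its parameters are made up for illustration and are not part of BOINC or the WCG client; the sketch only assumes the relation described in this post, i.e. the DCF is divided back out of the displayed estimate before the 20x factor is applied.

```python
def cutoff_hours(estimate_hours: float, dcf: float, multiplier: float = 20.0) -> float:
    """Cutoff in hours for a displayed estimate that already includes the DCF.

    Hypothetical helper: divides the DCF back out of the estimate, then
    applies the cutoff multiplier (20x for CEP2, per this post).
    """
    if dcf <= 0:
        raise ValueError("dcf must be positive")
    return estimate_hours / dcf * multiplier


# The three cases from the post, all with a displayed estimate of 12 hours.
for dcf in (2.0, 1.5, 1.0):
    hours = cutoff_hours(12.0, dcf)
    print(f"DCF {dcf}: cutoff {hours:.0f} hours = {hours / 24:.2f} days")
```

A higher DCF with the same displayed estimate means a smaller underlying benchmark estimate, which is why the cutoff gets shorter.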
----------------------------------------


"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
[Feb 13, 2011 1:04:53 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Is the 12 Hr cut off the best limit?

The current cutoff is 20x for CEP2.
Also, while the estimated run-time includes DCF, the DCF is not used when it comes to the cutoff.

So if, for example, the DCF is 2 and the CEP2 estimate is 12 hours, the cutoff will be 12 hours / 2 * 20 = 120 hours or 5 days.
If the DCF is 1.5 and the CEP2 estimate is 12 hours, the cutoff will be 12 hours / 1.5 * 20 = 160 hours or 6.67 days.
If the DCF is 1.0 and the CEP2 estimate is 12 hours, the cutoff will be 240 hours or 10 days.

So the cutoff shouldn't cause any problems. The even larger variations in DCF that are likely for users choosing to run past the 12-hour limit can cause some problems with scheduling.

Rather counterintuitive for a DCF increase to lead to a shorter permitted runtime.
[Feb 13, 2011 1:32:45 PM]
sk..
Master Cruncher
Joined: Mar 22, 2007
Post Count: 2324
Status: Offline
Re: Is the 12 Hr cut off the best limit?

Would it not be better to allow tasks to continue to run, on slow machines, only if it is known that the results are worth having?
My take is that after the first 3 jobs complete there is enough of an indication on how important running the next tasks would be.
If this is the case then it might make sense to do this for all tasks regardless of CPU Performance; if by 3 jobs you can safely say there is no point running the rest, then there is little point in doing so, even for fast CPUs; it’s just a waste of CPU time.
[Feb 13, 2011 1:48:59 PM]
Ingleside
Veteran Cruncher
Norway
Joined: Nov 19, 2005
Post Count: 974
Status: Offline
Re: Is the 12 Hr cut off the best limit?

Rather counterintuitive for a DCF increase to lead to a shorter permitted runtime.

Yes, but the limits were present long before DCF was introduced, so any estimate, and therefore any cut-off, was based only on the benchmark.

While DCF will give a more or less "correct" estimated cpu-time, it was probably decided not to adjust the cut-off limit as well. Some projects have initially grossly over-estimated the run-time, which can give a DCF of less than 0.01. When the estimates are then adjusted, a cut-off that included a DCF of 0.01 would mean all tasks with the new estimate would error out.
Similarly, some projects can have tasks terminating early (for example LHC@home), and with a string of such tasks in a row the DCF can also become very low, which would give the same result, with all later work being terminated...
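
To put rough numbers on the scenario just described, here is a small Python sketch. Everything in it is an assumption for illustration: the helper names, the 10-hour task, and reuse of the 20x factor quoted earlier in the thread. It compares the cut-off as it works now (benchmark only) with a hypothetical cut-off that also multiplied in a stale DCF of 0.01.

```python
MULTIPLIER = 20  # assumed: the 20x cut-off factor quoted for CEP2 in this thread


def cutoff_benchmark_only(raw_estimate_h: float) -> float:
    """Cut-off based only on the benchmark estimate, ignoring DCF."""
    return raw_estimate_h * MULTIPLIER


def cutoff_with_dcf(raw_estimate_h: float, dcf: float) -> float:
    """Hypothetical alternative: a cut-off that also scales with DCF."""
    return raw_estimate_h * dcf * MULTIPLIER


true_runtime_h = 10.0   # illustrative task length
stale_dcf = 0.01        # learned while the project over-estimated by 100x
corrected_raw_h = 10.0  # the project has since fixed its estimate

actual = cutoff_benchmark_only(corrected_raw_h)
hypothetical = cutoff_with_dcf(corrected_raw_h, stale_dcf)

print(f"benchmark-only cut-off: {actual:.0f} h (task needs {true_runtime_h:.0f} h)")
print(f"DCF-scaled cut-off:     {hypothetical:.1f} h (task would be aborted)")
```

Under these assumed numbers, the DCF-scaled limit ends up far below the real runtime until the DCF recovers, so every task with the new estimate would be killed in the meantime.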

So, more accurately the table would be:
1: Benchmark says 6 hours cpu-time, meaning 120 hours cut-off or 5 days.
2: Benchmark says 8 hours cpu-time, meaning 160 hours cut-off or 6.67 days.
3: Benchmark says 12 hours cpu-time, meaning 240 hours cut-off or 10 days.

That WCG's estimates are so far off that for #1 the DCF is 2.0 and for #2 the DCF is 1.5 doesn't have any effect on the cut-off.

WCG being off by more than a factor of 2 isn't so uncommon; a quick look reveals I had a CMD2 earlier today that bumped the DCF to 2.499, while a CEP2 was at 0.601...
----------------------------------------


"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
[Feb 13, 2011 4:56:36 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Is the 12 Hr cut off the best limit?

P.S. The user option for the 12h time-out extension is currently under development by our friends from IBM.

Are there any estimates as to when this is to be completed?
[Feb 13, 2011 6:17:20 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Is the 12 Hr cut off the best limit?

Hi Jack007,
Two comments: 1) HT is not the same as a physical core and is for various reasons much less efficient; 2) If you run 8 CEP2 wus simultaneously, your hd may be the bottleneck (another indication for this is if your wall-clock and cpu time are very different).
We recommend reducing the number of CEP2 wus to maybe 4 in the profile and fill up the rest with another project which is less demanding.
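
As a rough way to act on the wall-clock vs cpu time hint, one could compare the two figures for a finished result. The Python sketch below is only an illustration: the function names are made up, and the 0.8 threshold is an arbitrary assumption, not a value recommended by the project.

```python
def cpu_efficiency(cpu_seconds: float, wall_seconds: float) -> float:
    """Fraction of elapsed (wall-clock) time the task actually spent on the CPU."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return cpu_seconds / wall_seconds


def looks_io_bound(cpu_seconds: float, wall_seconds: float, threshold: float = 0.8) -> bool:
    """Flag a result whose cpu time lags well behind its wall-clock time.

    The threshold is an arbitrary illustrative choice.
    """
    return cpu_efficiency(cpu_seconds, wall_seconds) < threshold


# Example: 6 h of cpu time over 12 h of wall-clock time.
print(looks_io_bound(6 * 3600, 12 * 3600))     # True
# Example: 11.5 h of cpu time over 12 h of wall-clock time.
print(looks_io_bound(11.5 * 3600, 12 * 3600))  # False
```

A low ratio does not prove the disk is the culprit (the machine may simply be busy with other things), but it is a cheap first check before cutting the number of simultaneous CEP2 wus.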

Hi SekeRob,
The wingman will not be affected by this - validation will be performed at an early stage. In addition, we hope to get rid of the wingman validation completely in 2-3 months.

Hi skgiven,
It is not quite that easy. After job 3 we have a first impression of a system but we cannot say anything ‘safely’. But the more data we have on it, the more solid this impression gets. The wus are also designed to give the later jobs in a very efficient manner (it’s called a projection or boot-strapping scheme). But there is a trade-off of more detailed results on fewer molecules vs less detailed ones on more molecules. In the grand scheme of things it does not make a big difference for us. There are pros and cons for having longer vs shorter runs and they balance out in the end as long as we have a decent mix. The way we designed our wus gives them a wide window in which different contributions are useful. We want to give people the choice and hope that motivates them to donate more time to CEP2 – that’s what really counts.

Hi dkt,
We don't want to make promises we cannot keep, but I expect that we are talking about weeks rather than months.

Best wishes to you all

Your Harvard CEP team
[Feb 14, 2011 7:43:07 PM]