Total posts in this thread: 107
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Is the 12 Hr cut off the best limit?

Dear gb077492,

I think running more than one CEP2 instance on your single core P4 with 1GB RAM is just stretching the hardware limits. CEP2 already uses up most of the physical CPU power, so there is not much room for HT, and each wu uses up 512MB of mem, so you are likely already swapping... But you should be perfectly fine running one instance at a time.

Best, your

Harvard CEP team
[Jan 11, 2011 4:56:47 PM]
Dataman
Ace Cruncher
Joined: Nov 16, 2004
Post Count: 4865
Re: Is the 12 Hr cut off the best limit?

There has been some misunderstanding about the soft timeout: there is no soft timeout in the current release. At one point there was discussion of having such a feature, but it was not implemented. Sorry for the misunderstanding. We have started discussing adding some sort of soft timeout again. Stay tuned.

Thanks,
armstrdj

Thanks for the information.
[Jan 11, 2011 5:56:36 PM]
gb077492
Advanced Cruncher
Joined: Dec 24, 2004
Post Count: 96
Re: Is the 12 Hr cut off the best limit?

I think running more than one CEP2 instance on your single core P4 with 1GB RAM is just stretching the hardware limits.


I can understand why you would say that, but I have a 10K rpm disk drive and the VM size of each process seems to stay well below 500K. My wife can use Seamonkey to do e-mail and access the web quite happily, with only the odd stutter at WU start-up time, even with 2 CEP2 tasks running.

If it wasn't running without issue I wouldn't be running it. :D

Mike.
[Jan 12, 2011 12:26:43 AM]
kateiacy
Veteran Cruncher
USA
Joined: Jan 23, 2010
Post Count: 1027
Re: Is the 12 Hr cut off the best limit?

What I would suggest is that the soft time-out be fixed, but that it should also depend on getting through enough of the steps for the result to be useful on its own. From what I read I think that means not until step 8 is complete. Personally I would be happy to extend the hard time-out to 15 hours or even more. I have observed some WUs where step 2 has run really long, even beyond the 10 hour mark, and it seems silly to kill a WU that soon.
Mike


I agree with the above, and hope it's doable.

I have two faster machines that always finish all 16 jobs (unless they hit the dreaded RC 100), and two slower machines that are always stopped at 12 hrs. I would love for the slower machines to always be given enough time to return a useful result, even if that took much longer than 12 hrs.
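The rule suggested above could be sketched roughly as follows. This is only an illustration, not how CEP2 actually works: the function name `should_stop`, the 12-hour soft limit, the 15-hour hard limit, and the "step 8" threshold are all assumptions taken from the figures mentioned in this thread.

```python
def should_stop(elapsed_hours, steps_done,
                soft_limit=12.0, hard_limit=15.0, min_steps=8):
    """Hypothetical time-out rule for one work unit.

    All defaults are assumptions drawn from the discussion, not project settings:
    - soft_limit: stop here only if the result is already useful on its own
    - hard_limit: stop here no matter how far the work unit got
    - min_steps:  number of completed steps assumed to make a result useful
    """
    if elapsed_hours >= hard_limit:
        return True   # hard time-out always ends the work unit
    if elapsed_hours >= soft_limit and steps_done >= min_steps:
        return True   # soft time-out applies only once enough steps are done
    return False      # otherwise keep crunching


# A slow machine at 12.5 h with only 2 steps done keeps going...
print(should_stop(12.5, 2))   # False
# ...but stops at the soft limit once step 8 is complete.
print(should_stop(12.5, 8))   # True
```

Under these assumptions a slow machine gets extra time to reach a useful result, while the hard limit still caps the total run time.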

Kate
[Jan 12, 2011 1:59:17 AM]
nasher
Veteran Cruncher
USA
Joined: Dec 2, 2005
Post Count: 1423
Re: Is the 12 Hr cut off the best limit?

personally i wouldn't mind results running longer than 12 hours... to be honest i think a 1 day time-out would be better for me, but i also understand if you think the extra results won't be worth the extra time.

i am curious though ...
1) if both runs are on slower computers and they don't get through step 12 or so, is it still good info?
2) if the result file isn't any larger, is there any real reason to end at 12 hours?

i am sure i can think of more but i've got to go to work now
[Jan 13, 2011 2:50:28 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Is the 12 Hr cut off the best limit?

Hi everybody,

thanks for the feedback on the soft-timeout and the timeout-extension as a user option. We had a discussion with our friends at IBM yesterday and these topics are on the near-future agenda :D.

Dear nasher,

1) Yes, it is really fine to have incomplete results. We have about 10 million candidates in mind right now and not all of them will be subject to all calcs anyways. There will be holes in the big result table - no matter what. The most promising ones (which we can identify from a few jobs already) will be finished up in-house, if necessary.

2) Well, the reason really is that many average users don't like to crunch on a workunit for too long, and that is understandable. We know that the enthusiasts don't mind that and prefer to run longer - that's why we try to get it to work as a user option.

Best wishes from

Your Harvard CEP team

P.S. @Mike: Sorry, in that case I really don't know what's going on. Sweet hd by the way!
[Jan 13, 2011 5:44:31 PM]
anhhai
Veteran Cruncher
Joined: Mar 22, 2005
Post Count: 839
Re: Is the 12 Hr cut off the best limit?

10 million candidates = 10 million unique WU (right?) = 20 million WU that we crunch.

WU already crunched = 1,751,532
ave WU crunched per day = 8,757
WUs crunched yesterday = 14,308

Assuming that we will average what we did yesterday, the following will tell us how long this project will last.
(20 million WU - 1,751,532 )/14,308 = 1275 days (or 3.5 yrs)
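For anyone who wants to redo the estimate with other rates, here is a small Python sketch of the same arithmetic. The figures are the ones quoted above; the function name `days_remaining` is just made up for this example, and the 365-day year is a simplification.

```python
TOTAL_WUS = 20_000_000        # 10 million candidates x 2 copies each (as assumed above)
WUS_DONE = 1_751_532          # WUs already crunched
RATE_YESTERDAY = 14_308       # WUs crunched yesterday
RATE_AVERAGE = 8_757          # average WUs crunched per day so far


def days_remaining(total, done, per_day):
    """Days needed to finish the remaining work at a constant daily rate."""
    return (total - done) / per_day


for label, rate in [("yesterday's rate", RATE_YESTERDAY),
                    ("average rate", RATE_AVERAGE)]:
    days = days_remaining(TOTAL_WUS, WUS_DONE, rate)
    print(f"At {label}: {days:.0f} days, about {days / 365:.1f} years")

# At yesterday's rate: 1275 days, about 3.5 years
# At average rate: 2084 days, about 5.7 years
```

At the long-run average rate the estimate stretches to well over five years, which makes the case for more crunchers even stronger.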

I now see why they are trying to increase the participation rate.

If they have the soft-timeout, that should increase the number of WUs done per day.
However, if they have a time-out extension then it will decrease the number of WUs done per day.

Basically what I am saying is that we need to increase the number of crunchers for this project so we can get this done faster.
[Jan 13, 2011 6:25:18 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Is the 12 Hr cut off the best limit?

2) Well, the reason really is that many average users don't like to crunch on a workunit for too long, and that is understandable. We know that the enthusiasts don't mind that and prefer to run longer - that's why we try to get it to work as a user option
As an enthusiast, what I want is just to crunch the work and return the results. Unlike some, I really do not want to participate in the software engineering as to whether I want 8, 12 or 16 hour WUs. If the scientists want to cut off at 12 hours then fine. If you remember, HCC ran for many years, and with hardly a question from the crunchers, the scientists changed the software and reduced the average WU time. Seems they didn't need our advice. Huh! But thanks for asking if I cared. 8-)

Shorter WUs increase the number of WUs, and may increase the overall data volume. Longer WUs increase the volume of repair WUs, which take longer to crunch, and increase the risk to the slower (and off-line) machines that may not be able to finish in time.

This whole extended discussion on WU length (CEP2, HCMD2) is divisive, between the haves and have-nots (as measured by machine Gflops). It's fine to cater to the fast machines, but do not abandon the user base.
[Jan 13, 2011 6:51:02 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Is the 12 Hr cut off the best limit?

Dear anhhai,
yes, we'll not run out of work prematurely ;). The good news is that in the course of the projects we learn more and more about the candidates which will help us to focus and prioritize our search. So we probably will not do all of the molecules we have on the book right now, but we'll likely add others as we go along. The scope of CEP2 really is in flow and responds to what we learn.

Dear astrolab,
it's all perfectly fine - people who don't want to be bothered with details can stick to the 12h default, others will hopefully have an option to extend it as they like. Repair wus are actually not an issue because the validation is performed early in the workunit, so it is no problem if one wingman gets further than the other.

Thanks for crunching and best wishes from your

Harvard CEP team
[Jan 13, 2011 7:23:12 PM]
gb009761
Master Cruncher
Scotland
Joined: Apr 6, 2005
Post Count: 3010
Re: Is the 12 Hr cut off the best limit?

BTW, just to add to this post, I'm currently running a HFCC job which looks as though it'll way surpass any lengthy CEP2 job I've had - as it's currently at 60% after 12 Hrs, with another estimated 7 1/2 hrs to go... Normally, on my machine, HFCC jobs average out at 7.38 Hrs (so this is certainly going to have a severe adverse effect on my DCF rating). Although this is the exception, it's by no means the only exception I've had...

At least if the CEP2 limit was 'upped', we'd know what the limit would be...

Edit: Well, my long HFCC WU took 19:36:50 hrs in CPU time (20:38:44 hrs elapsed), and as expected, threw my computer's DCF rating completely off :( Oh well, it'll correct itself again in time...
[Edit 1 times, last edit by gb009761 at Jan 16, 2011 1:40:59 AM]
[Jan 15, 2011 4:08:22 PM]