Thread Status: Locked
Total posts in this thread: 79
This topic has been viewed 10865 times and has 78 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Work Unit Pause: A and B type update

Hi everyone,

So, as you all know, we are temporarily out of work units. The CEP team is working hard to generate new, good-quality work units so that everyone can keep working towards CEP badges and new solar cell materials!

Here is where we stand:
A types: We are doing science. That means that some things don't work the way we thought they would ... but that means that we have a chance to learn from the errors that were returned. We are generating more A type work units, but we are also trying to minimize the chance that we will see the errors that have caused problems in the past. To do this, we have improved our "in-house" screening to try to get a better sense of which jobs will run smoothly and we are processing pre-work unit data this very minute. Again, this is only a pause - we have tens of thousands of molecules to test!

B types: These jobs are a continuation of successful A type work units. We need more statistical data to analyze each computational job, so that is the role of the B type jobs. We are focusing on the work units that were completed correctly, so the jobs that caused serious problems will not be included in this round. To make sure things are running properly, these will be sent out in beta soon, so keep your eyes open!

We are also working on a timing algorithm so that the WCG techs can more accurately estimate the time to completion.

We really are working hard on getting all the kinks ironed out of this HUGE project - thanks for sticking with us!!

The CEP Team
[Feb 23, 2009 6:20:37 AM]
hunterkasy
Senior Cruncher
USA
Joined: Dec 8, 2008
Post Count: 300
Status: Offline
Re: Work Unit Pause: A and B type update

Thanks for the update. I am sure we will all be waiting to start crunching again.
[Feb 23, 2009 6:39:33 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Work Unit Pause: A and B type update

Thank you very much!
[Feb 23, 2009 7:52:54 AM]
Rickjb
Veteran Cruncher
Australia
Joined: Sep 17, 2006
Post Count: 666
Status: Offline
Re: Work Unit Pause: A and B type update

Thanks for the update, and good luck in nailing down and solving the problems.
I think that in parallel to you guys working on the science, someone should be working to prevent the slowdown in computer speed that happens on multi-core computers when more than 1 instance of CEP (CHARMM) is running simultaneously.
During the recent CEP run, many people micromanaged their computers to ensure that CEP WUs ran 1 at a time. For example, CA Sekerob said on Feb 6, "I only permit them to run 1 by 1, which is not the idea, but they run that way about 20% faster in any combination of jobs on the other cores." This was acceptable at the start of your so-far-troublesome project, but it will not be in the future, especially once DDDT Phase 2 comes on line (see "What's the latest news from Galveston?", watowich, 31/1/09). DDDT Phase 2 is also projected to use CHARMM (http://www.utmb.edu/discoveringdenguedrugs%2Dtogether/Technical%20Details.htm).
I suspect that the slowdown is associated with the huge numbers of memory page-faults generated throughout the program runs. In the Beta Test forum, I conjectured a reason for these page-faults (see the thread "Re: CEP is experiencing a huge PF Delta rate"). I did not know about the multi-core performance degradation at the time, but it may arise because the operating system handles memory allocation in a single thread, which becomes a bottleneck when it gets too many requests from the multiple cores. And as I suggested then, these myriad calls to the OS probably degrade the performance of the program even when it's running on only 1 core.

I think that this matter deserves some priority. To quote Stan Watowich, head researcher of DDDT, "I suspect that the grid will need to grow even more to return our Phase 2 calculations as fast as we would like ... we will see shortly ...". (Link above)

Perhaps the maintainers of CHARMM at Harvard can improve their software. In the current economic climate, there must be plenty of voluntary or paid spare programmer-hours available somewhere. The solution may be as simple as redirecting a couple of functions that call directly on the operating system to allocate/deallocate memory to other functions that use a buffered scheme like malloc() and free() in C. Public-domain source code should be available for such schemes, e.g. from UseNet (alloc.c). - HTH
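To illustrate the kind of buffered scheme suggested above, here is a minimal sketch of a fixed-size block pool in C. It is not taken from CHARMM or from any alloc.c; the names Pool, pool_init, pool_alloc, pool_free and pool_destroy are hypothetical, and malloc() stands in for whatever single large request would be made to the operating system. The idea is only that one big region is obtained up front and later allocate/deallocate calls are served from a free list without going back to the OS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size block pool (illustration only, not CHARMM code).
   One large region is requested once; blocks are then handed out and
   returned through a free list in O(1) with no further system calls. */
typedef struct Pool {
    unsigned char *slab;    /* the single large region */
    void *free_list;        /* head of the linked list of free blocks */
    size_t block_size;      /* rounded-up size of each block in bytes */
    size_t block_count;     /* number of blocks in the slab */
} Pool;

/* Set up the pool. Returns 0 on success, -1 on bad arguments or failure. */
int pool_init(Pool *p, size_t block_size, size_t block_count)
{
    size_t align = sizeof(void *);
    size_t i;

    if (block_size == 0 || block_count == 0)
        return -1;
    if (block_size > (size_t)-1 - align)
        return -1;                              /* rounding would overflow */
    /* A free block stores the "next" pointer in its own bytes, so round
       the block size up to a multiple of the pointer size. */
    block_size = (block_size + align - 1) / align * align;
    if (block_count > (size_t)-1 / block_size)
        return -1;                              /* total size would overflow */

    p->slab = malloc(block_size * block_count); /* the one big request */
    if (p->slab == NULL)
        return -1;
    p->block_size = block_size;
    p->block_count = block_count;
    p->free_list = NULL;

    /* Thread every block onto the free list, lowest address first. */
    for (i = block_count; i-- > 0; ) {
        void *blk = p->slab + i * block_size;
        *(void **)blk = p->free_list;
        p->free_list = blk;
    }
    return 0;
}

/* Pop one block from the free list, or return NULL if the pool is empty. */
void *pool_alloc(Pool *p)
{
    void *blk = p->free_list;
    if (blk != NULL)
        p->free_list = *(void **)blk;
    return blk;
}

/* Push a block previously returned by pool_alloc back onto the free list. */
void pool_free(Pool *p, void *blk)
{
    assert(blk != NULL);
    *(void **)blk = p->free_list;
    p->free_list = blk;
}

/* Release the whole region in a single call. */
void pool_destroy(Pool *p)
{
    free(p->slab);
    p->slab = NULL;
    p->free_list = NULL;
}
```

A caller would run pool_init once at start-up, use pool_alloc and pool_free inside the hot loops, and call pool_destroy at exit. This sketch is single-threaded and only pointer-aligned; whether anything like it would actually reduce the page-fault rate seen with CEP is an open question that only profiling could answer.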
PS: And it took me quite a while to dig out the links and the source code.
----------------------------------------
[Edit 6 times, last edit by Rickjb at Feb 24, 2009 4:08:29 AM]
[Feb 23, 2009 8:47:36 AM]
rilian
Veteran Cruncher
Ukraine - we rule!
Joined: Jun 17, 2007
Post Count: 1452
Status: Offline
Re: Work Unit Pause: A and B type update

Thank you for the update info! It is always appreciated!
[Feb 23, 2009 8:48:13 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Work Unit Pause: A and B type update

Thanks for the update!
[Feb 23, 2009 9:00:13 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Work Unit Pause: A and B type update

Thanks for the update guys..
and Rickjb, did you devote a half hour just to make that one post and find all the nice little quotes? it really seems rather unnecessary.
"Never increase, beyond what is necessary, the number of words required to explain anything"
William of Ockham (1285-1349)
[Feb 23, 2009 9:31:17 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Work Unit Pause: A and B type update

Personally, I think Rickjb did a great job in pulling facts together, providing pathways to other information, and preemptively reducing many more posts.
It's the concept of "Meeting and Exceeding the customer's expectations".

Personally, I dislike half-liners that do not show that the querier actually made an effort to find an answer. Those are bound to receive curt answers, much-delayed answers, or none.

Thanks, Rickjb.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Feb 23, 2009 9:41:52 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Work Unit Pause: A and B type update

And a warning, in case you've not found out: I'm very verbose and do not subscribe to Ockham's approaches. I hide the answers in essays.
[Feb 23, 2009 9:43:59 PM]
Omega-3
Cruncher
Joined: Dec 7, 2008
Post Count: 20
Status: Offline
Re: Work Unit Pause: A and B type update

Thanks for the great update! It is much appreciated. The amount of info you provide in every update always helps us understand what is going on "behind the scenes"!

I am very patient with this project. I am sticking around to see the end of this project's goals!
[Feb 23, 2009 10:33:20 PM]