World Community Grid Forums
Category: Completed Research Forum: Help Cure Muscular Dystrophy - Phase 2 Forum Thread: Return time of a HCMD2 WU half as normal? |
Thread Status: Active Total posts in this thread: 40
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
About the Project at http://www.worldcommunitygrid.org/projects_showcase/hcmd2/viewHcmd2About.do says: For the first 168 selected proteins docked in Phase 1, the CPU time on World Community Grid was of about 8,000 years. With the 2,246 proteins of Phase 2, the estimated time is 11.46x8000 = 91,680 years on World Community Grid. So figure about 92 K CPU years. But faster CPUs [2009 rather than 2007] will cut that a bit.
In fact you can read there: For the first 168 selected proteins docked in Phase 1, the CPU time on World Community Grid was of about 8,000 years. With the 2,246 proteins of Phase 2, the estimated time is 11.46x8000 = 91,680 years on World Community Grid. A solution to this computational barrier is to use evolutionary information to predict potential binding sites and realize localized docking on surfaces which are most likely to interact. This preliminary analysis based on protein evolution highly reduces computational time by a factor of 100 and therefore allow us to extend the analysis at large scale with the crucial help of World Community Grid.
To me this looks more like: If we use the same technology like Phase 1 we would need 92K CPU years. The new technology divides this by 100. So, only about 917 CPU years??? Which is the correct estimate?
Greetings
Thorsten
As posted by uplinger, it's just way too early to make any estimate. Once there are some complete batches in, an extrapolation can be made. Where you get a divisor of 100 I don't know. My reckoning is current tech about 70,000 CPU core years or about 2 years on the grid, in share time with the other WCG projects... but as said, way too early. Don't know what the real redundancy and error rate was on phase I, so that adds to the question mark.
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All! |
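The two readings being debated in the post above can be put side by side with a short calculation. This is only a sketch of the arithmetic quoted from the project page; the function names are made up for illustration, and neither result is an official project estimate.

```python
# Figures quoted from the project page in the post above.
PHASE1_CPU_YEARS = 8_000   # CPU time reported for the 168 Phase 1 proteins
PHASE2_SCALE = 11.46       # scale factor the page applies to Phase 1


def scaled_estimate(phase1_years: float, scale: float) -> float:
    """Phase 2 CPU-years if the Phase 1 cost simply scales by `scale`."""
    return phase1_years * scale


def with_speedup(cpu_years: float, factor: float) -> float:
    """CPU-years after dividing by a claimed speed-up factor."""
    return cpu_years / factor


# Reading A: the 91,680 figure is already the final estimate.
reading_a = scaled_estimate(PHASE1_CPU_YEARS, PHASE2_SCALE)
# Reading B (Thorsten's question): the factor of 100 still has to be applied.
reading_b = with_speedup(reading_a, 100)

print(f"Reading A: {reading_a:,.0f} CPU-years")
print(f"Reading B: {reading_b:,.0f} CPU-years")
```

Which reading applies depends on whether the page's scale factor already includes the reduction, which the thread has not settled at this point.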
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
tluehr said:
For the first 168 selected proteins docked in Phase 1, the CPU time on World Community Grid was of about 8,000 years. With the 2,246 proteins of Phase 2, the estimated time is 11.46x8000 = 91,680 years on World Community Grid. A solution to this computational barrier is to use evolutionary information to predict potential binding sites and realize localized docking on surfaces which are most likely to interact. This preliminary analysis based on protein evolution highly reduces computational time by a factor of 100 and therefore allow us to extend the analysis at large scale with the crucial help of World Community Grid.
To me this looks more like: If we use the same technology like Phase 1 we would need 92K CPU years. The new technology divides this by 100. So, only about 917 CPU years??? Which is the correct estimate?
Reply from lawrencehardin: This technique can be used to speed up proteins that are annotated with appropriate evolutionary information. I have no idea how many proteins that is. I doubt that it is enough to cut the required computations by even a half. That is just a personal guess on my part. We do not have the percentage of proteins that are well known. Considering that the Protein Data Bank only has structures for about 5% of human proteins, I feel sure that the percentage is nowhere near the 100% required for the 917 CPU-year estimate.
Lawrence
Added: Gemineo has pointed out that using JET it has been possible to cut the required computations to only 15%!! So maybe 10,000 CPU years?
[Edit 1 times, last edit by Former Member at May 12, 2009 6:53:32 PM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
What you did was probably pick up a work unit that errored out quickly. What does the results page show for the work units with only a 5 day return period? One of my machines errored out quickly on a few work units with "process got signal 4". This machine is very sick and I have taken it off the grid until I can get some replacement parts for it. Let me know if this is what you are seeing.
-Uplinger
Hummm... yep, you're right... Was it maybe CMD2_0001-3BYH_A.clustersOccur-3BYH_A.clustersOccur_27_1-- ?? That's solved then. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I doubt that it is enough to cut the required computations by even a half. That is just a personal guess on my part.
Using JET, we actually managed to reduce the "docking space" from 913,627,781,945 configurations to 137,652,178,995 (that is, about 15% of the initial configurations). Alessandra published this link in her update back in February. If you follow it, you will learn more about the process. |
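The "about 15%" figure can be checked directly from the two configuration counts given in the post. A minimal sketch, using only those two numbers (the variable names are illustrative):

```python
# Configuration counts as stated in the post above.
FULL_CROSS_DOCKING = 913_627_781_945   # configurations without JET
AFTER_JET = 137_652_178_995            # configurations remaining with JET

# Share of the original docking space that still has to be computed.
remaining_fraction = AFTER_JET / FULL_CROSS_DOCKING
removed_fraction = 1 - remaining_fraction

print(f"Remaining: {remaining_fraction:.1%}")
print(f"Removed:   {removed_fraction:.1%}")
```

The remaining share comes out slightly above 15%, so the docking space shrinks to roughly 15% of its original size rather than by 15%.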
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Okay, so the sum is: Phase 2 proteins 2,246 / 168 = 13.37, minus 15% = 11.36, times 8,000 CPU years = 91,680, as per the 2007 CPU population.
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
gemineo said that the computations were reduced to 15% of the original, not to 85%. I am guessing about 10,000 CPU years. We will have better guesses after a few weeks.
Lawrence |
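The difference between "reduced by 15%" and "reduced to 15%" is large, as a quick calculation shows. This sketch reuses the protein-count ratio and the 8,000 CPU-year Phase 1 figure from the posts above purely to illustrate the wording; it assumes cost scales linearly with that ratio, which is an assumption and not a project estimate.

```python
PHASE1_PROTEINS = 168
PHASE2_PROTEINS = 2_246
PHASE1_CPU_YEARS = 8_000

# Naive scale factor based on protein counts (about 13.37).
protein_ratio = PHASE2_PROTEINS / PHASE1_PROTEINS
unreduced_years = protein_ratio * PHASE1_CPU_YEARS

# "Reduced BY 15%": 85% of the work remains.
reduced_by_15 = unreduced_years * (1 - 0.15)
# "Reduced TO 15%": only 15% of the work remains.
reduced_to_15 = unreduced_years * 0.15

print(f"Protein ratio:  {protein_ratio:.2f}")
print(f"Reduced by 15%: {reduced_by_15:,.0f} CPU-years")
print(f"Reduced to 15%: {reduced_to_15:,.0f} CPU-years")
```

Under these assumptions the two readings differ by more than a factor of five, which is why the distinction matters for any estimate.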
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
How odd that 2246 / 168 is 13.37 and not 11.46.
Added: The linked site, towards the end, expands on the numbers:
HCMD2 will study cross docking of 2246 human protein structures. In total we shall dock 2 466 753 pairs of proteins among the 2246^2 possible ones. This means that 913 627 781 945 docking initial positions should be computed by a full cross-docking. By using JET, we can reduce the docking space of 87%, that is we shall have only 118 771 611 652 conformations to be analyzed. Phase 1 of the project studied 28 224 pairs (168 proteins) and explored a total of 10 391 124 240 conformations, that is 11,43 times less than what we shall have to compute in Phase II (in terms of docking initial positions).
[Edit 1 times, last edit by Sekerob at May 12, 2009 9:18:52 PM] |
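The figures quoted from the linked page are internally consistent, which a short check confirms. This sketch uses only the numbers in the quotation; the variable names are illustrative.

```python
# Numbers as quoted from the linked page.
PHASE1_PROTEINS = 168
PHASE1_PAIRS = 28_224
PHASE1_CONFORMATIONS = 10_391_124_240
FULL_CROSS_DOCKING = 913_627_781_945
AFTER_JET_PAGE = 118_771_611_652   # the page's figure, later revised in the thread

# 168 proteins docked against each other give 168^2 pairs.
pairs_from_proteins = PHASE1_PROTEINS ** 2

# Share of the docking space removed by JET according to the page.
reduction = 1 - AFTER_JET_PAGE / FULL_CROSS_DOCKING

# Phase 2 workload relative to Phase 1, in conformations.
phase2_vs_phase1 = AFTER_JET_PAGE / PHASE1_CONFORMATIONS

print(f"Phase 1 pairs:      {pairs_from_proteins:,}")
print(f"JET reduction:      {reduction:.0%}")
print(f"Phase 2 / Phase 1:  {phase2_vs_phase1:.2f}")
```

So the page's scale factor of about 11.4 is a ratio of conformations after the JET reduction, not a ratio of protein counts.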
||
|
jkislenko
Advanced Cruncher Czech Republic Joined: Mar 29, 2007 Post Count: 62 Status: Offline |
I got WUs with a deadline of the 26th. They seem pretty small: just a little over 2 hours for me.
|
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
So far, after about 8 work units, the longest took 1:24 and the shortest 0:41 hours. Good for pumping up result statistics, but this is just temporary. WCG currently aims for overall per-project job averages of 7 hours.
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
How odd that 2246 / 168 is 13.37 and not 11.46.
To compute a correct ratio you should consider the number of conformations, not the number of proteins: 137652178995 / 10391124240 = 13.25
Added: The linked site towards the end expand on the numbers:
This page was written in January/February, whereas the exact final number of conformations was established in mid-March. I will ask Alessandra to update it with the correct numbers. Thank you for the remark, Sekerob. |
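The corrected ratio can be reproduced from the two conformation counts. The last step, scaling Phase 1's 8,000 CPU-years linearly by that ratio, is an assumption added here for illustration and mirrors the project page's own method; it is not a figure stated in the thread.

```python
# Conformation counts as given in the thread.
PHASE2_CONFORMATIONS = 137_652_178_995   # final Phase 2 figure (mid-March)
PHASE1_CONFORMATIONS = 10_391_124_240
PHASE1_CPU_YEARS = 8_000                 # from the project page

# Workload ratio in conformations, not proteins.
conformation_ratio = PHASE2_CONFORMATIONS / PHASE1_CONFORMATIONS

# Assumption: CPU time scales linearly with the number of conformations.
linear_estimate = conformation_ratio * PHASE1_CPU_YEARS

print(f"Conformation ratio: {conformation_ratio:.2f}")
print(f"Linear estimate:    {linear_estimate:,.0f} CPU-years")
```

That the result lands close to the protein-count ratio of 13.37 appears to be a coincidence, since the two ratios measure different things.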
||
|
|