| World Community Grid Forums
Thread Status: Active Total posts in this thread: 10
|
|
| Author |
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
A few updates and answers to some questions...
1. We have transferred the code for HPF2 to the IBMers (Viktors and co.), and they have started work on gridifying it. There was a question on the forum recently about which proteins will go on the grid. The answer is elsewhere as well, but in short: the second phase (HPF2) will take important proteins with interesting novel predictions from HPF1 and refine those predicted structures (with something we call all-atom mode) to a higher level of resolution/accuracy (HPF1 == fold resolution, broad fold-function survey; HPF2 == higher res for more detailed conclusions). I’ll talk more about this next month. In this second stage we will focus on malaria and human proteins that are predicted to be secreted proteins. These proteins are central to our understanding of host-pathogen interaction. We will also go after cancer biomarkers (proteins that are over-expressed in tissues or tumors of cancer patients) identified as part of several ongoing studies at the ISB. We are 100% done with the initial aims of the project.
2. So why are we still crunching? We’re continuing to crunch proteins from a set of proteins known as Swiss-Prot ... these are proteins from many parts of the tree of life. In general these proteins are well collated and represent a third phase of HPF 1 where we are filling in the gaps left by our initial folding of 90 complete genomes. In any case, we’re making the database more comprehensive with every new folding run, covering more of the vast sequence space that is nature (of which we’ve sequenced a tiny fraction).
3. Lars and Mike are working hard on the paper, which is now at the draft stage. We’ve run a few additional quality benchmarks that have been positive, and we’re finalizing all of the details that will go into the website.
4. The database will be opened up when we submit the first paper.
PS: check out my new website: http://www.cs.nyu.edu/~bonneau/
As always, thanks for crunching... more soon,
Richard Bonneau
[Edit 1 times, last edit by Former Member at Jan 17, 2006 9:06:02 PM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Cool. Just one question from me: the science bit is way over my head, but you mentioned that we will be looking at certain proteins at a much higher resolution. Will this mean that machines for this particular phase of the project will need to be faster/more powerful, or will the IBMers break the data down into more manageable chunks? |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Looking for an excuse to get that Opty... haha... just buy the thing |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hi Dr. Bonneau,
"We’re continuing to crunch proteins from a set of proteins known as Swiss-Prot ... these are proteins from many parts of the tree of life. In general these proteins are well collated and represent a third phase of HPF 1 where we are filling in the gaps left by our initial folding of 90 complete genomes."
1) Is Swiss-Prot a set of proteins from the ?90? genomes that we have crunched which were omitted in the first pass? Or are they derived from other genomes which we have not crunched?
2) Have we really predicted protein folds from a total of 90 genomes, including the human genome as one?
3) We tried to fold all the unknown proteins expressed in the human genome that we could. Did we follow a similar strategy for the other genomes, or did we just work on subsets for these genomes?
Thanks for the update!
mycrofth |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"Will this mean that machines for this particular phase of the project will need to be faster/more powerful, or will the IBMers break the data down into more manageable chunks?"
Nope, we'll just break each protein into 10x the number of chunks (y'all tend to like the chunks to be less than one day of processing). |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"1) Is Swiss-Prot a set of proteins from the ?90? genomes that we have crunched which were omitted in the first pass? Or are they derived from other genomes which we have not crunched? 2) Have we really predicted protein folds from a total of 90 genomes, including the human genome as one? 3) We tried to fold all the unknown proteins expressed in the human genome that we could. Did we follow a similar strategy for the other genomes, or did we just work on subsets for these genomes? Thanks for the update! mycrofth"
1) They are from all complete genomes PLUS all the proteins from incomplete sequencing (i.e., one-protein-at-a-time type stuff). Short answer: the proteins from the complete genomes are a subset of the proteins contained in Swiss-Prot.
2) Yes, we've gone through a lot of sequences this last year, and we've folded over 150,000 protein domains covering a lot of these genomes (methods other than protein folding cover a lot of these genomes as well, and the proteins we haven't folded have function annotated another way, through sequence matches to other proteins).
3) Human went through first, and we paid careful attention to fold lots of versions of the predicted human proteins... but when we finished human, the remaining ~90 genomes got the same treatment as well... so equivalent treatment for the set of complete genomes.
[Edit 1 times, last edit by Former Member at Jan 18, 2006 1:30:21 AM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Searching the web, here is what I find for Swiss-Prot:
The Swiss-Prot database of proteins: http://www.ebi.ac.uk/swissprot/
# Swiss-Prot was created by Amos Bairoch in 1986 at the Department of Medical Biochemistry of the University of Geneva and has been a collaborative effort of the Department and the European Molecular Biology Laboratory (EMBL) since 1987. UniProtKB/Swiss-Prot is now an equal partnership between the EMBL and the Swiss Institute of Bioinformatics (SIB), to which Bairoch's team is also affiliated, with Bairoch retaining ultimate responsibility for the scientific content and format of Swiss-Prot. The EMBL activities are carried out by its Hinxton Outstation, the European Bioinformatics Institute (EBI).
# UniProtKB/Swiss-Prot is a curated, added-value database, not a repository of primary information.
# UniProtKB/Swiss-Prot's curation team adds detailed annotation and organisation to protein sequences, the overwhelming majority of which come from translations from the public nucleotide sequence databases. The value of UniProtKB/Swiss-Prot to the academic and commercial research community is widely accepted. It is the gold standard for scientific databases and must be rendered secure.
# The UniProtKB/Swiss-Prot team draws heavily on the support of experts throughout the world, and the contribution of those experts is appreciated and will always be acknowledged.
Searching the web for 'secreted protein', here is what I find:
secreted proteins: Encoded (usually) by genes with signal sequences, and such proteins include potential therapeutic proteins such as hormones, cytokines, and growth factors. [CHI Functional Genomics report]
(So the insulin I take is a secreted protein.)
Secreted Protein Database: http://spd.cbi.pku.edu.cn/
Lecture 15 [2000] Secreted and Membrane Proteins: http://www.lclark.edu/~reiness/cellbio/lectures/lect15.htm
Diagram - Synthesis of Secreted Proteins: http://www.people.virginia.edu/~rjh9u/secrprotsyn.html
And here is an example of a pharmaceutical company (Inpharmatica) that specializes in secreted proteins: http://www.inpharmatica.com/secreted.htm |
||
|
|
depriens
Senior Cruncher The Netherlands Joined: Jul 29, 2005 Post Count: 350 Status: Offline Project Badges:
|
Thanks for the update!
----------------------------------------
Looking forward to the HPF2 project... |
||
|
|
retsof
Former Community Advisor USA Joined: Jul 31, 2005 Post Count: 6824 Status: Offline Project Badges:
|
"Will this mean that machines for this particular phase of the project will need to be faster/more powerful, or will the IBMers break the data down into more manageable chunks?"
Well, yes. My slow machine can work on this, as the other project, FightAIDS@Home, is limited to 550 MHz machines and faster. On the other hand, the HPF chunks have been especially nasty lately.
"Nope, we'll just break each protein into 10X # of chunks (y'all tend to like the chunks to be less than one day in processing)"
Pentium III/500 MHz, processor score 50
02/14/2006 0:008:08:43:21 3,303 1
8 days, 8 hours, 43 minutes, 21 seconds (over 200 hours), 3,303 points, 1 result
The next workunit is 3% complete after 10 hours. I think the timeout is still set to two weeks, which could be a concern for slow computers and giant workunits.
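For anyone who wants to check the arithmetic on those numbers, here is a minimal Python sketch. It assumes the workunit progresses linearly, that the machine crunches around the clock, and that the deadline really is exactly 14 days; none of that is confirmed in this thread, and the function and variable names are made up for illustration.

```python
from datetime import timedelta


def projected_total_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Extrapolate total runtime, assuming progress is linear (an assumption)."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


# Completed workunit reported in the post: 8 days, 8 hours, 43 minutes, 21 seconds.
finished = timedelta(days=8, hours=8, minutes=43, seconds=21)
finished_hours = finished.total_seconds() / 3600

# Next workunit: 3% complete after 10 hours.
projected = projected_total_hours(elapsed_hours=10, fraction_done=0.03)

# Assumed deadline: two weeks, as guessed in the post (not confirmed).
deadline_hours = timedelta(weeks=2).total_seconds() / 3600

print(f"finished workunit: {finished_hours:.1f} h")
print(f"projected next workunit: {projected:.1f} h")
print(f"deadline: {deadline_hours:.0f} h, margin: {deadline_hours - projected:.1f} h")
```

Under those assumptions the projection comes to roughly 333 hours against a 336-hour deadline, so the workunit would only just make it, and only if the machine never stops crunching.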
SUPPORT ADVISOR
----------------------------------------Work+GPU i7 8700 12threads School i7 4770 8threads Default+GPU Ryzen 7 3700X 16threads Ryzen 7 3800X 16 threads Ryzen 9 3900X 24threads Home i7 3540M 4threads50% [Edit 2 times, last edit by retsof at Feb 14, 2006 7:41:16 PM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I had a protein run on an Athlon 64 3000+ that took 30 hours. This machine generally does at least 4 WUs a day.
|
||
|
|
|