World Community Grid - View Thread - HPF update from NYU / ISB



Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


HPF update from NYU / ISB

A few updates and answers to some questions...

1. We have transferred the code for HPF2 to the IBMers (Viktors and co.), and they have started work on gridifying it. There was a question on the forum recently about which proteins will go on the grid. The answer is elsewhere as well, but in short: the second phase (HPF2) will take important proteins with interesting novel predictions from HPF1 and refine those predicted structures (using something we call all-atom mode) to a higher level of resolution and accuracy (HPF1 == fold resolution, a broad fold-function survey; HPF2 == higher resolution for more detailed conclusions). I’ll talk more about this next month.
In this second stage we will focus on malaria and human proteins that are predicted to be secreted. These proteins are central to our understanding of host-pathogen interaction. We will also go after cancer biomarkers (proteins that are over-expressed in tissues or tumors of cancer patients) identified as part of several ongoing studies at the ISB.

We are 100% done with the initial aims of the project.

2. So why are we still crunching? We’re continuing to crunch proteins from a set of proteins known as Swiss-Prot ... these are proteins from many parts of the tree of life. In general these proteins are well collated and represent a third phase of HPF 1 where we are filling in the gaps left by our initial folding of 90 complete genomes. In any case, we’re making the database more comprehensive with every new folding run, covering more of the vast sequence space that is nature (of which we’ve sequenced a tiny fraction).

3. Lars and Mike are working hard on the paper, which is now at the draft stage. We’ve run a few additional quality benchmarks that have been positive, and we’re finalizing all of the details that will go into the website.

4. The database will be opened up when we submit the 1st paper.

P.S. Check out my new website:

http://www.cs.nyu.edu/~bonneau/

as always, thanks for crunching...
more soon,

Richard Bonneau

----------------------------------------
[Edit 1 times, last edit by Former Member at Jan 17, 2006 9:06:02 PM]

[Jan 17, 2006 9:04:47 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: HPF update from NYU / ISB

P.S. Check out my new website:

http://www.cs.nyu.edu/~bonneau/

Cool

Just one question from me. The science bit is way over my head, but you mentioned that we will be looking at certain proteins at a much higher resolution. Will this mean that machines for this particular phase of the project will need to be faster or more powerful, or will the IBM'ers break the data down into more manageable chunks?

[Jan 17, 2006 10:47:10 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: HPF update from NYU / ISB

Looking for an excuse to get that Opty... haha... just buy the thing.

[Jan 17, 2006 11:24:43 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: HPF update from NYU / ISB

Hi Dr. Bonneau,

We’re continuing to crunch proteins from a set of proteins known as Swiss-Prot ... these are proteins from many parts of the tree of life. In general these proteins are well collated and represent a third phase of HPF 1 where we are filling in the gaps left by our initial folding of 90 complete genomes.

1) Is Swiss-Prot a set of proteins from the "90" genomes that we have crunched which were omitted in the first pass? Or are they derived from other genomes which we have not crunched?

2) Have we really predicted protein folds from a total of 90 genomes, including the human genome as one?

3) We tried to fold all the unknown proteins expressed in the human genome that we could. Did we follow a similar strategy for the other genomes, or did we just work on subsets for these genomes?

Thanks for the update!

mycrofth

[Jan 18, 2006 12:25:50 AM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: HPF update from NYU / ISB

Will this mean that machines for this particular phase of the project will need to be faster or more powerful, or will the IBM'ers break the data down into more manageable chunks?

Nope, we'll just break each protein into 10x the number of chunks (y'all tend to like the chunks to be less than one day of processing).
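
As a rough illustration of that idea (not the actual HPF2 or World Community Grid scheduling code, which this thread does not describe), here is a minimal Python sketch. It assumes a protein's work is a pool of independent runs that can be divided freely; the function names, the 200-hour job, and the 20-hour limit are all hypothetical.

```python
import math


def chunks_needed(total_hours: float, max_hours_per_chunk: float) -> int:
    """Smallest number of equal chunks that keeps each chunk at or under the limit."""
    return math.ceil(total_hours / max_hours_per_chunk)


def split_evenly(n_items: int, n_chunks: int) -> list[int]:
    """Spread n_items over n_chunks so that chunk sizes differ by at most one."""
    base, extra = divmod(n_items, n_chunks)
    # The first `extra` chunks each take one leftover item.
    return [base + 1 if i < extra else base for i in range(n_chunks)]


# Hypothetical numbers: a job estimated at 200 CPU hours, with a target of
# at most 20 hours per chunk (comfortably under one day).
n_chunks = chunks_needed(total_hours=200, max_hours_per_chunk=20)

# Hypothetical pool of 1,000 independent runs divided across those chunks.
sizes = split_evenly(n_items=1000, n_chunks=n_chunks)
print(n_chunks, sizes)
```

The point is only that a job needing ten times the compute can keep per-chunk run time roughly constant by cutting it into ten times as many pieces.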

[Jan 18, 2006 1:23:24 AM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: HPF update from NYU / ISB

mycrofth

1) They are from all complete genomes PLUS all the proteins from incomplete sequencing (i.e. one-protein-at-a-time type stuff). Short answer: the complete genomes are a subset of the proteins contained in Swiss-Prot.

2) Yes, we've gone through a lot of sequences this last year, and we've folded over 150,000 protein domains covering a lot of these genomes (methods other than protein folding cover a lot of these genomes as well, and the proteins we haven't folded have function annotated another way, through sequence matches to other proteins).

3) Human went through first, and we paid careful attention to fold lots of versions of the predicted human proteins... but when we finished human, the remaining ~90 genomes got the same treatment as well, so equivalent treatment for the set of complete genomes.

----------------------------------------
[Edit 1 times, last edit by Former Member at Jan 18, 2006 1:30:21 AM]

[Jan 18, 2006 1:28:57 AM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: HPF update from NYU / ISB

Searching the web, here is what I find for Swiss-Prot:

The Swiss-Prot database of proteins: http://www.ebi.ac.uk/swissprot/

# Swiss-Prot was created by Amos Bairoch in 1986 at the Department of Medical Biochemistry of the University of Geneva and has been a collaborative effort of the Department and the European Molecular Biology Laboratory (EMBL) since 1987. UniProtKB/Swiss-Prot is now an equal partnership between the EMBL and the Swiss Institute of Bioinformatics (SIB), to which Bairoch's team is also affiliated, with Bairoch retaining ultimate responsibility for the scientific content and format of Swiss-Prot. The EMBL activities are carried out by its Hinxton Outstation, the European Bioinformatics Institute (EBI).
# UniProtKB/Swiss-Prot is a curated, added-value database, not a repository of primary information.
# UniProtKB/Swiss-Prot's curation team adds detailed annotation and organisation to protein sequences, the overwhelming majority of which come from translations from the public nucleotide sequence databases. The value of UniProtKB/Swiss-Prot to the academic and commercial research community is widely accepted. It is the gold standard for scientific databases and must be rendered secure.
# The UniProtKB/Swiss-Prot team draws heavily on the support of experts throughout the world, and the contribution of those experts is appreciated and will always be acknowledged.

Searching the web for 'secreted protein', here is what I find:

secreted proteins: Encoded (usually) by genes with signal sequences, and such proteins include potential therapeutic proteins such as hormones, cytokines, and growth factors. [CHI Functional Genomics report]

(So the insulin I take is a secreted protein.)

Secreted Protein Database: http://spd.cbi.pku.edu.cn/

Lecture 15 [2000] Secreted and Membrane Proteins: http://www.lclark.edu/~reiness/cellbio/lectures/lect15.htm

Diagram - Synthesis of Secreted Proteins: http://www.people.virginia.edu/~rjh9u/secrprotsyn.html

And here is an example of a pharmaceutical company (Inpharmatica) that specializes in secreted proteins: http://www.inpharmatica.com/secreted.htm

[Jan 18, 2006 5:32:31 AM]

depriens
Senior Cruncher
The Netherlands
Joined: Jul 29, 2005
Post Count: 350
Status: Offline

Re: HPF update from NYU / ISB

Thanks for the update!
Looking forward to the HPF2 project...

----------------------------------------

[Jan 20, 2006 11:20:14 AM]

retsof
Former Community Advisor
USA
Joined: Jul 31, 2005
Post Count: 6824
Status: Offline

Re: HPF update from NYU / ISB

Well, yes. My slow machine can work on this, as the other project, FightAIDS@Home, is limited to machines of 550 MHz and faster. On the other hand, the HPF chunks have been especially nasty lately.

Pentium III/500 MHz, processor score 50

02/14/2006 0:008:08:43:21 3,303 1

8 days, 8 hours, 43 minutes, 21 seconds (over 200 hours)
3,303 points
1 result

The next workunit is 3% complete after 10 hours.

I think the timeout is still set to two weeks, which could be a concern for slow computers and giant workunits.
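
For anyone who wants to check the arithmetic, here is a small Python sketch. It assumes the run-time field in the statistics line reads years:days:hours:minutes:seconds, that progress on the current workunit is linear, and that the deadline really is 14 days (which is only my recollection above); the function names are made up for illustration.

```python
def parse_run_time(text: str) -> float:
    """Convert a 'y:ddd:hh:mm:ss' run-time string into hours (365-day years assumed)."""
    years, days, hours, minutes, seconds = (int(part) for part in text.split(":"))
    total_days = years * 365 + days
    return total_days * 24 + hours + minutes / 60 + seconds / 3600


def projected_total_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Linear extrapolation of total run time from the progress so far."""
    return elapsed_hours / fraction_done


# Run time of the finished result, taken from the statistics line above.
finished_hours = parse_run_time("0:008:08:43:21")

# Next workunit: 3% complete after 10 hours.
projected_hours = projected_total_hours(elapsed_hours=10, fraction_done=0.03)

# Assumed two-week timeout, expressed in hours.
deadline_hours = 14 * 24

print(f"finished result: {finished_hours:.1f} hours")
print(f"projected next:  {projected_hours:.1f} hours ({projected_hours / 24:.1f} days)")
print(f"margin vs. two-week deadline: {deadline_hours - projected_hours:.1f} hours")
```

Under those assumptions the next workunit would finish only a few hours inside the two-week limit, and only if the machine crunches around the clock.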

----------------------------------------

SUPPORT ADVISOR
Work+GPU i7 8700 12threads
School i7 4770 8threads
Default+GPU Ryzen 7 3700X 16threads
Ryzen 7 3800X 16 threads
Ryzen 9 3900X 24threads
Home i7 3540M 4threads50%

----------------------------------------
[Edit 2 times, last edit by retsof at Feb 14, 2006 7:41:16 PM]

[Feb 14, 2006 7:39:51 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: HPF update from NYU / ISB

I had a protein run on an Athlon64 3000+ which took 30 hours. This machine generally does at least 4 WUs a day.

[Feb 14, 2006 11:21:02 PM]
