Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
HPF update from NYU / ISB



a few updates and answers to some questions...

1. We have transferred the code for HPF2 to the IBMers (Viktors and co.), and they have started to work on gridifying this code. There was a question on the forum recently about which proteins will go on the grid. The answer is elsewhere as well, but in short: the second phase (HPF2) will take important proteins with interesting novel predictions from HPF1 and refine those predicted structures (with something we call all-atom mode) to a higher level of resolution/accuracy (HPF1 == fold resolution, broad fold-function survey; HPF2 == higher res for more detailed conclusions). I’ll talk more about this next month.
We will focus in this second stage on malaria and human proteins that are predicted to be secreted proteins. These proteins are central to our understanding of Host-Pathogen interaction. We will also go after cancer biomarkers (proteins that are over-expressed in tissues or tumors of cancer patients) identified as part of several ongoing studies at the ISB.

we are 100% done with the initial aims of the project.

2. So why are we still crunching? We’re continuing to crunch proteins from a set of proteins known as Swiss-Prot ... these are proteins from many parts of the tree of life. In general these proteins are well collated and represent a third phase of HPF1, where we are filling in the gaps left by our initial folding of 90 complete genomes. In any case, we’re making the database more comprehensive with every new folding run ... covering more of the vast sequence space that is nature (of which we’ve sequenced a tiny fraction).

3. Lars and Mike are working hard on the paper, and we now have a draft. We’ve run a few additional quality benchmarks that have been positive, and we’re finalizing all of the details that will go into the website.

4. The database will be opened up when we submit the 1st paper.



ps. check out my new website:
http://www.cs.nyu.edu/~bonneau/

as always, thanks for crunching...
more soon,

Richard Bonneau

----------------------------------------
[Edit 1 times, last edit by Former Member at Jan 17, 2006 9:06:02 PM]
[Jan 17, 2006 9:04:47 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: HPF update from NYU / ISB

ps. check out my new website:
http://www.cs.nyu.edu/~bonneau/


Cool

Just one question from me. The science bit is way over my head, but you mentioned that we will be looking at certain proteins at a much higher resolution. Will this mean that machines for this particular phase of the project will need to be faster/more powerful, or will the IBM'ers break the data down into more manageable chunks?
[Jan 17, 2006 10:47:10 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: HPF update from NYU / ISB

Looking for an excuse to get that Opty... haha... just buy the thing
[Jan 17, 2006 11:24:43 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: HPF update from NYU / ISB

Hi Dr. Bonneau,
We’re continuing to crunch proteins from a set of proteins known as Swiss-Prot ... these are proteins from many parts of the tree of life. In general these proteins are well collated and represent a third phase of HPF 1 where we are filling in the gaps left by our initial folding of 90 complete genomes.

1) Is Swiss-Prot a set of proteins from the ?90? genomes that we have crunched which were omitted in the first pass? Or are they derived from other genomes which we have not crunched?

2) Have we really predicted protein folds from a total of 90 genomes, including the human genome as one?

3) We tried to fold all the unknown proteins expressed in the human genome that we could. Did we follow a similar strategy for the other genomes, or did we just work on subsets for these genomes?

Thanks for the update!
mycrofth
[Jan 18, 2006 12:25:50 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: HPF update from NYU / ISB

Will this mean that machines for this particular phase of the project will need to be faster/more powerful, or will the IBM'ers break the data down into more manageable chunks?

Nope, we'll just break each protein into 10X the number of chunks (y'all tend to like the chunks to be less than one day of processing).
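
To make the arithmetic behind that concrete, here is a rough sketch in Python. The numbers are made up for illustration, and it assumes the higher-resolution run costs roughly 10 times the total compute per protein, which is a guess at the reasoning and not a figure from the project. Under that assumption, 10 times as many chunks leaves the per-chunk time unchanged.

```python
def per_chunk_hours(total_cpu_hours, n_chunks):
    """Average CPU hours per chunk when the work is split evenly."""
    if n_chunks <= 0:
        raise ValueError("n_chunks must be positive")
    return total_cpu_hours / n_chunks


# Hypothetical HPF1 figures for one protein (not project data).
hpf1_total_hours = 80.0
hpf1_chunks = 10

# Assumption: HPF2 needs about 10x the compute per protein,
# and each protein is split into 10x as many chunks.
factor = 10
hpf2_total_hours = hpf1_total_hours * factor
hpf2_chunks = hpf1_chunks * factor

print(per_chunk_hours(hpf1_total_hours, hpf1_chunks))
print(per_chunk_hours(hpf2_total_hours, hpf2_chunks))
```

Both calls give the same per-chunk time, so a machine that could finish an HPF1 chunk in under a day would, on these assumed numbers, do the same for an HPF2 chunk.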
[Jan 18, 2006 1:23:24 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: HPF update from NYU / ISB


1) Is Swiss-Prot a set of proteins from the ?90? genomes that we have crunched which were omitted in the first pass? Or are they derived from other genomes which we have not crunched?

2) Have we really predicted protein folds from a total of 90 genomes, including the human genome as one?

3) We tried to fold all the unknown proteins expressed in the human genome that we could. Did we follow a similar strategy for the other genomes, or did we just work on subsets for these genomes?

Thanks for the update!
mycrofth


1) They are from all complete genomes PLUS all the proteins from incomplete sequencing (i.e. one-protein-at-a-time type stuff). Short answer: the proteins from the complete genomes are a subset of the proteins contained in Swiss-Prot.

2) Yes, we've gone through a lot of sequences this last year, and we've folded over 150,000 protein domains covering a lot of these genomes. (Methods other than protein folding cover a lot of these genomes as well, and the proteins we haven't folded have function annotated another way, through sequence matches to other proteins.)

3) Human went through first, and we paid careful attention to fold lots of versions of the predicted human proteins ... but when we finished human, the remaining ~90 genomes got the same treatment as well ... so equivalent treatment for the set of complete genomes.
----------------------------------------
[Edit 1 times, last edit by Former Member at Jan 18, 2006 1:30:21 AM]
[Jan 18, 2006 1:28:57 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: HPF update from NYU / ISB

Searching the web, here is what I find for Swiss-Prot:

The Swiss-Prot database of proteins: http://www.ebi.ac.uk/swissprot/
# Swiss-Prot was created by Amos Bairoch in 1986 at the Department of Medical Biochemistry of the University of Geneva and has been a collaborative effort of the Department and the European Molecular Biology Laboratory (EMBL) since 1987. UniProtKB/Swiss-Prot is now an equal partnership between the EMBL and the Swiss Institute of Bioinformatics (SIB), to which Bairoch's team is also affiliated, with Bairoch retaining ultimate responsibility for the scientific content and format of Swiss-Prot. The EMBL activities are carried out by its Hinxton Outstation, the European Bioinformatics Institute (EBI).
# UniProtKB/Swiss-Prot is a curated, added-value database, not a repository of primary information.
# UniProtKB/Swiss-Prot's curation team adds detailed annotation and organisation to protein sequences, the overwhelming majority of which come from translations from the public nucleotide sequence databases. The value of UniProtKB/Swiss-Prot to the academic and commercial research community is widely accepted. It is the gold standard for scientific databases and must be rendered secure.
# The UniProtKB/Swiss-Prot team draws heavily on the support of experts throughout the world, and the contribution of those experts is appreciated and will always be acknowledged.


Searching the web for 'secreted protein', here is what I find:

secreted proteins: Encoded (usually) by genes with signal sequences, and such proteins include potential therapeutic proteins such as hormones, cytokines, and growth factors. [CHI Functional Genomics report]

(So the insulin I take is a secreted protein.)

Secreted Protein Database: http://spd.cbi.pku.edu.cn/

Lecture 15 [2000] Secreted and Membrane Proteins: http://www.lclark.edu/~reiness/cellbio/lectures/lect15.htm

Diagram - Synthesis of Secreted Proteins: http://www.people.virginia.edu/~rjh9u/secrprotsyn.html

And here is an example of a pharmaceutical company (Inpharmatica) that specializes in secreted proteins: http://www.inpharmatica.com/secreted.htm
[Jan 18, 2006 5:32:31 AM]
depriens
Senior Cruncher
The Netherlands
Joined: Jul 29, 2005
Post Count: 350
Re: HPF update from NYU / ISB

Thanks for the update!
Looking forward to the HPF2 project...
[Jan 20, 2006 11:20:14 AM]
retsof
Former Community Advisor
USA
Joined: Jul 31, 2005
Post Count: 6824
Re: HPF update from NYU / ISB

Will this mean that machines for this particular phase of the project will need to be faster/more powerful, or will the IBM'ers break the data down into more manageable chunks?

Nope, we'll just break each protein into 10X the number of chunks (y'all tend to like the chunks to be less than one day of processing).
Well, yes. My slow machine can work on this, whereas the other project, FightAIDS@Home, is limited to machines of 550 MHz and faster. On the other hand, the HPF chunks have been especially nasty lately.

Pentium III/500 MHz, processor score 50

02/14/2006 0:008:08:43:21 3,303 1

8 days, 8 hours, 43 minutes, 21 seconds (over 200 hours)
3,303 points
1 result

The next workunit is 3% complete after 10 hours.

I think the timeout is still set to two weeks, which could be a concern for slow computers and giant workunits.
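
For anyone who wants to check the numbers above, here is a small Python sketch. It assumes the run time field is laid out as years:days:hours:minutes:seconds (which matches the 8 days, 8 hours reading above), that progress on the next workunit stays linear (it may well not), and that the deadline really is two weeks. The function name is made up for this example.

```python
def parse_runtime(text):
    """Convert a 'y:ddd:hh:mm:ss' run time string to seconds (layout assumed)."""
    years, days, hours, minutes, seconds = (int(part) for part in text.split(":"))
    total_days = years * 365 + days
    return ((total_days * 24 + hours) * 60 + minutes) * 60 + seconds


# Figures from the statistics line above.
runtime_hours = parse_runtime("0:008:08:43:21") / 3600
points_per_hour = 3303 / runtime_hours

# Next workunit: 3% complete after 10 hours, projected linearly (an assumption).
projected_hours = 10 / 0.03
deadline_hours = 14 * 24  # assumed two-week timeout

print(f"{runtime_hours:.1f} hours, {points_per_hour:.1f} points per hour")
print(f"projected {projected_hours:.0f} hours against a {deadline_hours} hour deadline")
```

On these assumptions the finished workunit took about 200.7 hours, and the next one would need roughly 333 hours of crunching, which leaves almost no margin under a 336-hour deadline even if the machine runs around the clock.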
----------------------------------------
[Edit 2 times, last edit by retsof at Feb 14, 2006 7:41:16 PM]
[Feb 14, 2006 7:39:51 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: HPF update from NYU / ISB

I had a protein run on an Athlon 64 3000+ which took 30 hours. This machine generally does at least 4 WUs a day.
[Feb 14, 2006 11:21:02 PM]