Thread Status: Active
Total posts in this thread: 5
This topic has been viewed 2065 times and has 4 replies
trongnguyen_82
Cruncher
Joined: Aug 10, 2006
Post Count: 10
Status: Offline
Predict and hypothetical protein

I just have two questions, asked out of curiosity:

1. So far, many comparisons in the GC project have included a predicted protein, a hypothetical protein, or both. Because they are only "predicted" or "hypothetical", they're not real. Is there a high chance that we're wasting our crunching time on a protein that is totally different from the real one (and therefore useless)?

2. As a rough estimate, how many proteins do we know so far (proteins that are not in the predicted/hypothetical class), compared with the total number of proteins in Mother Nature?

Thank you.
[Dec 11, 2006 9:48:50 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Predict and hypothetical protein

Hopefully one of the project scientists can give a better answer.

However, from my understanding of this, Genome Comparison is using information from the genomes of different species, rather than just known proteins. The way proteins are expressed on the genome is not as simple as the DNA mechanism might make it appear. I understand this will let us learn more about many proteins that haven't been studied in detail yet, and make the annotation database complete in this respect.
[Dec 11, 2006 10:31:17 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Predict and hypothetical protein

Dear Colleagues:

Only about 5% of the proteins have actually been studied experimentally. In a typical bacterial genome, about 30% of the proteins are labeled hypothetical, putative, probable, "similar to", and so on. The terminology used depends on the way the annotations were added to the genome. These 30% normally have no attributed function whatsoever.

But this means that about 65% of the proteins have had their functions inferred from previously annotated proteins. Therein lies the basic principle behind the comparison of biological sequences: the more similar two sequences are (either nucleotide or protein), the more probable it is that they share the same function. Obviously, many biological factors can complicate this, which is one of the main points of the GC project: to check and develop criteria for defining protein families and for reannotation.
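To make the idea of "how similar two sequences are" concrete, here is a minimal sketch of percent identity between two sequences that have already been aligned. This is only an illustration, not the GC project's actual comparison method; the function name `percent_identity` and the use of `-` as the gap character are assumptions made for the example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned positions where both sequences carry the same residue.

    Assumes the two strings are already aligned (equal length) and that
    '-' marks a gap. Positions where either sequence has a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap: nothing to compare at this position
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Two short made-up protein fragments, differing at one position.
    print(percent_identity("MKTAYIAK", "MKTAYLAK"))
```

A higher score makes a shared function more plausible, but as noted above it is not proof; real pipelines also weigh alignment length, conservative substitutions, and statistical significance.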

As for the other 30%, the fact that they are hypothetical or putative does not mean that they are not real, although this can happen. Indeed, it is possible that a putative protein has a counterpart in one or more other genomes. This means that, although no information is available about that particular protein, it is very likely that it is a real protein, since it is present in more than one genome. In this case, we usually say that it is a conserved hypothetical protein. On the other hand, sometimes you can find an "orphan" protein, that is, a protein without any counterparts. In this case, the protein is usually tagged as an unknown protein.

There are several documented cases of organism-specific proteins. In fact, one cannot rule out the real existence of a certain orphan protein based only on its presence or absence among different genomes. We have to remember that the genetic code forces the DNA sequence of a protein-coding region to be structured in a certain way, and these biases relative to a random sequence can be measured and quantified. So it is possible to have almost absolute certainty that an orphan protein is indeed a real gene, even though there are no counterparts in other genomes and no attributed function.
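One very simple example of such a bias is the absence of stop codons. The sketch below is an illustration under stated assumptions, not the method used by the project: it assumes the standard genetic code (stop codons TAA, TAG, TGA) and a random DNA model in which all four bases are equally likely, so that 61 of the 64 possible codons are non-stop. The function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def codons(seq: str, frame: int = 0) -> list[str]:
    """Split a DNA sequence into complete codons, starting at the given frame."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def longest_open_run(seq: str, frame: int = 0) -> int:
    """Length (in codons) of the longest stretch without a stop codon."""
    best = current = 0
    for codon in codons(seq, frame):
        if codon in STOP_CODONS:
            current = 0
        else:
            current += 1
            best = max(best, current)
    return best


def stop_free_probability(n_codons: int) -> float:
    """Chance that n_codons consecutive random codons contain no stop codon.

    Assumes uniformly random bases, so each codon is a non-stop
    with probability 61/64.
    """
    return (61 / 64) ** n_codons


if __name__ == "__main__":
    n = 300  # roughly the size of an average bacterial protein, in codons
    print(f"P(no stop in {n} random codons) = {stop_free_probability(n):.2e}")
```

Under this model, a stop-free stretch of a few hundred codons is very unlikely to arise by chance, which is one reason a long open reading frame counts as evidence for a real gene even when it has no counterpart elsewhere. Real gene finders combine this with further signals, such as codon usage and base composition.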

Cheers,

Antonio
[Dec 11, 2006 4:51:49 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Predict and hypothetical protein

Thanks, amazing stuff!
[Dec 12, 2006 6:57:45 AM]
trongnguyen_82
Cruncher
Joined: Aug 10, 2006
Post Count: 10
Status: Offline
Re: Predict and hypothetical protein

Thank you for the reply.
[Dec 12, 2006 11:50:54 PM]