| World Community Grid Forums
|
Thread Status: Active Thread Type: Sticky Thread Total posts in this thread: 44
|
|
| Author |
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Thanks, we are excited to be part of the WCG.
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello,
Yes, the database will be freely available. The project compares predicted protein sequences, mostly from environmental metagenomic samples, contributing to annotation and to studies of metabolic pathways in micro-organisms. Comparing non-coding sequences (DNA) can be done within a restricted dataset, but serves other purposes. The totality of known DNA sequences is now far too large for such overall comparisons. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello Crystal,
SIMAP does indeed aim to map protein similarities, essentially from the public reference protein sequences and especially domains, resulting in a specific database/repository with very valuable research tools. Back in 2006/2007, the Genome Comparison project focussed on protein datasets from whole genomes and used the rigorous ssearch algorithm for enhanced statistical confidence in inter-genome distance calculations and other applications, while SIMAP ran a much faster (directional hit detection) BLAST implementation and in later years redid the calculations in a way similar to Genome Comparison. In the Uncovering Genome Mysteries project, we are concentrating much more on metagenomic sequences from environmental samples, which mostly come from as yet unknown organisms. The project involves a very large dataset and aims at the discovery of new enzymatic functions and unusual metabolic pathways, and also intends to shed more light on the ecological relationships and interactions between micro-organisms in specific niches. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hi gb,
We'll do our best to follow discussions and interact with the WCGrid contributors! |
||
|
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1403 Status: Offline Project Badges:
|
Thanks, Wim, for your insightful answer.
Since Sept/Oct 2009, SIMAP (we volunteers) has also calculated millions and millions of sequences from environmental genomes. You surely know or have even met Prof. Thomas Rattei (now at Vienna University) and are aware of the treasures you may find in the SIMAP database. "Lots of success with your UGM voyage of discovery." CP |
||
|
|
Antonius_Block
Cruncher Joined: Sep 10, 2011 Post Count: 3 Status: Offline Project Badges:
|
I hope you won't be looking for a "cure" for autism. We get enough of that insulting condescension from the anti-vaxxers and dumbass neurotypicals who call it a disease.
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I'm pretty sure there is no special intent to find a cure for autism, as nothing has been mentioned in that direction.
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Quote: I hope you won't be looking for a "cure" for autism. We get enough of that insulting condescension from the anti-vaxxers and dumbass neurotypicals who call it a disease.
So far, you are the only one being insulting. |
||
|
|
numbermaniac
Cruncher Australia Joined: Mar 28, 2014 Post Count: 46 Status: Offline Project Badges:
|
How many proteins are compared in each workunit? It seems to me about 18,000 but I'm just curious.
|
||
|
|
seippel
Former World Community Grid Tech Joined: Apr 16, 2009 Post Count: 392 Status: Offline Project Badges:
|
The answer is that the number of proteins in each work unit can vary widely. Each work unit consists of two files of proteins which are compared to each other (every protein in file A is compared to every protein in file B). Shorter proteins take less time to compare than longer proteins. The work unit generation program does some estimating to determine how many proteins should be in each file to achieve the targeted runtime. So if the proteins being compared are short, it will compensate by adding more proteins to the file (and vice versa). From a quick sampling, the largest number of proteins I saw in one of the two work unit files was 116k, but this was just a sampling.
Seippel |
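For readers who want to see the idea in code, here is a minimal sketch of how such sizing could work. It is not the actual work unit generator: the cost model (one comparison costs roughly length A times length B, as in dynamic-programming alignment), the `budget` unit, and the function names `pairwise_cost` and `fill_file_b` are all assumptions made for illustration.

```python
from typing import List


def pairwise_cost(lengths_a: List[int], lengths_b: List[int]) -> int:
    """Estimated cost of comparing every protein in A with every protein in B.

    Assumption: a single comparison costs len(a) * len(b) units, so the
    all-against-all total factors into sum(A) * sum(B).
    """
    return sum(lengths_a) * sum(lengths_b)


def fill_file_b(lengths_a: List[int], candidates: List[int], budget: int) -> List[int]:
    """Take proteins from `candidates` in order until the next one would
    push the estimated cost over `budget` (a hypothetical runtime target)."""
    total_a = sum(lengths_a)
    chosen: List[int] = []
    total_b = 0
    for length in candidates:
        if total_a * (total_b + length) > budget:
            break
        chosen.append(length)
        total_b += length
    return chosen


file_a = [100, 200, 300]  # protein lengths in file A, 600 residues in total
short = fill_file_b(file_a, [50] * 1000, budget=600_000)   # many short proteins
long_ = fill_file_b(file_a, [500] * 1000, budget=600_000)  # few long proteins
print(len(short), len(long_))  # prints: 20 2
```

With the same budget, file B ends up with 20 short proteins or only 2 long ones, which is the compensation described above and one reason the protein count per work unit can vary so widely.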
||
|
|
|