World Community Grid Forums
Thread Status: Active | Total posts in this thread: 16

Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline

I've explained the reasoning in http://boinc.bakerlab.org/rosetta/forum_thread.php?id=1392#13753

Basically, when using Rosetta for ab initio structure prediction, we're looking for a needle in a haystack. Probably 99% of the runs will produce models (predicted protein structures) which have higher energies and are thus discarded. The few runs producing low-energy models can be verified in-house. Using such high quorum and redundancy values for HPF is not justified, IMHO.

PS: Not to mention that WCG didn't compress either downloads or uploads when run via BOINC.
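
(A minimal sketch of the post-processing being described above: keep only the rare low-energy models and pass those on for in-house verification. The Model type, the field names, and the 1% cutoff are illustrative assumptions, not anything taken from actual Rosetta or HPF code.)

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Model:
        run_id: str     # which work unit produced this predicted structure
        energy: float   # Rosetta-style score; lower is assumed to be better

    def select_for_verification(models: List[Model],
                                keep_fraction: float = 0.01) -> List[Model]:
        """Keep only the lowest-energy fraction of models for in-house checking.

        The remaining ~99% of higher-energy models are simply discarded, which
        is the basis of the argument that an occasional bad result costs little.
        """
        ranked = sorted(models, key=lambda m: m.energy)
        keep = max(1, int(len(ranked) * keep_fraction))
        return ranked[:keep]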

Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline

You failed to make your point there, too :-)

Any error could potentially be important. The fact that most of the results are irrelevant doesn't change that. Good science is worth infinitely more than bad science. Let's do this thing properly.

Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline

"You failed to make your point there, too :-) Any error could potentially be important. The fact that most of the results are irrelevant doesn't change that."

I guess everyone is entitled to an opinion, but the fact that R@H works with replication=1, quorum=1 tells another story.

Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline

One of the basic tenets of science is that an experiment has to be reproducible, so repeating the run and getting the same result is very important. I doubt fewer than 3 work units will ever happen: if you only do 2 and they disagree, how do you know which one is right? Because we are using computers from all over the world with different setups, some of which may be dodgy, it is reasonable to insist on multiple copies of each job. If it were being done on an in-house supercomputer where in-house tech experts monitor everything, then maybe the quorum method would be unnecessary. IMHO I tend to agree with you, but the science has to be sound and has to be trusted by scientists, and by nature they are difficult to convince (which is a good thing). If we can do this and get fabulous results with quorum 3, then who knows, maybe there will be some other way to validate results. One small error could be a disaster: what if some dodgy computer failed on a job and missed "the only possible cure for disease X"? All the millions of years of crunching would be wasted because of one stuffed-up result. :(

I know I started this thread and am keen for improvements in efficiency too, but they have already gone from quorum 4 to quorum 3, which is a huge improvement in efficiency, and only after the crunchers proved what a good job they were doing. Keep up the good work. Cheers

Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline

Rosetta@Home has a completely different problem from HPF: one is developing the software, the other is using it. I understand R@H does a LOT of validation by re-running units on their Robetta server. That isn't a practical option for HPF; the goal here is to avoid using large amounts of supercomputer time.

If the error rate were negligible, we could go to a quorum of 2. The fact that we haven't done this is a clear indication that the error rate isn't negligible. Every time you see a work unit status of "inconclusive", you know someone has returned an invalid result.
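
(A minimal sketch of the quorum mechanism referred to above, not WCG's actual validator: the returned results for a work unit are compared, and the unit is only marked valid once enough of them agree; anything short of that shows up as "inconclusive" and another copy goes out. The fingerprint strings and the way min_quorum is applied here are assumptions for illustration.)

    from collections import Counter
    from typing import Hashable, List, Optional, Tuple

    def check_quorum(results: List[Hashable],
                     min_quorum: int = 3) -> Tuple[str, Optional[Hashable]]:
        """Mark a work unit 'valid' if at least `min_quorum` results agree.

        `results` are canonical fingerprints of each returned result (e.g. a
        hash of the model data). Anything short of agreement is reported as
        'inconclusive', which in BOINC terms means another copy is sent out.
        """
        if not results:
            return ("inconclusive", None)
        fingerprint, count = Counter(results).most_common(1)[0]
        if count >= min_quorum:
            return ("valid", fingerprint)
        return ("inconclusive", None)

    # One bad result among three forces a resend; three matching results validate.
    print(check_quorum(["abc123", "abc123", "zzz999"]))   # ('inconclusive', None)
    print(check_quorum(["abc123", "abc123", "abc123"]))   # ('valid', 'abc123')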

Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline

Hello dhatz,

"I guess everyone is entitled to an opinion, but the fact that R@H works with replication=1, quorum=1, tells another story."

I thought about mentioning Rosetta@home, but decided that would be unnecessary. I was wrong. So here goes...

Rosetta@home is testing new algorithms on proteins of known structure. Their main interest is getting a predicted structure as close to the known structure as possible. So they run with no duplication, then pick the best prediction and verify it by rerunning it on their own computers. They are no more trusting than anybody else; every result that they care about is validated this way, which amounts to 100% duplication.

We are folding proteins without a known structure, so we have to validate every prediction, not just the best prediction. We cannot rerun everything on known good computers. Right now we need 3 identical results for validation. In theory, we could accept 2 identical results, sending out another 2 copies only if the first 2 disagreed. (This would effectively be the same as Rosetta@home, except that we would care about every result, not just the best result.) But there is no way we can get below 100% duplication.

I have no knowledge of just what our error rates are, but I am glad that we have dropped from 4 to 3, since I have always considered 3 to be a 'gold standard' for validation.

Lawrence
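
(A rough sketch of the "accept 2 identical results, send out 2 more only if they disagree" idea, just to make the scheduling concrete. run_copy, flaky_copy, the batch size and the 5% error figure are hypothetical stand-ins, not how WCG actually schedules work.)

    import random
    from typing import Callable, Hashable, Optional, Tuple

    def adaptive_validate(run_copy: Callable[[], Hashable],
                          batch_size: int = 2,
                          max_copies: int = 6) -> Tuple[bool, Optional[Hashable]]:
        """Issue copies of a work unit in batches; accept as soon as two agree.

        `run_copy` stands in for sending one redundant copy out and getting
        back a canonical fingerprint of its result. If a batch produces no
        pair of matching fingerprints, another batch is issued, up to
        `max_copies` copies in total.
        """
        counts = {}
        copies_issued = 0
        while copies_issued < max_copies:
            for _ in range(batch_size):
                fp = run_copy()
                copies_issued += 1
                counts[fp] = counts.get(fp, 0) + 1
                if counts[fp] >= 2:        # two identical results: validated
                    return True, fp
        return False, None

    # Demo: a hypothetical 5% chance that any given copy comes back corrupted.
    def flaky_copy() -> str:
        return "good" if random.random() > 0.05 else "bad-%f" % random.random()

    print(adaptive_validate(flaky_copy))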