World Community Grid Forums
Category: Support Forum: Suggestions / Feedback Thread: Determining a new machine is reliable |
Thread Status: Active Total posts in this thread: 3
jussr
Cruncher Joined: Oct 16, 2017 Post Count: 6 Status: Offline
Something I've been thinking about, though let me start by saying that I am admittedly making a number of assumptions here. If any or all of them are incorrect and render my suggestion moot, just ignore me!
So I've been running BOINC on one of my household computers for a couple of weeks now, and I've noticed that the WUs I get for one project are all quorum/replication 1, assuming no previous errors. It is also given WUs with shorter deadlines, so if I'm not mistaken, that suggests computer #1 has been classed as reliable. Last night I installed BOINC on a second computer, and for the same project, the WUs it gets are all quorum/replication 2.

I'm assuming the point of this is to verify that computer #2 returns reliable results before it is allowed to run through WUs solo, but if I'm right, I think it could be done more efficiently. Right now, computer #2 is waiting on whatever other systems out there were given the same WUs to check, and I think in the meantime all the new WUs it's issued are still being duplicated. There's also no way to tell how long it's going to be before those other systems return their own results and computer #2 passes its... probationary period, if you like.

Wouldn't it be possible, when someone wants to start using a new machine, for the system to check it against one of their old ones that's already considered trusted? If I could say that I wanted computer #1 to be given the duplicates of the WUs computer #2 is starting out with, I could guarantee that they'll be finished without delay. And the sooner computer #2 has enough results verified = the sooner its WUs can stop being duplicated = less wasted work across the system, no?
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline
The verification security is in the random assignment. If it were known that copy 1 goes to a member's machine A and copy 2 to the same member's machine B, doors are opened to do ugly things.

Yes, it's a long-known fact that when, for instance, 10-20 valid results are needed for a science before reliable kicks in, a machine ends up crunching many more than that before the wingmen have replied to any of the first N returned.

If you have an issue with it, try rotation, since several sciences, like FAH2 and MIP1, are already 'always single'. Allow 20 to come in from ZIKA, then 20 from OET1, then 20 from SCC1 and so on... crunch them and switch to the 'always quorum 1' sciences. When all have validated, open the floodgates. Mind you, if you get into 8-12-16-32 threads in a machine, the strategy is pretty short-lived.

Maybe WCG could teach the scheduler to only feed the minimum needed to attain reliable for new machines, but what if the member only wants to compute SCC1? There's no easy solution, so just live with it and think that other machines equally need verification, i.e. real smart scheduling: only send quorum 2 to new machines (:>

Don't forget, there's random re-verification at quorum 2. Not sure how many are involved in this, 1-few.
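To put rough numbers on the "needs many more" point, here is a back-of-the-envelope sketch in Python. It is not how the WCG scheduler works internally; the function name, the fixed daily batch size, and the fixed wingman delay are all assumptions made purely for illustration.

```python
def quorum2_tasks_before_reliable(threshold, tasks_per_day, wingman_delay_days):
    """Count the quorum-2 tasks a new host is sent before `threshold`
    of its results have validated.

    Assumptions (hypothetical, for illustration only):
    - the host receives and returns `tasks_per_day` tasks every day;
    - every wingman replies exactly `wingman_delay_days` days later,
      so a batch returned on day d validates at the end of day
      d + wingman_delay_days;
    - all returned results turn out valid.
    """
    if tasks_per_day <= 0:
        raise ValueError("tasks_per_day must be positive")

    sent = 0
    day = 0
    while True:
        # Today's batch is still issued at quorum 2.
        sent += tasks_per_day
        # Batches from days 0 .. (day - wingman_delay_days) have validated.
        validated = tasks_per_day * max(0, day - wingman_delay_days + 1)
        if validated >= threshold:
            return sent
        day += 1


# 8 threads, 10 valid results needed, wingmen answering after 3 days:
print(quorum2_tasks_before_reliable(10, 8, 3))
```

Under these assumptions the host has been sent 40 duplicated tasks by the time 10 have validated, which is why capping each science at about 20 tasks and then switching to the 'always quorum 1' sciences limits the duplicated work.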
jussr
Cruncher Joined: Oct 16, 2017 Post Count: 6 Status: Offline
"The verification security is in the random assignment. If it were known that copy 1 goes member's machine A and copy 2 to same member's machine B, doors are opened to do ugly things."

That is a good, if depressing, point, and one I had not considered.