World Community Grid - View Thread - Determining a new machine is reliable


Thread Status: Active
Total posts in this thread: 3


This topic has been viewed 1833 times and has 2 replies

jussr
Cruncher
Joined: Oct 16, 2017
Post Count: 6
Status: Offline
Project Badges:

180 day badge for Mapping Cancer Markers

1 year badge for Outsmart Ebola Together

45 day badge for FightAIDS@Home - Phase 2

90 day badge for Microbiome Immunity Project


Determining a new machine is reliable

Something I've been thinking about, though let me start by saying that I am admittedly making a number of assumptions here. If any or all of them are incorrect and render my suggestion moot, just ignore me!

So I've been running BOINC on one of my household computers for a couple weeks now, and I've noticed that the WUs I get for one project are all quorum/replication 1, assuming no previous errors. It also gets given WUs that have shorter deadlines, so if I'm not mistaken, that suggests computer #1 has been classed as reliable. Last night I installed BOINC on a second computer, and for the same project, the WUs it gets are all quorum/replication 2.

I'm assuming that the point of this is to verify that computer #2 is going to return reliable results before it starts being allowed to run through WUs solo, but if I'm right, I think it could be done more efficiently. Right now, computer #2 is waiting on whatever other systems out there were given the same WUs to check, and I think in the meantime all the new WUs it's issued are still being duplicated. And there's no way to determine how long it's going to be before those other systems return their own results and computer #2 passes its... probationary period, if you like.

Wouldn't it be possible, when someone wants to start using a new machine, for the system to check it against one of their old ones that's already considered trusted? If I could say that I wanted computer #1 to be given the duplicates of the WUs computer #2 is starting out with, I could guarantee that they'll be finished without delay. And the sooner computer #2 has enough results verified = the sooner its WUs can stop being duplicated = less wasted work across the system, no?

[Nov 1, 2017 12:17:51 PM]

SekeRob
Master Cruncher
Joined: Jan 7, 2013
Post Count: 2741
Status: Offline


Re: Determining a new machine is reliable

The verification security is in the random assignment. If it were known that copy 1 goes to a member's machine A and copy 2 to the same member's machine B, doors are opened to do ugly things.
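To make the point concrete, here is a minimal sketch of a quorum-2 check, not WCG's actual validator. The host records, the names `machine_a`, `machine_b` and `mallory`, and the rule that a random wingman never belongs to the same owner are all assumptions made for illustration. It counts how many bogus results would pass validation under each pairing rule.

```python
import random

def run_workunit(host, correct_answer):
    # Honest hosts compute the real answer; a cheating host returns junk.
    return "bogus" if host["cheats"] else correct_answer

def validates(result_a, result_b):
    # Quorum 2: the workunit validates only when both copies agree.
    return result_a == result_b

def pick_same_owner(rng, hosts, new_host):
    # The proposal: the wingman is another machine of the same member.
    return next(h for h in hosts
                if h["owner"] == new_host["owner"] and h is not new_host)

def pick_random_other_owner(rng, hosts, new_host):
    # Random assignment, assumed here never to land on the same owner.
    return rng.choice([h for h in hosts if h["owner"] != new_host["owner"]])

def bad_results_accepted(pick_wingman, hosts, new_host, trials=1000, seed=0):
    rng = random.Random(seed)
    accepted = 0
    for i in range(trials):
        answer = f"answer-{i}"
        wingman = pick_wingman(rng, hosts, new_host)
        a = run_workunit(new_host, answer)
        b = run_workunit(wingman, answer)
        # A bad result is accepted when both copies agree on a wrong answer.
        if validates(a, b) and a != answer:
            accepted += 1
    return accepted

# Hypothetical population: one member cheating on both machines, 50 honest hosts.
machine_a = {"owner": "mallory", "cheats": True}
machine_b = {"owner": "mallory", "cheats": True}
HOSTS = [machine_a, machine_b] + [
    {"owner": f"member{i}", "cheats": False} for i in range(50)
]
```

In this toy model, pairing `machine_b` with `machine_a` lets every bogus result through, while a random wingman from another owner catches all of them, because the two copies never agree.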

Yes, it's a long-known fact that needing, for instance, 10-20 valid results for a science before reliable kicks in means a new machine crunches many more than that before the wingmen have replied to any of the first N returned. If you have an issue with it, try rotation, since several sciences are already 'always single', like FAH2 and MIP1. Allow 20 to come of ZIKA, then 20 of OET1, then 20 of SCC1 and so on... crunch them and switch to the 'always quorum 1' sciences. When all have validated, open the floodgates.

Mind you, if you get into 8-12-16-32 threads in a machine, the strategy is pretty short-lived. Maybe WCG could teach the scheduler to only feed new machines the minimum needed to attain reliable status, but what if the member only wants to compute SCC1? There's no easy solution, so just live with it, and consider that other machines equally need verification, i.e. real smart scheduling: send quorum 2 work to new machines only (:>
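As a rough illustration of why more threads mean more duplicated work before reliable kicks in, here is a back-of-the-envelope sketch. It is a simplified model, not how the WCG scheduler counts: the function name, the fixed runtime per WU, and a single fixed wingman delay measured from the moment the new host returns its result are all assumptions.

```python
import math

def duplicated_tasks_before_reliable(threads, needed_valid, wu_hours,
                                     wingman_delay_hours):
    """Estimate how many quorum-2 tasks a new host starts before
    `needed_valid` of its own results have validated."""
    # The host finishes one batch of `threads` tasks every `wu_hours`.
    batches_needed = math.ceil(needed_valid / threads)
    # Time at which the host has returned its Nth result.
    returned_at = batches_needed * wu_hours
    # That result only validates once the wingman has reported too.
    reliable_at = returned_at + wingman_delay_hours
    # Every batch started before that moment is still sent at quorum 2.
    batches_started = math.ceil(reliable_at / wu_hours)
    return batches_started * threads

# Hypothetical numbers: 20 valid results needed, 4-hour WUs, wingmen 48 hours behind.
for threads in (4, 32):
    print(threads, duplicated_tasks_before_reliable(threads, 20, 4, 48))
```

With these made-up numbers a 4-thread machine starts 68 duplicated tasks and a 32-thread machine starts 416, against the 20 actually needed. With no wingman delay the 4-thread figure drops to exactly 20, which is the saving the original suggestion was after.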

Don't forget, there's also random re-verification at quorum 2. Not sure how many results are involved in this, one to a few.

[Nov 1, 2017 3:33:45 PM]

jussr
Cruncher
Joined: Oct 16, 2017
Post Count: 6
Status: Offline

Re: Determining a new machine is reliable

The verification security is in the random assignment. If it were known that copy 1 goes to a member's machine A and copy 2 to the same member's machine B, doors are opened to do ugly things.

That is a good, if depressing, point, and one I had not considered.

[Nov 2, 2017 6:27:48 PM]
