Total posts in this thread: 30
This topic has been viewed 3123 times and has 29 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
to dump or not to dump "redundant work unit/s ?

At times I see that a quorum of 3 results has been returned before one of my slower machines has had a chance to report. Then I feel I am wasting time, because the work unit is already validated, so sometimes I have "dumped" the work unit: I lose points but get to start another, hopefully useful, work unit. Perhaps this is not for newbies, but maybe after your first few hundred thousand points you might like to accelerate the project rather than just collect points?

What do others think? Cheers
[Apr 24, 2006 12:47:27 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: to dump or not to dump "redundant work unit/s ?

According to the BOINC Wiki there are 3 classes of crunchers. Personally, I think it depends on where it sits in relation to the profile you have settled on :D. For myself, as a newbie on BOINC, I've been checking around a bit to optimise for my day-to-day environment. Now that I'm happy, I'll let it do what it does best: work in silence, and I'll check back on the total WCG project and team performance once in a while.

For those who also use BOINC to process at Rosetta@home: WCG does that too, called Protein folding, and it is much more efficient on BOINC. Plus you can contribute to a single team, which will get the points for both the HPF and FAAH WUs.

ciao

The former #1 Mukka Pazzo (for DPC)
[Apr 24, 2006 1:44:33 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: to dump or not to dump "redundant work unit/s ?

At times I see that a quorum of 3 results has been returned before one of my slower machines has had a chance to report. Then I feel I am wasting time, because the work unit is already validated, so sometimes I have "dumped" the work unit


Not sure but wouldn't the WU you dump be sent out to somebody else? If so then the net result is that 2 crunchers waste time on it rather than just 1 cruncher.

--
[Apr 24, 2006 2:19:48 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: to dump or not to dump "redundant work unit/s ?

I believe that Viktors keeps track of our error rates of various types. I do not know just how many permutations we have gone through, but by November 2005 we were using the Quorum of 4 method but sending out 5 copies of each work unit. More recently, we have switched to the Quorum of 3 method, and are sending out 4 copies of each work unit. I doubt that we shall ever use less than 3 copies to validate our results. Assuming that we never have to send additional copies (admittedly an unrealistic assumption) switching from 5 copies to 4 copies sped up our throughput by 25%. If we could stop sending out the fourth copy, it would speed up our throughput by 33%. But that additional copy really speeds up the validation process, which speeds up the feedback of results to the project scientists. So there is a lot to take into account when deciding what is best.
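The 25% and 33% figures can be checked with a short Python sketch. It rests on the same simplifying assumption stated in the paragraph above (no additional copies are ever resent), so throughput is taken to be proportional to 1 / copies per work unit. The function name `throughput_gain` is made up for this illustration.

```python
from fractions import Fraction


def throughput_gain(copies_before: int, copies_after: int) -> Fraction:
    """Relative throughput gain when each work unit needs fewer copies.

    Assumption (from the post): no extra copies are ever resent, so the
    number of work units finished per unit of computing is proportional
    to 1 / copies.
    """
    if copies_before <= 0 or copies_after <= 0:
        raise ValueError("copy counts must be positive")
    return Fraction(copies_before, copies_after) - 1


if __name__ == "__main__":
    # Quorum of 4 with 5 copies -> quorum of 3 with 4 copies
    print(f"5 -> 4 copies: {float(throughput_gain(5, 4)):.0%}")
    # Dropping the fourth copy as well
    print(f"4 -> 3 copies: {float(throughput_gain(4, 3)):.0%}")
```

In practice some copies fail or never come back, so the real gain would be smaller than these idealised numbers.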

I believe that very few people [really, very very few] would haunt the results page to try to spot a newly validated work unit still in their work queue in order to dump it and draw a new work unit. That sort of thing is normally done in close-coupled computer clusters, not in loosely coupled computer clusters like the WCG.

Lawrence
[Apr 24, 2006 2:48:23 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: to dump or not to dump "redundant work unit/s ?

Not sure but wouldn't the WU you dump be sent out to somebody else? If so then the net result is that 2 crunchers waste time on it rather than just 1 cruncher.
--

Not at all! I would not want to dump any w/u if it has a chance of being useful. Only if 3 have been returned and marked as valid would I consider dumping it in favour of new work.
[Apr 24, 2006 11:12:42 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: to dump or not to dump "redundant work unit/s ?



If we could stop sending out the fourth copy, it would speed up our throughput by 33%.
But that additional copy really speeds up the validation process, which speeds up the feedback of results to the project scientists. So there is a lot to take into account when deciding what is best.

I believe that very few people [really, very very few] would haunt the results page to try to spot a newly validated work unit still in their work queue in order to dump it and draw a new work unit. That sort of thing is normally done in close-coupled computer clusters, not in loosely coupled computer clusters like the WCG.

Lawrence


Indeed Lawrence, thanks for the comment.
1) For a long-term project like this, I think that a tiny wait for "the 4th w/u" that might be redundant might be worth it; this might speed things up in the long term.
2) Even the most fanatically keen dumper could not stay awake 24/7 and check every few minutes for results to come in; this sort of thing would be better managed at the server level, if at all.
It does not have to be an "all or none" approach, i.e. could just a few 4th WUs be held back as an exercise, say 10% of them, to see what the percentage gain may or may not be?
Generally I prefer to leave the WCG agents alone to quietly run in the background. I am just enjoying playing around with the features in BOINC for interest's sake, and I am interested in whether users have any ideas to help move things along.
Cheers.
[Apr 24, 2006 11:24:19 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: to dump or not to dump "redundant work unit/s ?

If the error rate can be brought low enough, then I would love to see no extra copies sent out. But errors aren't the only reason results don't come back. An ideal scheduling system would need detailed, constant feedback from all the agents. Some hosts are reliable, others are not. Some are quick, others are slow.

Take me, for example. I've had a few intermittent connectivity problems lately, so my connect to server time is longer than average. This means that nearly every work unit I get is a fourth work unit - and I'm not going to dump them all!

I need to give this some thought. Scheduling problems are typically NP-hard. Either we can perfect the scheduler so that barely any spare units need sending, or we can work out a way of giving "partial credit" and cancelling work units. Or both.

As an example, a perfect scheduler could send out three copies in such a way that the predicted end time is the same for each host. Then, if something goes wrong, extra copies can be sent to the fastest available machines, which are going to process the unit immediately. It may even be possible to preempt a host, and move the work unit to the head of the queue.
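A minimal sketch of the first half of that idea, in Python: given a predicted return time for each candidate host, pick the three hosts whose predictions lie closest together. Everything here is hypothetical (the `Host` type, the `predicted_hours` estimate, the function name); it does not describe how the BOINC or WCG scheduler actually works, and it ignores host reliability and preemption.

```python
from typing import NamedTuple


class Host(NamedTuple):
    name: str
    predicted_hours: float  # hypothetical estimate of when the result would be returned


def pick_matched_hosts(hosts: list[Host], copies: int = 3) -> list[Host]:
    """Choose `copies` hosts whose predicted return times are closest together.

    After sorting by predicted time, the tightest group is always a run of
    consecutive hosts, so a sliding window over the sorted list is enough.
    """
    if copies < 1 or len(hosts) < copies:
        raise ValueError("not enough hosts for the requested number of copies")
    ordered = sorted(hosts, key=lambda h: h.predicted_hours)
    best_start = min(
        range(len(ordered) - copies + 1),
        key=lambda i: ordered[i + copies - 1].predicted_hours - ordered[i].predicted_hours,
    )
    return ordered[best_start:best_start + copies]


if __name__ == "__main__":
    candidates = [Host("a", 2), Host("b", 30), Host("c", 9), Host("d", 10), Host("e", 11)]
    print([h.name for h in pick_matched_hosts(candidates)])
```

A real scheduler would also have to weigh how trustworthy each prediction is, which is where the feedback from the agents mentioned earlier would come in.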

Untrusted (new) hosts could be given a fourth copy to see how they cope.

This is a fascinating topic. I wonder what work has already been done in the field? I know scheduling is a big research area. The grid parameters are probably different to most other applications, though. We would have to start by running some simulations. Real world error rates and host data would help, too.

As for the actual solution - if we limit ourselves to a smallish subset of hosts, or a group of representative hosts, then the scheduling load may be small enough to be practical. Some kind of neural net may help in adapting the scheduler to changing conditions.
[Apr 24, 2006 11:59:19 PM]
keithhenry
Ace Cruncher
Senile old farts of the world ....uh.....uh..... nevermind
Joined: Nov 18, 2004
Post Count: 18667
Status: Offline
Re: to dump or not to dump "redundant work unit/s ?

Wow. Interesting discussion. However, from a practical perspective, if you send a WU out to 4 machines and get 3 valid results back, how do you reliably "take back" the WU from the 4th machine? You can't count on being able to connect to that host at that point in time. If you can't count on that, I can't see any way to keep that 4th machine from starting the WU. In fact, it may have already started in and be almost done. If you force the host to abort the WU, even if you give partial credit, haven't we still wasted that portion of time on that machine? Yes, wasting 7.5 hours is better than wasting 8 but would you end up actually saving a significant amount of time? If you use a group of "trusted" machines for that 4th machine that you can connect to at anytime and "take back" that WU, how do you define that group in a way that is not biased, real or perceived? It's a good idea but is there a way to implement it that's practical and reliable?
[Apr 25, 2006 1:02:59 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: to dump or not to dump "redundant work unit/s ?

... If you can't count on that, I can't see any way to keep that 4th machine from starting the WU. In fact, it may have already started in and be almost done. If you force the host to abort the WU, even if you give partial credit, haven't we still wasted that portion of time on that machine? Yes, wasting 7.5 hours is better than wasting 8 but would you end up actually saving a significant amount of time?...

Well, if you don't send out a 4th WU then you don't have to take it back, so your 7.5 hrs could go on a new WU :). It might delay "points" but speed up results in the long term (?). Of course it is not always simple, and I suppose they have thought of this, but it is always worth questioning and, if possible, improving efficiency. Note the change from quorum 4 to quorum 3 :)
[Apr 25, 2006 1:40:00 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: to dump or not to dump "redundant work unit/s ?

Wow. Interesting discussion. However, from a practical perspective, if you send a WU out to 4 machines and get 3 valid results back, how do you reliably "take back" the WU from the 4th machine? You can't count on being able to connect to that host at that point in time. If you can't count on that, I can't see any way to keep that 4th machine from starting the WU. In fact, it may have already started in and be almost done. If you force the host to abort the WU, even if you give partial credit, haven't we still wasted that portion of time on that machine? Yes, wasting 7.5 hours is better than wasting 8 but would you end up actually saving a significant amount of time? If you use a group of "trusted" machines for that 4th machine that you can connect to at anytime and "take back" that WU, how do you define that group in a way that is not biased, real or perceived? It's a good idea but is there a way to implement it that's practical and reliable?


The project can be set to an initial replication of 3, i.e. quorum and initial replication are the same. 3 WUs are sent out; if all are returned and validated, then no further copies are sent out. A fourth or subsequent WU is sent out only if a problem arises.

Provided the error rate is less than 25% of WUs issued, the project's throughput will increase by reducing the initial replication. (Note: the error rate includes WUs that fail to download or to be returned.)
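One way to see where the 25% figure comes from is a small Python sketch. It assumes, purely for illustration, that every copy fails independently with the same probability and that replacements are sent one at a time until the quorum is met; the expected number of copies per work unit is then quorum / (1 - error rate). The function name `expected_copies` is made up for this example, and the comparison is against a flat 4 copies, ignoring the resends the 4-copy scheme would itself need.

```python
def expected_copies(quorum: int, error_rate: float) -> float:
    """Expected copies sent per work unit when initial replication == quorum.

    Assumption: each copy fails (error, failed download, never returned)
    independently with probability `error_rate`, and failed copies are
    replaced until `quorum` good results are in.
    """
    if quorum < 1:
        raise ValueError("quorum must be at least 1")
    if not 0 <= error_rate < 1:
        raise ValueError("error_rate must be in [0, 1)")
    return quorum / (1 - error_rate)


if __name__ == "__main__":
    for rate in (0.05, 0.10, 0.25, 0.30):
        copies = expected_copies(3, rate)
        verdict = "fewer than" if copies < 4 else "at least"
        print(f"error rate {rate:.0%}: {copies:.2f} copies on average ({verdict} a flat 4)")
```

Under these assumptions the average stays below 4 copies for any error rate under 25%, which matches the break-even point quoted above.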
[Apr 25, 2006 3:44:59 AM]