World Community Grid - View Thread - A 3 part Question on BOINC?

World Community Grid Forums

Category: Support

Forum: BOINC Agent Support

Thread: A 3 part Question on BOINC?


Thread Status: Active
Total posts in this thread: 34


This topic has been viewed 3345 times and has 33 replies

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: A 3 part Question on BOINC?

-- why not just (as WCG has started to realize the possibility of) send priority WUs out to those machines which are fastest and most likely to return them reliably?

They most certainly should do that. The question is... How can they do it cheap and easy? The problem is... Who is FnR and who isn't? So you were FnR yesterday but are you today when I need you? Do I still get 100% of your CPU time or have I been bumped down to 40%? Do you still cache only 1 WU or have you decided to cache 5 now?

At this time I have visions (but no certain knowledge) of the WCG sysops leafing through reams of logs looking for hosts who are FnR (catch that one nelsoc), worst case scenario. Hopefully and most likely they have a database query or some bit of code that can identify FnR hosts and feed that info to the scheduler/dispatcher, all automated. Another possibility is that they have negotiated agreements with a large number of hosts who have high benchmarking CPUs, crunch only for WCG and have agreed to not cache WUs. All of the above scenarios plus others I can think of seem to require a lot of human intervention/labor or cannot be carried from one project to another easily.

Now imagine a revision to the BOINC code, server side as well as client. On the server side, there is a flag for high priority WUs. On the client side, the host operator can specify whether or not s/he will allow preemption by high priority WUs and/or which projects may preempt and which may not. Then, when a host requests work from a project, it can tell that project it accepts preemptive WUs from that project. If the server has no hi priority WUs at that time then no problem, business as usual. If it has hi priority WUs the server can refuse the host on the grounds that the host isn't fast or reliable enough. The host can refuse if the WU will break a deadline.
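
The handshake described above could be sketched roughly as follows. This is only an illustration of the idea, not anything that exists in BOINC: `WorkUnit`, `server_pick`, `client_accepts`, and all of their fields are hypothetical names, and the deadline check assumes the host crunches the priority WU first and then its cached WUs in order.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class WorkUnit:
    name: str
    est_hours: float          # estimated crunch time on this host (assumed known)
    hours_to_deadline: float  # hours left until this WU's deadline
    high_priority: bool = False


def server_pick(queue: List[WorkUnit], project: str, host_is_fnr: bool,
                preempt_ok: Set[str]) -> Optional[WorkUnit]:
    """Server side: hand priority WUs only to FnR hosts that opted in."""
    may_get_priority = host_is_fnr and project in preempt_ok
    if may_get_priority:
        for wu in queue:
            if wu.high_priority:
                return wu
    # Business as usual: first ordinary WU, if any.
    for wu in queue:
        if not wu.high_priority:
            return wu
    return None


def client_accepts(offer: WorkUnit, cached: List[WorkUnit]) -> bool:
    """Client side: refuse a priority WU that would break any deadline."""
    if not offer.high_priority:
        return True
    elapsed = offer.est_hours
    if elapsed > offer.hours_to_deadline:
        return False
    # Preemption pushes every cached WU back by the priority WU's run time.
    for wu in cached:
        elapsed += wu.est_hours
        if elapsed > wu.hours_to_deadline:
            return False
    return True
```

A host that is not FnR, or that has not listed the project as allowed to preempt, simply falls through to ordinary work, which matches the "business as usual" case.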

So that's a general sketch of what I'm thinking. Is it worth the effort? Would it be only a geeky-cool feature, an interesting exercise in coding but in the end totally useless? Well, I dunno. It depends how far the various projects have to stretch their human and machine resources. It depends on how big the grid grows and how many WUs fail to validate. Personally, I feel it could get humongous in short order and project sysops will want to have as many automated, hands-free systems as they can in place BEFORE big growth hits.

Why bother with priority WUs at all? Well, I may be wrong but it seems to me all unvalidated results must occupy storage space as well as pointers/indices in lists until quorum is achieved. Under the old system, if the first 3 results returned did not produce quorum then it was not long until the 4th result came. All that has changed now and we have seen evidence of the effects already, namely that the backlog of unvalidated results has increased significantly. Again, the issue is not the points or how long the points will be delayed. The issue is how big the backlog will grow when the number of members grows dramatically, and how much more will have to be spent just to handle the backlog. If a good priority system were brought online now, it would alleviate storage problems in the future.

Another reason for a priority system is that at some projects the WUs sent tomorrow are built on knowledge gleaned from results validated today.

[May 28, 2006 4:36:06 AM]

Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline


Re: A 3 part Question on BOINC?

Somewhere it was established that there is a contract between processing and science requiring a quorum of 3, so it's a given. But would the long delays under the 3-submission rule not be alleviated somewhat if a test were done on the first 2 results returned to establish whether they are a close match? If not, copies 4 and 5 could be sent out, I figure, many hours earlier. Certainly, I've seen a large increase in "inconclusives". Some have sat there for a week.
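
As a rough sketch of that early check (purely illustrative, not how WCG's validator works): assume each returned result can be reduced to a single number and that "close match" means agreement within a tolerance. The function name, the tolerance, and the numeric representation are all assumptions.

```python
from typing import List


def extra_copies_needed(returned: List[float], tolerance: float = 1e-6,
                        extra: int = 2) -> int:
    """Decide, as soon as two results are back, whether to send copies 4 and 5.

    `returned` holds one hypothetical numeric summary per returned result,
    in order of arrival.
    """
    if len(returned) < 2:
        return 0  # nothing to compare yet
    first, second = returned[0], returned[1]
    if abs(first - second) <= tolerance:
        return 0  # close match: just wait for the third copy
    return extra  # mismatch: issue the extra copies now instead of later
```

The quorum of 3 is untouched; the only change is when the extra copies go out.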

As for the technical part of prioritisation, if there could be just a statistical flag that would be visible in the BOINC client and there would be a function in BOINC WU listing tab to shift an item up or down, it would get its priority for the project on an eclectic basis. The projected allotted time would supersede, i.e. if 50% is assigned to Rosetta@home, it would still get it...

I thought it extreme that the program interrupted a WU when time was up and switched to a different project, but that's an aside.

As for UD agent awarding points. I presume there is still a quality test before awarding points? Some seemingly perceive the points are an "irrespective" thing long as the WU is returned 100% crunched.

Have a nice day

----------------------------------------

WCG

Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!

[May 28, 2006 6:26:46 AM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: A 3 part Question on BOINC?

As for the technical part of prioritisation, if there could be just a statistical flag that would be visible in the BOINC client and there would be a function in BOINC WU listing tab to shift an item up or down

Why would any programmer want to waste their time building such a totally useless and foolish feature into BOINC? So that a few nut cases can diddle around with WU priority under the illusion that they are doing something useful or even sane for themselves or for their team, namely earning more points? Excuse me for being rude but I think you've lost your grip on reality and I think you need to be told.

----------------------------------------
[Edit 1 times, last edit by Former Member at May 28, 2006 7:22:39 AM]

[May 28, 2006 7:13:35 AM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: A 3 part Question on BOINC?

Hi Sekerob,

As for UD agent awarding points. I presume there is still a quality test before awarding points? Some seemingly perceive the points are an "irrespective" thing long as the WU is returned 100% crunched.

As far as I know, the UD software awards points for any completed result that is not listed as an error (within time limits). The validation step comes afterwards. If there are *too many* invalid results, WCG sends an email message warning the member to check the computer. The BOINC Results Status page is a big help that way.

About other improvements / additions to BOINC - BOINC is designed to create a very loosely coupled cluster of computers, suitable for a voluntary public grid. GLOBUS is used to create more closely coupled clusters suitable for organizations. I doubt that BOINC will develop that way for a while.

Lawrence

[May 28, 2006 8:49:36 AM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: A 3 part Question on BOINC?

I think you've hit a good point concerning backlog.

As far as doing things "cheaply and easily" goes, I think it'll nearly always be found that it is cheaper and easier to implement a change server-side than client-side, given the number of nodes on the system.

Not that it's cheap and easy, just cheaper and easier.

As far as determining which machines are fast and reliable, you need a very simple model (very simple to keep calculating complexity down) that's updated each time the client contacts the server. Automating it is about the only way to go. Trust me, if you make it a system that requires manual fiddling -- even one where you have a certain number of machines that are dedicated to a single project -- that'd require someone to change that server-list every time the machine-list changes. (It's unrealistic to think it won't. Mere maintenance will make this necessary. Let alone what happens when people start figuring into the equation here and there.) A good automated system only requires tweaking when the policies used to make decisions change (in this case policies relating to WU prioritization and assignment). [And for that matter, a well-designed policy or set of policies also requires very little human intervention (read: tweaking).]
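
To make "very simple model" concrete, here is one possible sketch: an exponentially weighted moving average of turnaround time and validation success, updated once per returned result. Everything here is an assumption for illustration (the names `HostStats`, `update_on_contact`, `is_fast_and_reliable`, the starting values, the smoothing factor, and the thresholds); it is not what WCG or BOINC actually run.

```python
from dataclasses import dataclass


@dataclass
class HostStats:
    avg_turnaround_hours: float = 96.0  # pessimistic starting guess (assumed)
    success_rate: float = 0.5           # unknown hosts start in the middle


def update_on_contact(stats: HostStats, turnaround_hours: float,
                      valid: bool, alpha: float = 0.2) -> HostStats:
    """Fold one returned result into the running averages (O(1) per contact)."""
    stats.avg_turnaround_hours += alpha * (turnaround_hours - stats.avg_turnaround_hours)
    stats.success_rate += alpha * ((1.0 if valid else 0.0) - stats.success_rate)
    return stats


def is_fast_and_reliable(stats: HostStats, max_hours: float = 24.0,
                         min_success: float = 0.9) -> bool:
    """Policy check; only these thresholds need tweaking when policy changes."""
    return (stats.avg_turnaround_hours <= max_hours
            and stats.success_rate >= min_success)
```

Because old behaviour decays away, a host that was FnR yesterday but has since slowed down or started returning bad results drops out on its own, with no server-list to maintain by hand.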

This isn't to say that human-less systems are the best thing since sliced bread. It's just to say that when dealing with vast quantities of data, I find it's always better to build something capable of letting me stand back and watch it go, freeing me up to manage it rather than push it along.

Your worst-case scenario is, accordingly, my nightmare. I've lived through minor variations. I am not a happy camper at the end of those days.

--random9q

[May 28, 2006 8:43:29 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: A 3 part Question on BOINC?

A priority flag system was already proposed and vetoed for BOINC, in connection with the folding@home project. It was vetoed in part because of the potential for projects fighting, as mentioned in a previous post. However, the main reason was that most of what it was wanted for can be accomplished by setting tight deadlines.

If I am not mistaken the LHC project has shorter deadlines on reissued work than the same work initially has.
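
A minimal sketch of the deadline approach, assuming the server knows whether a copy is a reissue. The function name and both durations are made-up placeholders, not LHC's or WCG's real settings.

```python
from datetime import datetime, timedelta


def deadline_for(sent_at: datetime, is_reissue: bool,
                 initial: timedelta = timedelta(days=14),
                 reissue: timedelta = timedelta(days=4)) -> datetime:
    """Give reissued copies a tighter deadline than first-time copies.

    Both default durations are illustrative assumptions.
    """
    return sent_at + (reissue if is_reissue else initial)
```

With a tight deadline the client's existing scheduler already has a reason to run the reissued WU early, so no separate priority flag is needed.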

[May 29, 2006 1:32:09 AM]

olympic
Senior Cruncher
Joined: Jun 12, 2005
Post Count: 156
Status: Offline


Re: A 3 part Question on BOINC?

I think the new 3 result system is working "just about" perfect. My validation queue increased by 40% but it has leveled off and no WUs are being left behind. I've also noticed that my 2 computers tend to receive quite a few of the 4th results to crunch since they are both very fast, only run WCG, and are connected through broadband. So WCG's ability to detect FnR crunchers seems to work OK.

The 96 hour rule may need to be tweaked. I've noticed a few 4th results are sent out after 4 days only for one of the original 3 to be returned shortly afterwards. But this is still more efficient than sending out 4 right off the bat. The Grid can't be slowed to a crawl waiting the full 3 weeks for results from the handful of people out there who are running 10 projects on a Pentium 166 and connected through dial-up. But we don't want anyone to feel like they are wasting their time crunching with an older machine either. It's a real balancing act to say the least.

[May 29, 2006 4:28:13 AM]

Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline


Re: A 3 part Question on BOINC?

Thanks Keck_Komputers, your reply on the BOINC flagging is informative, opposed to Dagorath's blaze.

Have a Nice Day


[May 29, 2006 5:31:40 AM]

knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline


Re: A 3 part Question on BOINC?

If a computer's 'recent credit' is greater than 70 per CPU and it has had a result validated within the last 24 hours, then it is deemed a reliable computer.

About 11% of the computers using BOINC with World Community Grid meet this requirement.
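
Expressed as code, the rule might look something like the sketch below. The two thresholds come from the rule as stated; the `Host` fields and the function name are hypothetical, and dividing a host-wide recent credit figure by the CPU count is an assumption about what "per CPU" means.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

MIN_RECENT_CREDIT_PER_CPU = 70.0          # threshold from the stated rule
MAX_VALIDATION_AGE = timedelta(hours=24)  # threshold from the stated rule


@dataclass
class Host:
    recent_credit: float                 # host-wide 'recent credit' (assumed)
    cpu_count: int
    last_validated: Optional[datetime]   # time of most recent validated result


def is_reliable(host: Host, now: datetime) -> bool:
    """Apply both conditions of the rule; either one failing means not reliable."""
    if host.cpu_count <= 0 or host.last_validated is None:
        return False
    credit_per_cpu = host.recent_credit / host.cpu_count
    recently_validated = now - host.last_validated <= MAX_VALIDATION_AGE
    return credit_per_cpu > MIN_RECENT_CREDIT_PER_CPU and recently_validated
```

Both inputs are values the server already tracks per host, so the check can run automatically on every work request.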

[May 29, 2006 6:25:22 AM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: A 3 part Question on BOINC?

Thanks Keck_Komputers, your reply on the BOINC flagging is informative, opposed to Dagorath's blaze.

Sekerob, I apologise for flaming you. It's just that I think manipulating the processing order of the WUs in the cache is... well... nuts. However, if that's what you like doing then go for it. The important thing is that you contribute just like the rest of us. In the future, I will bite my tongue and remember that I do a few nutty things too.

[May 29, 2006 8:22:22 AM]
