World Community Grid Forums
Thread Status: Active | Total posts in this thread: 76
knreed
Former World Community Grid Tech | Joined: Nov 8, 2004 | Post Count: 4504 | Status: Offline
I was at the annual BOINC workshop last week and I had a chance to talk to David Anderson about the idea of sending workunits to computers of similar performance. We had some good ideas that we may be able to put into practice.
Sekerob
Ace Cruncher | Joined: Jul 24, 2005 | Post Count: 20043 | Status: Offline
That, I presume, is part of the plan to soon be able to size work so that all machines see similar run times, not only on RICE but also for the AutoDock-based projects. Was there not mention of dynamic sizing?
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All!
martin64
Senior Cruncher | Germany | Joined: May 11, 2009 | Post Count: 445 | Status: Offline
Thanks knreed for this explanation.
Although you say that the amount of work "lost" is small, wouldn't it be possible to do the verification on a (completed) parent workunit basis, rather than on single WUs with an identical range? In your example, that would mean that you would send out the same parents, but different children (starting at 5500 in the first case, at 5000 in the second). Credit could then be granted on structures that are valid in both replicas.

@mreuter80, there is also some overhead in, e.g., single-quorum projects where results are marked inconclusive and other participants re-calculate the entire WU. It doesn't add to the results, but only to the level of reliability of the results. In HCMD2 you could say that the overhead contributes neither to the results nor to the reliability, but to the simplicity of the mechanism.

Regards, Martin
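As a rough sketch of the "credit what both replicas validated" idea above (this is not WCG's actual validator; the record layout and names are invented for illustration), one could grant credit on the intersection of structures that both replicas finished and agree on, and reissue the rest:

```python
# Hypothetical sketch: each replica reports a map of structure index -> result
# hash for the structures it finished before hitting its time limit.

def validate_parent(replica_a, replica_b):
    common = set(replica_a) & set(replica_b)               # finished by both replicas
    agreed = {s for s in common if replica_a[s] == replica_b[s]}
    remaining = (set(replica_a) | set(replica_b)) - agreed
    return agreed, remaining                               # credit 'agreed', reissue the rest

# Example: one replica reached structure 5500, the slower one stopped at 5000.
fast = {i: f"h{i}" for i in range(4000, 5500)}
slow = {i: f"h{i}" for i in range(4000, 5000)}
credited, reissue = validate_parent(fast, slow)
print(len(credited), len(reissue))   # 1000 structures credited, 500 reissued
```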
Sekerob
Ace Cruncher | Joined: Jul 24, 2005 | Post Count: 20043 | Status: Offline
Don't think the BOINC distribution system is able to save up many little tasks and then do a batch-wide validation, if that's what you meant. Sizing tasks down is not an option as it creates too much network load... the reason the jobs were sized up in the first place. The optimization is in this device matching, which will also allow the heavier tasks of some sciences to be sent to the more powerful machines. There was talk some months ago of a profile option such as heavy/light work, plus some other bolts such as preferring longer or shorter tasks. I forget the exact plan, but it's in the vein of getting closer to what can be done at Rosetta: give me 2-4-6-12-24 hour jobs. Not all sciences are suitable to that approach.
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All!
martin64
Senior Cruncher | Germany | Joined: May 11, 2009 | Post Count: 445 | Status: Offline
"Don't think the BOINC distribution system is able to save up many little tasks and then do a batch-wide validation, if that's what you meant."

More or less, yes. So it might still make sense to send out the 2 identical WUs to 2 computers of more or less identical speed, thus reducing the risk of having a 10-year-old Pentium compete with an overclocked i7 Extreme, with a lot of computer time wasted. You already indicated that this is under consideration.

Regards, Martin
knreed
Former World Community Grid Tech | Joined: Nov 8, 2004 | Post Count: 4504 | Status: Offline
Martin,
It would require substantial re-writing of BOINC to do what you are describing (i.e. after the two results are returned, send out a third that only completes the work not done by the shorter). I agree that this would be the best approach - but it simply wasn't feasible.

Sekerob,

Both bits of logic will be useful. The framework to have different-sized workunits that are then targeted at computers that can process them in a reasonable amount of real-world time is in the latest version of the server code; we need to apply the latest updates to get it. However, for some projects like HCMD2 our estimates of difficulty are so inaccurate that we cannot use that mechanism in quite the same way. We can instead simply say that workunits A, B, C will be processed by 'powerful' computers, workunits D, E will be processed by 'average' computers, and workunit F will be processed by 'less powerful' computers.

We are likely to implement a mechanism to handle this second case sooner, and then work with BOINC to implement several things that were discussed at the conference that will reduce the time it takes workunits (and batches) to complete. These changes benefit the members because credit will be awarded faster; they benefit us because they will reduce rows in the result and workunit tables and reduce file system usage; and they benefit the researchers because they will get their results a few days quicker.
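A minimal sketch of that second case (the class names, the benchmark-style host score and the thresholds are assumptions for illustration, not WCG's scheduler code): workunits are pre-labelled by power class and a host is only offered work from its own class.

```python
# Hypothetical sketch of class-based targeting, not actual BOINC/WCG code.

WU_CLASS = {"A": "powerful", "B": "powerful", "C": "powerful",
            "D": "average", "E": "average",
            "F": "less powerful"}

def host_class(gflops_per_core, cores):
    """Crude host classification from an assumed benchmark score."""
    score = gflops_per_core * cores
    if score >= 40:
        return "powerful"
    if score >= 15:
        return "average"
    return "less powerful"

def eligible_work(host, pending_workunits):
    """Return only the pending workunits whose class matches this host."""
    cls = host_class(host["gflops_per_core"], host["cores"])
    return [wu for wu in pending_workunits if WU_CLASS[wu] == cls]

# Example: a mid-range quad-core host only ever sees the 'average' workunits D and E.
print(eligible_work({"gflops_per_core": 4.0, "cores": 4}, list(WU_CLASS)))
```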
Mysteron347
Senior Cruncher | Australia | Joined: Apr 28, 2007 | Post Count: 179 | Status: Offline
Good description of the current strategy :)
And this explains why the child tasks seem to be consistent in size (processing time) - because the child-task size is determined by the original processing of the parent.

As I indicated, I've observed a 4:1 speed ratio in my partners. I believe this to be a problem, since the slower machine will always determine the amount of work discarded. What is required, in my view, is that each pair of machines applied to a workunit (of any generation) be as closely matched as possible.

If a parent is processed by a FAST pair, then the child size may be such that a SLOW pair would need to generate a grandchild to complete the work; provided the SLOW pair is matched, the discarded results would be minimised. Equally, if the parent is processed by a SLOW pair, the children may be more numerous, but smaller, and no grandchildren should be generated (a FAST pair would run to completion, and a SLOW pair should get an extension granted by the "I'm nearly finished" mechanism).

A reasonable indication of the relative speed of the machine in question would be (structures processed in the last n tasks) / (CPU time taken in the last n tasks) - and these figures appear to be easily available...

Remember, reducing waste is equivalent to bringing possibly thousands of new processors on-line. You could even claim it's a "green" initiative...
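A hedged sketch of that speed metric and pairing idea (the task-history format and the simple neighbour pairing are assumptions, not an existing WCG or BOINC feature):

```python
# Hypothetical sketch: estimate each host's speed as structures per CPU second
# over its last n tasks, then pair hosts of similar speed so both replicas of a
# workunit advance at roughly the same rate.

def relative_speed(task_history, n=10):
    """task_history: list of (structures_done, cpu_seconds) for recent tasks."""
    recent = task_history[-n:]
    structures = sum(s for s, _ in recent)
    cpu = sum(c for _, c in recent)
    return structures / cpu if cpu else 0.0

def matched_pairs(hosts):
    """hosts: dict host_id -> task_history. Sort by speed and pair neighbours."""
    ranked = sorted(hosts, key=lambda h: relative_speed(hosts[h]), reverse=True)
    return [tuple(ranked[i:i + 2]) for i in range(0, len(ranked) - 1, 2)]

# Example: the overclocked i7 gets paired with another fast box,
# not with the ten-year-old Pentium.
hosts = {
    "i7_extreme":  [(400, 3600)] * 10,
    "fast_quad":   [(350, 3600)] * 10,
    "old_pentium": [(100, 3600)] * 10,
    "laptop":      [(120, 3600)] * 10,
}
print(matched_pairs(hosts))   # [('i7_extreme', 'fast_quad'), ('laptop', 'old_pentium')]
```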
themoonscrescent
Veteran Cruncher | UK | Joined: Jul 1, 2006 | Post Count: 1320 | Status: Offline
Please excuse my lack of knowledge regarding HCMD2, but why is there a cut-off point for any system?
I'm sure there's a good reason for it, but without knowing why, it seems strange to have the work unit cut off after 6/12 hours of crunching as long as the result is returned within the 10 days.
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
Hello themoonscrescent,
You are a little unusual. When WCG was first started up, we quickly discovered that a large, vocal segment of our members detests lengthy work units. We try to accommodate them. It does not make any real difference to the science.

Lawrence
KerSamson
Master Cruncher | Switzerland | Joined: Jan 29, 2007 | Post Count: 1679 | Status: Offline
Hello themoonscrescent,
You have to think of people who do not keep their systems running 24 hours per day. Such people do not like very long work units. Additionally, each power off/on causes a restart from the last checkpoint, which slows down the computation of the work unit again. For all these good reasons, WCG designed the "parent / children / grandchildren" approach, even if the related validation process is not particularly trivial.

Cheers, Yves