World Community Grid Forums

Category: Active Research

Forum: Africa Rainfall Project

Thread: Work Available

Thread Status: Active
Total posts in this thread: 3593

This topic has been viewed 5830267 times and has 3592 replies

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: Work Available

I very nearly started another thread with a discussion on reliability, and maybe I should have. If so, I apologise.

Let me just say that I absolutely understand and agree with the scientists'/techs' desire to get results back quickly sometimes. That's a given! What I have a problem with is 'hidden' targets. It's bad enough that they are hidden from us, the crunchers (at least some of us can tease them out in these discussions), but my main issue is that they are hidden from the software that runs on our machines. Most people, even we frequent chatterers, effectively run our machines in 'set and forget' mode, just tweaking them once in a blue moon. Any targets ought to be set in a way that our machines know about and can react to. That way the targets become realisable, and are not just random and arbitrary. Far more machines will hit a target that they strive to reach than will hit a target that just sits in the air. In the long run, the grid (and the science) will perform better that way.

Let's be honest, if the techs really think the current way is best in the present circumstances then perhaps it is. But it's my belief that if it's possible to set up the system to deal with such circumstances in a sensible, reactive way, then that is what should be done. Software evolves over time to meet changing circumstances, and WCG uses BOINC a little differently to other projects, so they have to be creative. But they can also feed their requirements into the future development of BOINC (or, even, do it themselves). I'm just asking that some (more) thought be given to this area. If the techs can come up with a way that uses the current abilities of the client software to better meet their goals for the science, then they should do so.

[Nov 27, 2019 11:13:10 AM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: Work Available

When I read through this thread I get the impression that the dilemma is (A) we need to return ARP results as fast as possible and (B) many of us, including me, don't want to run without a work cache. My idea is to make the deadline for ARP tasks shorter than for other tasks, say five days. That way I could still have a cache of one or two days for other work but selectively trigger panic mode for ARP tasks, making them bypass the queue.

The 'panic' state can be triggered automatically by setting a fake cache (aka connect every) to half the deadline of ARP. Given that the website profiles can be set to a hard number of tasks to buffer, of which the client is unaware, there will be continuous attempts to connect.

One warning with that, IIRC: if ALL threads of BOINC run in high priority, work fetching stops too.

BOINC development seems to be pretty much 'maintenance' and -volunteers- only, so have no high expectations of anything advanced happening in either the client-side or server-side software.
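The selective 'panic' idea discussed above can be illustrated with a rough sketch. This is a simplified model, not BOINC's actual scheduler: it assumes a single thread, tasks run in queue order, and each task's effective deadline is its report deadline minus the connect interval. The `Task` class, the `at_risk` function, and all runtimes and deadlines are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    runtime_days: float   # estimated compute time still needed
    deadline_days: float  # time left until the report deadline


def at_risk(queue, connect_every_days):
    """Names of tasks projected to miss their effective deadline.

    Simplified model (an assumption, not BOINC's real logic): one thread,
    tasks run in queue order, and every deadline is shortened by the
    connect interval.
    """
    risky, elapsed = [], 0.0
    for task in queue:
        elapsed += task.runtime_days
        if elapsed > task.deadline_days - connect_every_days:
            risky.append(task.name)
    return risky


# Two days of other work with long deadlines, then one ARP task with a
# shorter (hypothetical) five-day deadline.
queue = [
    Task("other-1", 1.0, 10.0),
    Task("other-2", 1.0, 10.0),
    Task("arp", 0.75, 5.0),
]

print(at_risk(queue, connect_every_days=0.5))  # [] -> nothing jumps the queue
print(at_risk(queue, connect_every_days=2.5))  # ['arp'] -> only ARP is flagged
```

In this toy model, setting the connect interval to half the ARP deadline flags only the ARP task, which is the behaviour the posts above are aiming for; whether a real client behaves the same way depends on its own scheduling rules.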

[Nov 27, 2019 12:54:30 PM]

Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12594
Status: Offline

Re: Work Available

DrMaason refers to using Device Profiles to control the work that he does, but that only controls the volume of WUs held in cache.

I combine that with app_config, which controls the WUs actually being crunched. To allow for shortages, I usually set those limits to total 1 or 2 more than the 8 threads that I have on my PC. It is also useful to allow for high level 3 issues or RAM or any other capacity limitations.
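As a rough illustration of the app_config approach described above, the sketch below generates an `app_config.xml` with per-application `max_concurrent` limits. The application short names (`arp1`, `mcm1`) and the split of 2 + 8 are assumptions for illustration only; check the names your client actually reports before using anything like this.

```python
import xml.etree.ElementTree as ET


def build_app_config(limits: dict[str, int]) -> str:
    """Return app_config.xml text with one <app> entry per application."""
    root = ET.Element("app_config")
    for app_name, max_concurrent in limits.items():
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = app_name
        ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    ET.indent(root)  # pretty-print; needs Python 3.9 or later
    return ET.tostring(root, encoding="unicode")


# Hypothetical limits for an 8-thread PC: the per-app limits total 10,
# i.e. 2 more than the thread count, so a shortage in one application
# can be covered by the other.
config_text = build_app_config({"arp1": 2, "mcm1": 8})
print(config_text)
```

The generated file would go in the project's directory under the BOINC data folder, and the client has to re-read its config files (or be restarted) before the limits take effect.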

Mike

[Nov 27, 2019 3:33:41 PM]

uplinger
Former World Community Grid Tech
Joined: May 23, 2005
Post Count: 3952
Status: Offline

Re: Work Available

Dang, this conversation got deep quickly :)

Api, to give you an idea of why we use that avg turnaround: it comes down to caches. You can set your cache to 10 days if you want to. This means that even though your machine is reliable, it may have a turnaround of 10 days per work unit. If we set the deadline to 35% of a 10-day deadline, your machine would be putting results in higher priority more often. Now, you're still considered a reliable host, but we have a slew of results that need a reliable host, meaning that your queue is now filled with 10 days of results (an extreme case, I know, but it makes my point) that need to be returned in 3.5 days... you have 6.5 days of work that is going to be 'too late'. After your deadline is missed, these work units would need to be resent to another machine that is reliable, and your machine would now be marked as unreliable because you had results considered bad.
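The arithmetic in that example can be checked with a few lines. This is only a sketch of the extreme case described above; the function name is made up, and it assumes the whole cache was issued at once and is worked through in order.

```python
def late_work_days(cache_days: float, deadline_days: float) -> float:
    """Days of queued work that would finish after the deadline.

    Assumes the full cache was downloaded at the same moment and is
    processed in order, as in the extreme case above.
    """
    return max(0.0, cache_days - deadline_days)


cache_days = 10.0                    # cache size from the example
deadline_days = 0.35 * cache_days    # 35% of a 10-day deadline
late = late_work_days(cache_days, deadline_days)

print(f"deadline: {deadline_days:.1f} days, late work: {late:.1f} days")
```

With a cache no larger than the shortened deadline, the same function returns zero, which is why a small cache keeps a host out of trouble here.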

The avg turnaround was set many moons ago, when most of our applications were in the 4-6 hour range; this meant it was set at 6-9 times the avg workunit return pace. With these larger workunits, that is now closer to 1.5 times a workunit's length, so it needs to be tested at a higher value.
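To make those ratios concrete, here is a small sketch. The post does not state the actual setting; 36 hours is an inferred, hypothetical threshold chosen only because it reproduces the 6-9x figure for 4-6 hour workunits, and the 24-hour runtime is likewise an assumption that yields the quoted 1.5x.

```python
def turnaround_multiple(threshold_hours: float, avg_runtime_hours: float) -> float:
    """How many average workunit runtimes fit inside the turnaround threshold."""
    return threshold_hours / avg_runtime_hours


# Assumption: 36 h is inferred from the quoted ratios, not an official value.
THRESHOLD_HOURS = 36.0

for runtime_hours in (4.0, 6.0, 24.0):
    multiple = turnaround_multiple(THRESHOLD_HOURS, runtime_hours)
    print(f"{runtime_hours:>4.0f} h workunit -> {multiple:.1f}x")
```

The point is only that a fixed threshold leaves far less slack once runtimes grow, which is the reason given for testing a higher value.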

There was discussion to make this setting at the application level instead of project wide, but we are experimenting with the setting change as adding that feature would take considerably more time to write and test. Changing the setting for our smaller project runtimes shouldn't affect them much, but we are watching those as well. This setting should increase the storage needed on our backend since it'll cause batches to return slower as mentioned above.

What are these hidden targets you're looking for? The setting for becoming reliable wasn't purposefully hidden, just more technical than, say, the average user (99%) would care to know about. Are there other hidden targets you're wanting to know about?

DrMason, welcome to the forums. Thanks to you and hchc for constructive conversation.

Thanks,
-Uplinger

[Nov 27, 2019 4:17:49 PM]

Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7846
Status: Offline

Re: Work Available

Dang, this conversation got deep quickly :)

That is because some of us are intensely interested and want to optimize the efficiency of our crunching. I, for one, appreciate when any of the techs chime in with additional "backend" information on any of the projects or how WCG is dealing with various issues.
Thanks Uplinger
Cheers

----------------------------------------

Sgt. Joe
*Minnesota Crunchers*

[Nov 27, 2019 7:40:52 PM]

flensr
Cruncher
Joined: Oct 31, 2018
Post Count: 25
Status: Offline

Re: Work Available

So then... How can I get my first WU for this project? Not one assigned WU yet, not sure what I need to do. Can't be reliable or unreliable if I can't even get one WU.

[Nov 27, 2019 9:44:59 PM]

floyd
Cruncher
Joined: May 28, 2016
Post Count: 47
Status: Offline

Re: Work Available

Right, that's my plan. Except I'm not sure about the size of the cache. Maybe BOINC is smart enough to notice when the cache is not full. It should panic later in that case.

One warning with that, IIRC: if ALL threads of BOINC run in high priority, work fetching stops too.

Fetching work or not, if all tasks run in high priority I've missed my original goal, which is to process ARP tasks before any other. That's why I think ARP tasks should have shorter deadlines than others, to make them switch to high priority first.

BOINC development seems to be pretty much 'maintenance' and -volunteers- only, so have no high expectations of anything advanced happening in either the client-side or server-side software.

There are no software changes necessary for this, just creative use of existing mechanisms as you described above.

[Nov 27, 2019 11:45:20 PM]

Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7846
Status: Offline

Re: Work Available

So then... How can I get my first WU for this project? Not one assigned WU yet, not sure what I need to do. Can't be reliable or unreliable if I can't even get one WU.

What kind of machine are you running, and how many cores? Right now the ARP work units are few and far between. There are several hundred thousand cores looking for just a couple of thousand work units each day. If you are patient, and I am sure you are, you will eventually get one.
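For a feel of the odds, here is a back-of-the-envelope sketch. The figures are illustrative stand-ins for "several hundred thousand cores" and "a couple of thousand work units", not actual WCG statistics, and the model assumes every core has an equal chance at each work unit, which the real scheduler does not guarantee.

```python
def expected_days_between_tasks(cores_total: int, tasks_per_day: int, my_cores: int) -> float:
    """Average wait in days for one task, if tasks were shared evenly per core."""
    my_share = my_cores / cores_total
    return 1.0 / (tasks_per_day * my_share)


# Illustrative assumptions: 300,000 cores competing for 2,000 ARP tasks a day.
wait_days = expected_days_between_tasks(cores_total=300_000, tasks_per_day=2_000, my_cores=8)
print(f"an 8-core machine would wait roughly {wait_days:.0f} days per task")
```

Under these assumptions an 8-core machine waits on the order of weeks for a single ARP work unit, so not having received one yet is not a sign that anything is misconfigured.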
Cheers

----------------------------------------

Sgt. Joe
*Minnesota Crunchers*

[Nov 27, 2019 11:46:37 PM]

gb009761
Master Cruncher
Scotland
Joined: Apr 6, 2005
Post Count: 3010
Status: Offline

Re: Work Available

So then... How can I get my first WU for this project? Not one assigned WU yet, not sure what I need to do. Can't be reliable or unreliable if I can't even get one WU.

Also, you haven't indicated whether you've tried any of the techniques already covered in this (and other) thread(s). Yes, there's currently only a limited supply of ARP WUs, but if your computer isn't trying to capture them as often as possible (i.e., it is simply filling its cache fully with other project WUs), then your chances are severely reduced.

[Nov 28, 2019 2:55:28 AM]

DCS1955
Veteran Cruncher
USA
Joined: May 24, 2016
Post Count: 668
Status: Offline

Re: Work Available

I am 3 days short of gold. I got most of them at 38 min past the hour, prior to the fairer introduction of randomization to HSTB & ARP. I have now gone the route of a randomized task manager. Much less of a fish-in-a-barrel situation, but fair to everyone.

[Edit 1 times, last edit by dcs1955 at Nov 28, 2019 4:33:49 AM]

[Nov 28, 2019 4:28:40 AM]
