World Community Grid Forums
Thread Status: Active | Total posts in this thread: 27
Papa3
Senior Cruncher | Joined: Apr 23, 2006 | Post Count: 360 | Status: Offline
The point is not the download sizes or the client adjusting to the power available. What I mean is that a client that from the outset needs 72 hours of CPU time, and is only on for, say, 6 hours a day at 60% throttle, will never be able to complete the job in time, not even in 12 days. Are you going to tell these volunteers: "No, you can't help WCG"?

On second reading, it seems that you may be arguing that there are some weak clients that take enormous amounts of time to process jobs and some strong clients (quad-core systems, etc.) that process jobs quickly, and that the weak clients will be in trouble if the average job length goes from 8 hours to 24 hours.

Economically, the value of a computer purchased five years ago is approximately zero. The contribution of a five-year-old computer to WCG, even if it ran 24/7, would also be close to zero. I would indeed argue that some potential clients are not productive enough to be WCG participants. Nobody should expect to bring a Radio Shack TRS-80 or some similar museum piece to the WCG effort and anticipate acceptance. If this is indeed your argument, then please give an example of how old and weak a computer WCG should, in your view, still be trying to accommodate as a client.
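To make the arithmetic in the quoted scenario concrete, here is a minimal sketch in Python; the function and parameter names are invented for illustration and are not anything BOINC or WCG actually exposes.

```python
def can_finish(cpu_hours_needed, hours_on_per_day, throttle, deadline_days):
    """Rough feasibility check: can a part-time, throttled machine
    return a workunit before its deadline?"""
    cpu_hours_per_day = hours_on_per_day * throttle   # effective CPU-hours delivered per day
    days_needed = cpu_hours_needed / cpu_hours_per_day
    return days_needed <= deadline_days, days_needed

# The quoted example: a 72 CPU-hour job on a machine that is on 6 h/day
# at 60% throttle, with a 12-day deadline.
ok, days = can_finish(72, 6, 0.60, 12)
print(ok, round(days, 1))   # -> False 20.0
```

Under those assumptions the machine delivers only 3.6 CPU-hours a day, so the job needs about 20 days - well past a 12-day deadline.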
Sekerob
Ace Cruncher | Joined: Jul 24, 2005 | Post Count: 20043 | Status: Offline
WCG still likes to facilitate the contribution of the not-so-powerful brethren, so that those who are on for just a few hours a day can still participate. In this day and age of streaming TV over the internet, the roughly 400 KB of total upload/download per job is hardly a burden; using it as a reason to push job CPU times to 24 hours, and thereby making it impossible for those machines to contribute, is not in that spirit.
WCG created the so-called Power Saving profile so that devices can go to sleep; keeping a machine on 24/7 for 12 days just for the sake of getting one job done is not of this time. Who decides which machines are fit for WCG crunching? W98 is no longer supported and future projects will not be tested on it, so those machines will eventually slip out, and they are something like 0.03% or less of the participants.

So to sum it up: 8 or 9 hours (RICE as of batch 00200, for example) is a size that enables almost all part-time PCs to be volunteered for crunching at WCG. HPF2 is set to 20 days, with a download of 2 MB+ per job, and it can be opted out of too. With six projects in the near future, WCG offers enough choice. AND, as mentioned, in the future there will be the shaping of jobs to the power of the client, maintaining and fortifying the WCG policy of keeping as many on board as want to be on board. ciao
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All!
[Edit 1 times, last edit by Sekerob at Oct 23, 2008 6:12:17 AM]
Rickjb
Veteran Cruncher | Australia | Joined: Sep 17, 2006 | Post Count: 666 | Status: Offline
The contribution of a five-year-old computer to WCG, even if it ran 24/7, would also be close to zero.

papa3: I think that there are very many 5-year-old computers out there on people's desktops, and WCG's aim is to include as many contributors as possible, so that the slower machines can make a useful contribution just by sheer numbers. If and when these contributors upgrade to something faster, hopefully they'll continue to contribute. If WCG were to eliminate their machines now, they probably wouldn't be around with their faster machines in the future.

Five-year-old machines can do most things that most people want to do, anyway - surf the Net, office apps, edit photos, play standard-definition video, recode video (slowly), run Win XP. HD video and modern games are out, tho. My 6-yo AMD Athlon XP2600 now sits on a friend's desk and does everything he needs it to do. The machine used by Astrolab, the starter of this thread, is probably in this category.

The older machines use more electricity per point crunched, but the environmental footprint of frequently upgrading computers is also huge. As well as manufacturing the machine itself, the chain of transporting, distributing, retailing and installing it, plus disposal of the old gear, all need to be included.

As for BOINC WU crunch-times, I reason that these need to be small compared to the return-time allowance. Otherwise, not only will slow crunchers be unable to complete 1 WU in time, but BOINC will have trouble with job-queue management in general. Crunchers with intermittent internet connections would sometimes exceed deadlines unless they set their job-queue short enough that they would sometimes run out of work between connection periods.

It would be very useful if the amount of crunching needed for each individual WU could be estimated in advance. Then, smaller WUs could be diverted to slower devices, and if these data were also sent to our BOINC clients, they could calculate more accurate estimates of WU completion times, which would help with queue management. However, I don't know whether it is actually possible for the scientists and WCG to predict the WU size. In the case of FAAH, this might be particularly difficult for molecules (ligands) that have not been docked previously. I guess that ligand molecular weight, or better, atom count, might be one predictor. ((Stats/regression analysis, Scripps/Sekerob?)). Then there are other projects to consider ...

[Edits: atom count, clarity of meaning]
[Edit 7 times, last edit by Rickjb at Oct 24, 2008 4:51:43 AM]
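To illustrate the kind of predictor being floated here - purely a sketch with invented numbers, not anything Scripps or WCG is known to run - a simple least-squares fit of past crunch times against ligand atom count might look like this:

```python
# Hypothetical sketch: predict FAAH WU crunch time from ligand atom count
# with an ordinary least-squares fit on past results. All numbers invented.
atom_counts  = [22, 30, 35, 41, 48, 55, 63]           # ligand heavy-atom counts
crunch_hours = [5.1, 6.0, 6.8, 7.9, 8.5, 9.9, 11.2]   # observed CPU hours

n = len(atom_counts)
mean_x = sum(atom_counts) / n
mean_y = sum(crunch_hours) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(atom_counts, crunch_hours))
         / sum((x - mean_x) ** 2 for x in atom_counts))
intercept = mean_y - slope * mean_x

def predict_hours(atoms):
    """Estimated crunch time for a not-yet-docked ligand of the given size."""
    return intercept + slope * atoms

print(round(predict_hours(50), 1))   # rough estimate for a 50-atom ligand
```

Whether atom count alone explains enough of the variation is exactly the open question in the post; a real predictor would need validation against past batches.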
Sekerob
Ace Cruncher | Joined: Jul 24, 2005 | Post Count: 20043 | Status: Offline
Rick,
WCG tracks the information dynamically and auto-magically feeds the flop estimates from the previous day back into the new work that goes out. Each device is also tracked for performance. It works like a mechanical cruise control: the first one I had overshot the target speed and would slow down afterwards, but eventually the balance was found. The client itself benchmarks every 5 days, so it takes a wee bit before a fair balance is reached. Thanks for contributing.

PS: varying toughness of dockings & folds is indeed why you see production times drifting... the experiments follow arrays of targets and compounds, and each new series undulates through these, except RICE ;>) A chart tells how hard it is to get an appropriate average job size. Lessons were taken from the monster jobs we had about 1-2 months ago with FAAH. HCC's are discrete image analyses, so they are fairly steady, one at a time. DDDT tasks pack about 5 jobs, but unfortunately that one is on low rev.
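As a toy illustration of that cruise-control behaviour - this is not the actual WCG/BOINC scheduler code, just a sketch of a feedback loop with invented numbers:

```python
# Toy feedback loop: fold yesterday's observed runtimes back into today's
# duration estimate, like a cruise control homing in on a target speed.
# Purely illustrative - not the actual WCG/BOINC scheduler.
def updated_estimate(previous_hours, observed_hours_yesterday, gain=0.5):
    """Blend the old estimate with the average of yesterday's results."""
    observed_avg = sum(observed_hours_yesterday) / len(observed_hours_yesterday)
    return previous_hours + gain * (observed_avg - previous_hours)

estimate = 8.0                        # start out assuming 8-hour jobs
daily_results = [[11.0, 10.5, 12.0],  # day 1: jobs ran long (overshoot)
                 [9.5, 10.0, 9.0],    # day 2
                 [8.5, 9.0, 8.8]]     # day 3: settling toward the real average
for day in daily_results:
    estimate = updated_estimate(estimate, day)
    print(round(estimate, 2))
```

The gain controls how quickly the estimate chases the observations; too high and it overshoots like the old cruise control, too low and it takes ages to settle.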
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All!
knreed
Former World Community Grid Tech | Joined: Nov 8, 2004 | Post Count: 4504 | Status: Offline
I didn't mean to start all this confusion about run times. I just thought that, after the issues FA@H had this summer, all the jobs were able to run about 8 hours. Unfortunately there is more flexibility in the run times for FA@H than the other projects. Now I know that FA@H needs some longer run times, which cannot be unexpected. That is all I was looking for.

Due to the troubles from this summer - especially given that forecasting actual duration is very challenging - we modified the process for FightAIDS@Home so that we send a very small number of workunits out for each batch earlier, so we can get a rough estimate of the actual duration. This has improved our ability to size the workunits and reduced the risk of major workunit-duration issues like we saw earlier this year.
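As a rough illustration of that pilot-batch idea - the numbers and the packing rule below are hypothetical, not the actual WCG tooling - sizing could work along these lines:

```python
# Hypothetical sketch of the pilot-batch sizing idea: time a handful of
# workunits from a new batch, then decide how many dockings to pack per WU
# so the rest of the batch lands near a target duration.
def dockings_per_wu(pilot_hours, target_hours=8.0):
    """pilot_hours: measured CPU hours of single dockings from the pilot sample."""
    avg = sum(pilot_hours) / len(pilot_hours)
    return max(1, round(target_hours / avg))   # pack at least one docking

pilot = [2.1, 1.8, 2.4, 2.0]     # invented pilot measurements, in CPU hours
print(dockings_per_wu(pilot))    # -> 4 dockings per workunit here
```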
JmBoullier
Former Community Advisor | Normandy - France | Joined: Jan 26, 2007 | Post Count: 3715 | Status: Offline
The target should be more like 24 hours. The network traffic is about the same per job, no matter how long the job is. With the 10-day processing timeframe, an 8-hour granularity is way too fine. It generates unnecessary network traffic, which could easily be cut by 2/3 simply by using a 24-hour target length instead.

Your assumptions are not correct.

For the projects whose target duration can currently be adjusted easily, Rice and DDDT, the size of the output files is proportional to the duration, and the same goes for the input files of DDDT. For Rice the size is also proportional to the power of the processor, so if you want to enforce long durations for the fast processors too, the output files might become huge.

For the docking-like projects, FAAH and HPF2, saying that the size of the files would not be affected means that you would push the computation further. But if the scientists are satisfied with the current durations, then pushing the computation further would simply be a waste of time, and the productivity of these projects would be reduced in the same proportion as you increased each WU's duration.

Last, HCC. This project does picture analysis whose duration is more or less constant for a given processor. To increase the duration of HCC WUs the techs would have to pack several elementary analyses into a single WU. Since they already did this for DDDT, it would probably not be too much of a challenge for them. But in that case it is obvious that the size of the input and output files would also be proportional to the number of analyses packed into a single WU.

Cheers. Jean.
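A quick back-of-the-envelope comparison (with invented figures) shows why the two assumptions lead to different conclusions: if upload/download per job were fixed, a 24-hour target would indeed cut daily traffic by roughly 2/3, but if output grows with duration the saving largely disappears.

```python
# Back-of-the-envelope daily network traffic per (always-on) host under two
# assumptions about file sizes. All figures are invented for illustration.
def daily_traffic_kb(job_hours, fixed_kb_per_job, kb_per_cpu_hour):
    jobs_per_day = 24.0 / job_hours
    kb_per_job = fixed_kb_per_job + kb_per_cpu_hour * job_hours
    return jobs_per_day * kb_per_job

# Assumption A: ~400 KB per job regardless of length (the quoted claim).
print(daily_traffic_kb(8, 400, 0), daily_traffic_kb(24, 400, 0))   # 1200.0 vs 400.0
# Assumption B: output grows with duration (say 50 KB overhead + 45 KB per CPU hour).
print(daily_traffic_kb(8, 50, 45), daily_traffic_kb(24, 50, 45))   # 1230.0 vs 1130.0
```

The 2/3 saving only appears under assumption A; once output scales with crunch time, longer workunits mostly move the same bytes in fewer transfers.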
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
The machine used by Astrolab, the starter of this thread, is probably in this category

I have a broad range of systems, ranging from a Sempron and a Celeron up to a few pretty quick Athlon 64 X2s. My concern was machine monitoring, where a Celeron gets a FA@H job that requires 25 hours of crunch time while the machine is running at 70% and is not on all the time. It's a pain to have to monitor this computer for 3 days to see if there is a problem with a single job.

My question was really, although I did not say it: can the system avoid allocating long WUs to slow computers (especially ones that are not on 24x7), recognizing that the determination of what a long WU is and what a slow computer is are both arbitrary? I am guessing that the answer is NO, because a) the volume of long WUs is low, and b) as long as the WU is complete by the due date, WCG does not need to care how long it runs on one computer - which is an OK answer, I guess. That is, as long as the error rate on the long WUs is very low.
Sekerob
Ace Cruncher | Joined: Jul 24, 2005 | Post Count: 20043 | Status: Offline
astroLab,
As stated earlier in this thread, this is going to be a future feature... to be able to send work cut to size for fast and slow machines. cheers
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! |
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
astroLab,
As stated earlier in this thread this is going to be a future feature... to be able to send work cut to size for fast and slow machines.

Thanks Sek. I'm done with this thread.
Papa3
Senior Cruncher | Joined: Apr 23, 2006 | Post Count: 360 | Status: Offline
Your assumptions are not correct.

Your assumption that I was referring to any project other than FAAH is incorrect. Please note that this thread exists within the FAAH forum; statements made within it therefore refer to FAAH unless otherwise specified.