World Community Grid Forums
Thread Status: Active | Total posts in this thread: 42
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
Sekerob,
I chose 18000 seconds, which is 12.5 days. I'd be more than surprised if I ever see a WU that runs that long. Checkpoints? We don't need no stinking checkpoints! All they do is consume time that my CPUs could be spending on something interesting.
Sekerob
Ace Cruncher | Joined: Jul 24, 2005 | Post Count: 20043 | Status: Offline
Seconds, not minutes! A day has 86,400 of them. If you are on client 6.6 you have to multiply the 18000 by the number of concurrent threads running; on an i7 with HT, in fact, multiplied by 8. In client 6.2 there was still some logic to it: 18000 seconds per concurrent job.
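For concreteness, here is a minimal sketch of how this preference can look in a BOINC global_prefs_override.xml. The disk_interval element is assumed here to be the "write to disk at most every N seconds" setting under discussion; element names may vary between client versions, so treat this as illustrative only:

    <global_preferences>
        <!-- "Write to disk at most every N seconds" -->
        <!-- 18000 s = 5 hours (not 12.5 days; that would be 18000 minutes) -->
        <disk_interval>18000</disk_interval>
    </global_preferences>

Under a 6.6 client, per the explanation above, the effective gap per task is this value times the number of concurrent tasks: 18000 × 8 = 144,000 seconds on an i7 with HT, roughly 1.7 days between checkpoints.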
Very off topic, sorry, but you're still doing better for WCG, given that the time between fetching work and cashing in the points is shorter for you than on my system.

WCG
Please help to make the Forums an enjoyable experience for All!
Greg Lyke
Advanced Cruncher | Joined: May 30, 2008 | Post Count: 50 | Status: Offline
That 18000 number jumped out at me too; all of a sudden I was questioning my understanding of its purpose.
Now that Barney's reasoning has been explained (nice machine, btw), how much time really is spent writing to the disk? On a stable and reliable (but much older and slower) machine, what is a good figure to put there? I have it set to 30 seconds, mainly because constantly saving and backing up my work has been beaten into my head by a lifetime of somewhat unreliable machines and power grids. Would there be a significant decrease in crunching time if that variable were increased to several minutes, or even an hour?
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
> Seconds, not minutes! A day has 86,400 of them.

DOH!!!! I know that!!! I just haven't had enough coffee (usually two pots to wake me up).

> If you are on client 6.6 you have to multiply the 18000 by the number of concurrent threads running; on an i7 with HT, in fact, multiplied by 8. In client 6.2 there was still some logic to it: 18000 seconds per concurrent job.

That's an interesting aspect I didn't know... thanks for the enlightenment.

> Very off topic, sorry, but you're still doing better for WCG, given that the time between fetching work and cashing in the points is shorter for you than on my system.

Oh well... BTW, all the systems I build are sitting behind some BIG UPS systems. OK, OK, the monitor goes dark, but I don't care. This system will continue on for about 16 hrs before the UPS gives up the ghost and the system tumbles to the ground.
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
> That 18000 number jumped out at me too; all of a sudden I was questioning my understanding of its purpose. Now that Barney's reasoning has been explained (nice machine, btw), how much time really is spent writing to the disk?

Likely not enough for the average person to measure over a day, unless you are checkpointing significantly more frequently than about every 30 minutes.

> On a stable and reliable (but much older and slower) machine, what is a good figure to put there? I have it set to 30 seconds, mainly because constantly saving and backing up my work has been beaten into my head by a lifetime of somewhat unreliable machines and power grids. Would there be a significant decrease in crunching time if that variable were increased to several minutes, or even an hour?

Me, myself n' I would choose 30 minutes; having to restart back 30 minutes from the last checkpoint is no real biggie. Checkpointing every 30 seconds... um... er... I wouldn't do it that frequently. Now, I might make an exception to the every-30-seconds rule if it took, say, 48 hrs of CPU time for a WU to complete. But I'm sure there are others here with much better answers.

BTW, for the record: if I'm about to do something that I know will force a restart, I change the checkpointing to about 30 seconds and let it run for 5 mins, just so when I come back up I'm close enough. After the system comes up and starts to churn again, I turn the checkpointing off completely again.

[Edit 1 times, last edit by Former Member at Jul 3, 2009 1:40:35 PM]
Sekerob
Ace Cruncher | Joined: Jul 24, 2005 | Post Count: 20043 | Status: Offline
We have:
1. FAQ for work-file sizes
2. FAQ for checkpoint frequencies

You can set it to 30 seconds, but if the science internally checkpoints only every 20 minutes or so (usually when a seed/docking/position attempt is complete), it won't do it more often. If the work file is large at checkpoint time you'll notice, particularly on a busy system that's tight on RAM; for HCMD, though, the work file is smaller than small. The default is 60 seconds, thus effectively 240 seconds for a quad under 6.6. I wanted to achieve an "at most" of 5 minutes, so mine is set to 75 seconds. For 6.2.28 that value would have to be 300. But what is wisdom here?
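To make that arithmetic concrete, here is a sketch of the same assumed disk_interval setting (see the caveat on the earlier example) realising the 5-minute target on a quad under 6.6:

    <global_preferences>
        <!-- 6.6 client, 4 concurrent tasks: effective gap = 75 × 4 = 300 s = 5 min -->
        <!-- under 6.2.28 the value applies per job, so you would set 300 directly -->
        <disk_interval>75</disk_interval>
    </global_preferences>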
WCG
Please help to make the Forums an enjoyable experience for All! |
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
Sekerob,
Thanks for that info. Is there any way to have the client take the data on the WUs and spit it out to a spreadsheet for us? There's absolutely nothing there that identifies anyone, so I can't imagine any PI/SPI issues at all; but then again, I'm a neophyte.
Sekerob
Ace Cruncher | Joined: Jul 24, 2005 | Post Count: 20043 | Status: Offline
Other than adding a request for an XML export function to that link, for easy export of the full Result Status content rather than copy/paste of 15 results per page, I'm unaware of anything better. I'm content with BOINCview for listing and filtering on project and run-time hours and averages.
WCG
Please help to make the Forums an enjoyable experience for All! |
gb009761
Master Cruncher | Scotland | Joined: Apr 6, 2005 | Post Count: 3010 | Status: Offline
BarneyBadass
Firstly, I'm sorry to hear that you've got someone close to you who would really benefit from any advances this project can bring. Obviously, we're all hoping that each and every one of these projects will bring some benefits/advances towards cures etc. in as short a time as possible. I suppose the one REAL thing each and every one of us can do is recruit more people to WCG; that way, more WUs will be crunched and the results will get back into the scientists' hands as quickly as possible.

As to running projects split up onto different processors etc.: yes, that would be good (especially now that more and more multi-core processors are coming to the market), although if and when BOINC would ever get around to providing that functionality, who knows...

Finally, as to escaping PV purgatory, there is one way, and that is to crunch only single-quorum projects, knowing that, by the sheer weight of WCG, others will be crunching HCMD2 WUs. Okay, you probably wouldn't get your Gold badge, but by the sounds of it, that's not what you're crunching this project for.
JmBoullier
Former Community Advisor | Normandy - France | Joined: Jan 26, 2007 | Post Count: 3716 | Status: Offline
> BTW, for the record: if I'm about to do something that I know will force a restart, I change the checkpointing to about 30 seconds and let it run for 5 mins, just so when I come back up I'm close enough. After the system comes up and starts to churn again, I turn the checkpointing off completely again.

Hi Barney! Changing the write-to-disk interval just "when you need it" does not work, unfortunately, and you could check this if you had set your cc_config.xml file to have checkpoints logged in your messages. Applications query the BOINC client about this parameter only once, when they start or restart; they then use that value until the end of the run (completion or shutdown).

A few other comments on your topic. I usually work with a short queue (0.15 day currently, crunching HCMD2 only) and thus I have many WUs in PV status too. But on average I seldom have more than one day of work in PV, i.e. six days of runtime, since I have six WUs active at a time. The last time I made a snapshot of my Result Status pages, the PV tasks totalled 6.5 days of runtime. When there are more, it is because the estimated complexity of jobs is far from reality, as when HCMD2 started recently, and in that case having a dedicated feeder for fast returners would be defeated too.

Over the duration of a project, the time WUs stay in PV status has no noticeable influence on how long the project takes: we are talking about a few more days of delay for each WU versus several months or years for the project. Practically, if the average PV time is increased by one or two weeks, the total project will finish one or two weeks later, that's all.

I think your suggestion is interesting, particularly for reducing the number of entries in the database at a given time, but I am not sure the added complexity makes it really practical. Also, the techs have other, simpler means to reduce the number of open WUs in their database if they need to, for example by increasing the average duration of WUs. But that does not forbid discussing it, obviously.

Jean.

[Edit 1 times, last edit by JmBoullier at Jul 3, 2009 7:21:10 PM]
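To see the checkpoint logging Jean refers to, the client's cc_config.xml log flags can be used. A minimal sketch, assuming the checkpoint_debug flag available among BOINC's log flags (check your client version's documentation for the exact set):

    <cc_config>
        <log_flags>
            <!-- write a message to the event log each time a task checkpoints -->
            <checkpoint_debug>1</checkpoint_debug>
        </log_flags>
    </cc_config>

With this enabled, you can watch the messages and confirm that a changed write-to-disk interval only takes effect after the application restarts, exactly as described above.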