World Community Grid Forums
Category: Completed Research | Forum: Help Cure Muscular Dystrophy - Phase 2 | Thread: Do not think it helps to stock up large caches...
Thread Status: Active | Total posts in this thread: 39
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Sorry Sek, that was not intended as it came across; I got pulled away and did not finish my train of thought.
----------------------------------------
I have also seen this, and I also have a PV list that always seems to have a few wingmen go "No Reply" after waiting a week or longer to validate. I had 3 clear today: 2 from retreads, 1 late return. My personal feeling is that this happens most on Mondays, on jobs with weekend complete-by times. Maybe boxes get shut down Friday night and don't get restarted until Monday AM? I know we have a good number of corporate machines crunching that follow this pattern.

I would like to see some type of check-in system for jobs that are uploaded but not yet "Reported"; I'm not sure whether all the data needed is uploaded at completion or not. It could have saved at least one job that I know of from getting reissued. "Server Abort" does work on WCG, but only if the client reports to the server AND the job has not yet started. Likewise on reissues: if a late job comes in, the reissue will get a "Server Abort" IF it contacts the server AND has not started.

For systems that are only on part of the day (day workers, etc.), the client does adjust the cache accordingly: e.g. on 12 hrs, off 12 hrs, with a cache setting of 1.5 days, the client only downloads approx. 18 hrs of work. Within the constraints of the DCF, that is ;)

I have been crunching here almost 3 years and have never run out of work. I have intermittent systems, some even on dial-up that only connect when they need to, and I have still never had a need for a large cache. The only reason I use 1.5 days is because it gives me the smallest PV list; 95% of all WUs return in 2-3 days.

EDIT: the only thing I see a 10-day cache doing is holding jobs on that computer for 8-9 days before they get started, when they could have gone to someone else and been back a week sooner.

[Edit 1 times, last edit by Former Member at Apr 20, 2010 12:25:51 AM]
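The uptime arithmetic in the post above (1.5-day cache, machine on 12 hours a day, roughly 18 hours of work fetched) can be sketched as a toy model. This is illustrative only, not BOINC's actual work-fetch code; the function name and the simple division by DCF are assumptions:

```python
def approx_work_fetched(cache_days, hours_on_per_day, dcf=1.0):
    """Toy model: hours of work a client might download.

    The client scales its cache by the fraction of the day the
    machine is on; the Duration Correction Factor (DCF) inflates
    runtime estimates, so a higher DCF means fewer hours fetched.
    Illustrative assumption only -- not BOINC's real logic.
    """
    uptime_fraction = hours_on_per_day / 24.0
    return cache_days * 24.0 * uptime_fraction / dcf

# The poster's case: 1.5-day cache, on 12 h / off 12 h per day
print(approx_work_fetched(1.5, 12))  # -> 18.0
```

With a DCF above 1.0 (runtimes longer than estimated), the same settings fetch proportionally less work, which matches the "within the constraints of the DCF" caveat.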
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I set my cache for 0.1 days.
I can't remember a time when I've run out of work. There is also that check box to prevent such a thing... you know which box.
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7595 Status: Offline Project Badges: |
I have machines which are only intermittently connected to the internet. On these machines I use a 4- or 5-day cache, depending on the speed of the machine. This gives room for some margin of error. I take my modem and connect these machines about that often. I expect these machines to crunch 24/7, but once in a while something will happen, such as a power outage, hard drive failure, etc., and I lose jobs. Technical glitches, so to speak.
----------------------------------------Given the number of machines on WCG, it is statistically probable that there will be some small percentage every day which do not return work, for various reasons. The system is designed to take this factor into account with re-issues etc. So, as the song says, "Don't worry, be happy." The work will get crunched, just not as fast as we would all like it to. Cheers
Sgt. Joe
*Minnesota Crunchers* |
Mysteron347
Senior Cruncher Australia Joined: Apr 28, 2007 Post Count: 179 Status: Offline Project Badges: |
I'd go along with Sgt. Joe on this, and I've reset my cache to 5 days.
If a 10-day cache "near guarantees" late reports, then this indicates that the job run-time is being consistently underestimated. With such a large cache, the long jobs should cancel the short. Certainly, manual cancellation simply delays the WU completion overall. I found I was cancelling a few jobs once a month or so, and reducing the cache should make that unnecessary.

Again, we should be careful not to generalise too much. Since I'm unemployed, I have plenty of time to monitor progress and compensate - gives me something to do. An unattended installation will be different, no doubt.

As for "How often do these floods/rains/snow/fires/ISP collapses occur on your end and what has been their max days calamitous duration?"

Er, well, floods aren't a problem here for me. It gets a bit hard even for Mother Nature to flood the Indian Ocean - and we'd have a few more urgent problems than WCG runtimes if that occurred.

Rain? That's another matter. The first rains of winter tend to wash out the dust accumulated on the insulators over summer. That normally leads to a short power outage, sometimes hours but it can be a day or more. Power is more often interrupted by the latest speeding clown wrapping themselves around a power pole, though. That has been known to happen every Saturday night, sometimes for weeks on end.

Snow? Snow within 1000 km would be all over the news. More of a problem in Canadia and Russia and Northern Europe, I hear.

Did have a bit of a hailstorm a few days ago, though. It didn't affect me, but some places not too far away were blacked out for a couple of days. The same places were affected similarly by fires a few months back. Not as bad as the Victorian fires last year, which coincided with the Queensland floods - and there have been more floods in the East this year. Priority for some reason is to get the people and animals safe first, then make sure they're supplied with beer and food. Inexplicably, restoring internets is further down on the list...
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Sgt.Joe wrote: So, as the song says, "Don't worry, be happy." The work will get crunched, just not as fast as we would all like it to.

Yes, it will get crunched, but the point here is that we are unnecessarily wasting other users' CPU time on make-up work, because a 10-day cache = a 10-day deadline.
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Another "feature" that is frustrating: the job I went No Reply on was actually completed and uploaded 8 days prior, but sat in "Ready to Report" because the client machine was turned off. I've never understood this behavior in BOINC. I don't see the point of this 2-step activity to report a completed WU (Finish WU -> Ready to Report -> Sent).
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Former Member wrote: Another "feature" that is frustrating: the job I went No Reply on was actually completed and uploaded 8 days prior, but sat in "Ready to Report" because the client machine was turned off. I don't see the point of this 2-step activity to report a completed WU (Finish WU -> Ready to Report -> Sent).

The 2-step exists for good reasons. First, to make absolutely sure that what the client sent is the same as what the servers received; that first step is just a flat database taking in the data records. The second part goes to the highly taxed scheduler, which determines what you have / had / are receiving. You don't want, like yesterday, 635,000 unique hits on the scheduler doing all the calculations for each (7.4 times per second). You want to combine multiple Ready to Reports (RtR) so the scheduler can handle them as a single transaction.

In the example case, the RtR would at latest have been reported within 24 hours, but absent any contact, nothing can be done. If one knows these road trips will happen, briefly look in and hit the Update button again. That's what I do before closing the lid. Very much in a nutshell; please read the BOINC Wikis and FAQs for more info if interested.

PS: fredski..., there is an incentive. The fast returners are, as much as possible, matched to fast returners for quorum requirements... at least, I'm observing dramatic improvement from what it was before. Not a lot of work sits in PV jail at any one time. Those that don't get matched is simply a case where a job was sent to number 1, and the scheduler will only hold its partner a limited time before sending number 2 to anyone who asks for a job. Even HPF2, which requires a minimum of 15 to start quorum checking, is usually complete here within 24 hours. Who can resist that? For WCG it means hundreds of thousands fewer results in PV jail held on the scheduler, and at least an equal number fewer In Progress, if we can get that cache number down collectively.
WCG Global & Research > Make Proposal Help: Start Here!
----------------------------------------Please help to make the Forums an enjoyable experience for All! [Edit 1 times, last edit by Sekerob at Apr 20, 2010 5:58:46 AM] |
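Sekerob's point about batching multiple Ready-to-Report results into a single scheduler transaction can be sketched like this. The class and method names are invented for illustration; this is not BOINC client code:

```python
# Illustrative sketch of why "Ready to Report" exists as a
# separate state: completed results queue up locally and are
# handed to the scheduler in ONE RPC, instead of one RPC per
# result. All names here are assumptions for illustration.
from dataclasses import dataclass, field

@dataclass
class Client:
    ready_to_report: list = field(default_factory=list)
    rpcs_made: int = 0

    def finish_task(self, name):
        # Step 1: output files uploaded, result marked Ready to Report
        self.ready_to_report.append(name)

    def contact_scheduler(self):
        # Step 2: a single RPC reports every queued result at once,
        # so the scheduler does its bookkeeping in one transaction
        batch = self.ready_to_report
        self.ready_to_report = []
        self.rpcs_made += 1
        return batch

c = Client()
for n in ("faah_001", "faah_002", "hcmd2_003"):
    c.finish_task(n)
reported = c.contact_scheduler()
print(len(reported), c.rpcs_made)  # -> 3 1
```

Three finished results, one scheduler contact: that is the load-saving Sekerob describes, and also why a result can sit in "Ready to Report" for days if the machine is switched off before the next scheduler contact.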
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Mysteron347 wrote: [snip]
If a 10-day cache "near guarantees" late-reports, then this indicates that the job run-time is being consistently underestimated. With such a large cache, the long jobs should cancel the short.

With HFCC/FAAH/HCMD2/DDDT2 it is near impossible to get exact estimates... remember, most of what we calculate is non-deterministic. Statistically one would expect long and short to cancel out, but the controls work on the current series of work to guesstimate the TTC of the rest. So if there is a short series of HCMD2 grandchildren that came with a 4.5-hour estimate [the project running average] and they take just 1-3 hours, the backfill is affected. Then when the parents come in again with 6-12 hour run times... panic, and that changes client behavior for longer, and we see posts about "Why is my client not getting work?"

Not going to expand on this: all forum frequenters have seen many discussions on Duration Correction Factor (DCF) blow-outs on the client side due to the variability. I'm reading of future developments (following the developers' check-in list, and I saw a few cheers by knreed elsewhere) that will mitigate this further, but that will be next year at the earliest.
WCG Global & Research > Make Proposal Help: Start Here!
----------------------------------------Please help to make the Forums an enjoyable experience for All! [Edit 1 times, last edit by Sekerob at Apr 20, 2010 6:19:45 AM] |
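The DCF "blow-out" behaviour described above can be modelled with a deliberately simplified update rule. This is an assumption loosely inspired by how older BOINC clients were reported to behave, not the actual algorithm: the factor jumps up immediately when a job runs longer than estimated, but decays back down only gradually, which is why one batch of long "parents" after short "grandchildren" can depress work fetch for a while.

```python
def update_dcf(dcf, estimated_hours, actual_hours):
    """Simplified DCF update (illustrative assumption, not the
    real BOINC algorithm): raise immediately on underestimates,
    ease back down slowly on overestimates."""
    ratio = actual_hours / estimated_hours
    if ratio > dcf:
        return ratio                    # long job: jump up at once
    return 0.9 * dcf + 0.1 * ratio      # short jobs: drift down slowly

dcf = 1.0
# Short HCMD2 "grandchildren": 4.5 h estimate, ~2 h actual.
# After five of them the DCF has only drifted down to ~0.77.
for _ in range(5):
    dcf = update_dcf(dcf, 4.5, 2.0)
# One long "parent" (4.5 h estimate, 9 h actual) doubles it at once,
# so every remaining estimate suddenly reads twice as long.
dcf = update_dcf(dcf, 4.5, 9.0)
print(round(dcf, 2))  # -> 2.0
```

The asymmetry (fast up, slow down) is the client protecting itself against missed deadlines, at the cost of the "Why is my client not getting work?" symptom when estimates swing.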
PecosRiverM
Veteran Cruncher The Great State of Texas Joined: Apr 27, 2007 Post Count: 1053 Status: Offline Project Badges: |
My problem: I just bumped my cache (going out of town) and got 6 repair jobs on a slow duo (30 hr/WU). I sure hope they run a little faster than listed.
---------------------------------------- |
nasher
Veteran Cruncher USA Joined: Dec 2, 2005 Post Count: 1422 Status: Offline Project Badges: |
Yes, that is the biggest problem with this project: the inability to judge ahead of time how big a work unit will be. If at all possible, keep your machine connected at all times with a very small cache, and that should keep you crunching well. If you can't, then you can't.
----------------------------------------Honestly, if a work unit doesn't get returned, then it doesn't get returned, and someone else will crunch it. I can't remember what project it was, but at one time I had over 8 pages of Pending Validations with just 4 CPU cores running in total. Back when there were RICE units to crunch, it was easy to figure out how many you needed, because they were basically a set run time. Crunch and be happy... and go for the next level of badge.