Thread Status: Active
Total posts in this thread: 39
This topic has been viewed 3880 times and has 38 replies
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Do not think it helps to stock up large caches...

... unless there was a technical glitch for not starting the job in time:

CMD2_0375-1I7X_A.clustersOccur-2CN5_A.clustersOccur_74_514520_516388_2--  614  Valid  18-4-10 17:08:32  19-4-10 15:02:15  5.52  101.4 / 81.5  < me
CMD2_0375-1I7X_A.clustersOccur-2CN5_A.clustersOccur_74_514520_516388_1--  614  Valid  8-4-10 17:13:07  19-4-10 08:18:06  3.78  74.6 / 81.5  < Too Late after No Reply
CMD2_0375-1I7X_A.clustersOccur-2CN5_A.clustersOccur_74_514520_516388_0--  614  Valid  8-4-10 17:08:00  11-4-10 15:50:21  5.75  88.3 / 81.5

Please abort tasks if you think you can't make the deadline. My computer wasted 5.52 hours computing this third, redundant copy.
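To put numbers on the listing above, here is a minimal Python sketch that works out how long each copy was out. It assumes the two timestamps on each row are the sent time and the return time, in day-month-year order; the function name `turnaround_days` and the `copies` table are made up for illustration.

```python
from datetime import datetime

# Assumption: dates in the listing are day-month-year with a 2-digit year.
FMT = "%d-%m-%y %H:%M:%S"

def turnaround_days(sent: str, returned: str) -> float:
    """Elapsed days between a copy being sent and its result coming back."""
    delta = datetime.strptime(returned, FMT) - datetime.strptime(sent, FMT)
    return delta.total_seconds() / 86400

# (sent, returned) per copy, copied from the three rows of the listing.
copies = {
    "_0": ("8-4-10 17:08:00", "11-4-10 15:50:21"),
    "_1": ("8-4-10 17:13:07", "19-4-10 08:18:06"),
    "_2": ("18-4-10 17:08:32", "19-4-10 15:02:15"),
}

for suffix, (sent, returned) in copies.items():
    print(f"copy {suffix}: {turnaround_days(sent, returned):.2f} days")
```

Under those assumptions copy _0 came back in roughly 2.9 days, copy _1 in roughly 10.6 days, and the repair copy _2 in under a day.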
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 1 times, last edit by Sekerob at Apr 19, 2010 3:25:07 PM]
[Apr 19, 2010 3:22:28 PM]
Mysteron347
Senior Cruncher
Australia
Joined: Apr 28, 2007
Post Count: 179
Re: Do not think it helps to stock up large caches...

Well - I'd disagree.

If you had a VERY large cache (I use 10 days) then a task sent 20100418 would be unlikely to be started 20100419.

Is the 'server abort' facility active on WCG - and if so, does it work? Was an abort sent by WCG when the 'too late' unit was received? I've seen other projects abort tasks. Is it used? Is it working? Does it work on a started task?

Or, for future consideration, should BOINC allow retrieval of a progress report so the deadline for a 'late' unit could be extended if progress is being made?

Perhaps the abort message wasn't sent because of the server problem at WCG to which knreed refers in the 'network outage' sticky - sadly, there's no reference to the actual times of the outage.

I'd suggest regular forum-readers are well aware that a few particularly chewy units can push others over the deadline, and monitor progress closely.

Could a large cache be the solution to Novosel's "shut down pc options" thread in the BOINC Agent Support forum? 5-6 days should ensure that there's always work available, and that should also be short enough to ensure that the work is done before the deadline, even when unusually long runtimes are encountered.

A small cache might suit machines that are permanently-connected under normal circumstances, but we've had outages that have lasted days (as have other BOINC projects) and when that happens (a long outage either end) a small cache is not as good as a large one. All sorts of problems can occur at either end - fire, flood, snow, equipment failure. Even some clown digging up a fibre-optic cable...

Overall, I'd say a nice chunky cache is preferable to a small one.
[Apr 19, 2010 6:32:25 PM]
Sekerob
Re: Do not think it helps to stock up large caches...

Running a 10-day cache is near guaranteed to cause tasks to be reported late on a frequent basis. Whilst the server abort works fine, it only helps if the repair client [which is known to return a result within 48 hours, and more often than not within 24 hours] talks to the server again and has NOT yet started the job. Here the job had already run, so my computer's time was wasted rather than furthering the progress of the project, and not for the first time. With the high variability of run times across batches, I'd say 7-8 days is probably the maximum for this project.
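A back-of-the-envelope model of why a cache close to the deadline leaves no slack, as a minimal Python sketch. The assumptions are simplifications, not how the BOINC client actually schedules: the host crunches 24/7, tasks run first-in first-out, and the 10-day deadline is the WCG general deadline mentioned in this thread. The function names and the `overrun` factor are hypothetical.

```python
def worst_case_finish_days(cache_days: float, overrun: float, task_days: float) -> float:
    """Days after download at which the last queued task finishes.

    cache_days: estimated work queued ahead of the task (the cache setting)
    overrun:    actual runtime divided by estimated runtime for the queued work
    task_days:  actual runtime of the task itself, in days
    """
    return cache_days * overrun + task_days

def misses_deadline(cache_days: float, overrun: float, task_days: float,
                    deadline_days: float = 10.0) -> bool:
    """True if the last queued task would finish after the deadline."""
    return worst_case_finish_days(cache_days, overrun, task_days) > deadline_days

# A 10-day cache misses even when every estimate is exact,
# while a 7-day cache absorbs a 30% runtime overrun.
print(misses_deadline(10, 1.0, 0.25))  # True
print(misses_deadline(7, 1.3, 0.25))   # False
```

In this simple model the margin is the deadline minus the cache size, and it is eaten up by however much longer the queued tasks run than estimated.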

On WCG's end, the last downtime of more than 24 hours was the planned one during the 2009 move of the servers to a Canadian-based center. How often do these floods, rains, snow, fires, and ISP collapses occur on your end, and what is the longest, in days, that one has lasted?

Future client: I'm absolutely all for a client aborting tasks that it foresees will not be completed by the deadline... if the situation improves, the client can always ask for new work to backfill, but that's not the here and now. I'm asking members to keep the cache down to a level where others are not faced with a No Reply that still gets reported Too Late, making a quorum of 3 where a quorum of 2 suffices.
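What such a client-side check might look like, as a minimal Python sketch. This is not how the BOINC client works today; it is a hypothetical greedy pass that runs tasks in deadline order on one core and flags the ones predicted to finish late. `Task`, `tasks_to_abort` and the `duty` parameter (the fraction of wall-clock time the host actually crunches) are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    remaining_hours: float    # estimated CPU time still needed
    hours_to_deadline: float  # wall-clock time left until the report deadline

def tasks_to_abort(queue: list[Task], duty: float = 1.0) -> list[str]:
    """Names of tasks predicted to miss their deadline on a single core."""
    elapsed = 0.0
    doomed = []
    # Earliest deadline first: the order that gives each task its best chance.
    for task in sorted(queue, key=lambda t: t.hours_to_deadline):
        finish = elapsed + task.remaining_hours / duty
        if finish > task.hours_to_deadline:
            # Skipping the task frees its time for the tasks behind it.
            doomed.append(task.name)
        else:
            elapsed = finish
    return doomed

queue = [Task("a", 6, 5), Task("b", 4, 10), Task("c", 8, 11)]
print(tasks_to_abort(queue))  # ['a', 'c']
```

Aborting "a" and "c" early would let the server reissue them straight away instead of waiting for a No Reply.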

The post I made was well before the outage, i.e. it had no relation to this one either.
[Edit 1 times, last edit by Sekerob at Apr 19, 2010 6:56:50 PM]
[Apr 19, 2010 6:52:31 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Do not think it helps to stock up large caches...

Sekerob wrote: "Running a 10-day cache is near guaranteed to cause tasks to be reported late on a frequent basis."

If that is the case, then why do we allow people to cache that much in the first place? Why not have a 10-day deadline, but allow, for instance, a maximum of 7 days' cache?
[Apr 19, 2010 7:00:18 PM]
Sekerob
Re: Do not think it helps to stock up large caches...

PS: the more tasks in your cache, the more CPU time the boinc.exe core client soaks up, particularly if the BOINC Manager is open, because the task list is refreshed every second. A large enough number of tasks can delay responses so much that the BOINC Manager has trouble connecting to the core client and throws timeouts. I'm not even sure whether it may help cause the "if this happens frequently, consider to reset the project" message.
[Apr 19, 2010 7:04:07 PM]
Sekerob
Re: Do not think it helps to stock up large caches...

Sekerob wrote: "Running a 10-day cache is near guaranteed to cause tasks to be reported late on a frequent basis."

Former Member wrote: "If that is the case, then why do we allow people to cache that much in the first place? Why not have a 10 day deadline, but allow, for instance, a max of 7 days cache?"

It's a limitation of the BOINC client, which happens to coincide with the WCG general deadline... a deadline set to allow almost all part-time crunchers to complete a task comfortably. I'm not sure WCG could even limit how much work a client caches. WCG will give a maximum of 15 tasks per call and 80 per core per day, as a generalized rule to cater for even the most powerful devices.
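To see how the quoted limits compare with a large cache, here is a small Python sketch of the arithmetic. The two constants are the figures from the post; the 4-core host and the 5.5-hour average task (close to the 5.52 hours in the opening post) are illustrative assumptions, and the helper names are hypothetical.

```python
import math

MAX_PER_CALL = 15          # tasks per scheduler request, per the post
MAX_PER_CORE_PER_DAY = 80  # daily task quota per core, per the post

def daily_quota(cores: int) -> int:
    """Most tasks a host could be sent in one day."""
    return MAX_PER_CORE_PER_DAY * cores

def calls_needed(tasks: int) -> int:
    """Scheduler requests needed to download that many tasks."""
    return math.ceil(tasks / MAX_PER_CALL)

def tasks_for_cache(cache_days: float, cores: int, avg_task_hours: float) -> int:
    """Tasks needed to fill the cache on a host crunching 24/7 on all cores."""
    return math.ceil(cache_days * 24 * cores / avg_task_hours)

needed = tasks_for_cache(cache_days=10, cores=4, avg_task_hours=5.5)
print(needed, calls_needed(needed), daily_quota(4))  # 175 12 320
```

With these numbers a 10-day cache is 175 tasks, well under the 320-task daily quota, so the quota would not stop a host from filling a 10-day cache in a single day.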
[Edit 1 times, last edit by Sekerob at Apr 19, 2010 7:10:55 PM]
[Apr 19, 2010 7:09:15 PM]
Former Member
Re: Do not think it helps to stock up large caches...

Sekerob wrote: "Running a 10-day cache is near guaranteed to cause tasks to be reported late on a frequent basis."

Former Member wrote: "If that is the case, then why do we allow people to cache that much in the first place? Why not have a 10 day deadline, but allow, for instance, a max of 7 days cache?"

Sekerob wrote: "It's the limitation of the BOINC client which happens to coincide with the WCG general deadline... which is one set to allow most all part time crunchers to complete a task comfortably. Not sure if WCG could even limit that cached work provision. It will give a max of 15 per call, 80 per core per day as a generalized rule to cater for even the most powerful devices."

I wonder what percentage of WUs are returned with "Too Late" status. It seems like a waste of resources to me. One would think there should be some kind of server-side code to say "Sorry, you are requesting more than the maximum allowed cached work of 7 days; sending 7 days of work."
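The kind of server-side check suggested above could be as small as the following Python sketch. It is purely hypothetical: it does not reflect the real BOINC scheduler, and `MAX_CACHE_DAYS` and `grant_work_days` are invented names, with the 7-day cap taken from the suggestion in this post.

```python
MAX_CACHE_DAYS = 7.0  # hypothetical project-wide cap, the figure suggested above

def grant_work_days(requested_days: float,
                    max_days: float = MAX_CACHE_DAYS) -> tuple[float, str]:
    """Return (days of work to send, message for the client log)."""
    if requested_days < 0:
        raise ValueError("requested_days must be non-negative")
    if requested_days > max_days:
        message = (f"Sorry, you are requesting more than the maximum allowed "
                   f"cached work of {max_days:g} days; "
                   f"sending {max_days:g} days of work.")
        return max_days, message
    # Requests within the cap are granted unchanged, with no message.
    return requested_days, ""

print(grant_work_days(10))
print(grant_work_days(3))
```

A request for 10 days is cut to 7 with the explanatory message, while a request for 3 days passes through untouched.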
[Apr 19, 2010 8:36:54 PM]
Movieman
Veteran Cruncher
Joined: Sep 9, 2006
Post Count: 1042
Re: Do not think it helps to stock up large caches...

I cache 3 days' work.
That should cover 99.999999% of the outages that WCG could possibly have and also keep the amount of cached work "under control" in case I need to pull a machine down for a day for repair or upgrade.
[Apr 19, 2010 8:46:55 PM]
Former Member
Re: Do not think it helps to stock up large caches...

Don't be too quick to blame an oversized cache. I run 1.5 days and have several machines that recently went No Reply, then Valid, because they were off while the owner was out of town. These boxes are normally on 24/7 but get turned off when the owner has to leave town. A recent trip planned for 2 days turned into 8.

While I was not your wingman, I could have been, and I had to look to be sure.

Another frustrating "feature" is that the job I was No Reply on had actually been completed and uploaded 8 days prior, but sat in "Ready to Report" because the client machine was turned off.
[Apr 19, 2010 10:03:37 PM]
Sekerob
Re: Do not think it helps to stock up large caches...

fredski,

Not seeking to blame, just trying to learn and, where possible, turn things around so we can cut the average return time down further, within the system as it currently works.

As for the business trips etc., that falls under the first broad line of the opening post:
"... unless there was a technical glitch for not starting the job in time"

Those who see this most frequently are those in the < 2 days return group... these devices do the repair and make-up work and are often called on to act as verifiers for inconclusives... no problem, but being the 3rd copy for a No Reply that still comes in... let's work to minimize those with a little help from each other.
[Apr 19, 2010 10:25:51 PM]