Thread Status: Active
Total posts in this thread: 51
Posts: 51   Pages: 6
This topic has been viewed 63147 times and has 50 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: No Tasks Available - All Projects

to whoever fixed the problem... well done and thanks!
----------------------------------------
[Edit 1 times, last edit by Former Member at Dec 10, 2013 1:54:13 PM]
[Dec 10, 2013 1:53:49 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: No Tasks Available - All Projects

New tasks flowing.....many thanks biggrin
[Dec 10, 2013 1:55:38 PM]
CandymanWCG
Senior Cruncher
Romania
Joined: Dec 20, 2010
Post Count: 421
Status: Offline
Re: No Tasks Available - All Projects

All's well when it ends well. I got myself a couple of shiny Betas! cool
----------------------------------------
Knowledge is limited. Imagination encircles the world! - Albert Einstein



[Dec 10, 2013 2:23:50 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: No Tasks Available - All Projects

Here is the post describing the outage: https://secure.worldcommunitygrid.org/forums/wcg/viewthread_thread,35980

As for the comments regarding 24x7 support: all programs and systems should be evaluated and reviewed against the level of uptime that is desired. As you go from a target of 95% -> 98.5% -> 99% -> 99.9% -> 99.999%, the cost rises significantly with each step. IBM provides us with a solid budget to run this program. Within that budget, we have to decide how much we are going to spend on redundancy for server infrastructure, how much on manpower for support and responding to incidents, how much on onboarding more research projects, how much on developing the website, and how much on responding to emails, forums, social media, etc.

Our target for uptime at the application layer for World Community Grid is 99.0%. This means that each year our goal (excluding planned maintenance) is to be available 8,672 out of 8,760 hours. We have usually been closer to 99.5%. Early this year we had 3 incidents that caused some extended downtime, and we are unfortunately going to be close to 99.0% this year. Note that the hosting infrastructure has a 24x7 staff and has higher availability targets.
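To make the arithmetic behind these targets concrete, here is a minimal sketch that converts each uptime percentage mentioned above into the hours of unplanned downtime it allows per year. It assumes a non-leap year of 8,760 hours, as in the post; the function name `downtime_budget_hours` is just an illustrative choice.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours in a non-leap year


def downtime_budget_hours(target_percent: float) -> float:
    """Hours of unplanned downtime per year allowed by an uptime target."""
    return HOURS_PER_YEAR * (100.0 - target_percent) / 100.0


# The targets listed in the post, from least to most demanding.
for target in (95.0, 98.5, 99.0, 99.9, 99.999):
    allowed = downtime_budget_hours(target)
    print(f"{target:>7}% uptime -> {allowed:8.2f} h/year of downtime allowed")
```

For the 99.0% target this works out to 87.6 hours of downtime per year, which matches the goal of roughly 8,672 available hours out of 8,760.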

We do not like having outages and we work to keep the system at a high level of availability. However, we do feel that the target of 99.0% availability is the right balance for the use of our budget on this project.

For those of you frustrated by having your machines idled for part of the time last night, I encourage you to learn about the ability to control how much work is buffered on your devices. You can instruct the client to store X hours of work on your machine so that you will have a supply of work to run locally during events such as this. For those of you new to us, outages of this duration are quite rare, and a buffer of 8-12 hours is the most that I would recommend storing.
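As a rough illustration of the buffer setting, the sketch below converts a buffer given in hours into the day-based values that a local preferences override file would hold. It assumes the client is BOINC and that the override file is `global_prefs_override.xml` with `work_buf_min_days` and `work_buf_additional_days` elements; the post itself names neither, so check the element names against your client's documentation before relying on them. The helper names are hypothetical.

```python
def buffer_days(hours: float) -> float:
    """Convert a desired work buffer from hours to days."""
    if hours < 0:
        raise ValueError("buffer must be non-negative")
    return hours / 24.0


def prefs_override_xml(min_hours: float, extra_hours: float = 0.0) -> str:
    """Build an override fragment (element names assumed from BOINC)."""
    return (
        "<global_preferences>\n"
        f"  <work_buf_min_days>{buffer_days(min_hours):.2f}</work_buf_min_days>\n"
        f"  <work_buf_additional_days>{buffer_days(extra_hours):.2f}"
        "</work_buf_additional_days>\n"
        "</global_preferences>\n"
    )


# A 12-hour buffer, the upper end of the range recommended in the post.
print(prefs_override_xml(min_hours=12))
```

The same values can normally be set through the client's preferences dialog or the project website, which is the simpler route for most people.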
[Dec 10, 2013 2:39:10 PM]
CandymanWCG
Senior Cruncher
Romania
Joined: Dec 20, 2010
Post Count: 421
Status: Offline
Re: No Tasks Available - All Projects

Hi Kevin,

Thank you for the detailed explanation and updates. Indeed, 99% uptime is more than reasonable as it is, and I believe many of us understand the reasons behind it, especially given the facts that you have provided.

Not to beat a dead horse here, but is there any chance that some scripts or notifications could be set in place to send some warning to you techs so if the situation calls for it and you don't mind getting out of bed at strange hours, you can quickly fix it?

Regarding the cache, I'm sure many of us, for one reason or another, need to keep a low or zero cache, so it's not really about learning how to use this very nice feature; it's the various factors that prevent us from doing so.

Anyway, thanks again for your support and great work! applause
----------------------------------------
[Edit 1 times, last edit by CandymanWCG at Dec 10, 2013 2:50:25 PM]
[Dec 10, 2013 2:49:07 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: No Tasks Available - All Projects

CandymanWCG wrote: "Not to beat a dead horse here, but is there any chance that some scripts or notifications could be set in place to send some warning to you techs so if the situation calls for it and you don't mind getting out of bed at strange hours, you can quickly fix it?"

We do in fact have many scripts and alerts in place. Not sure if you have ever been on support before, but it is very hard to tune monitoring so that it notifies only on real errors and does not send false alerts. In any system like this you get a number of false alerts for every real alert. This means that you actually need someone 'on-call' to respond and determine if an alert is a real issue or a false positive. We can only ensure that someone will respond to an issue at the next scheduled work interval.

Having said that, we frequently check them out when we see them day or night, weekend or weekday. Last night just happened to be the convergence of several factors where no-one was able to check them out until this morning (US time) and the issue started relatively early in the night.
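To illustrate the false-alert problem described above, here is a minimal sketch of one common way to reduce noise: only raise an alert after several consecutive failed checks. This is a generic technique shown for illustration, not a description of how World Community Grid's monitoring works; the function name `should_page` and the threshold of 3 are assumptions.

```python
from typing import Iterable, List


def should_page(check_results: Iterable[bool], threshold: int = 3) -> List[bool]:
    """Return, for each check, whether an alert would be raised.

    check_results: True means the check passed, False means it failed.
    An alert is raised once `threshold` consecutive failures are seen.
    """
    decisions = []
    streak = 0  # length of the current run of failed checks
    for healthy in check_results:
        streak = 0 if healthy else streak + 1
        decisions.append(streak >= threshold)
    return decisions


# One isolated failure (likely a false alert), then a sustained failure.
history = [True, False, True, True, False, False, False, False]
print(should_page(history))
```

A higher threshold suppresses more one-off blips but delays detection of a real outage, and someone still has to judge the alerts that get through, which is the on-call cost the post describes.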
[Dec 10, 2013 3:05:14 PM]
CandymanWCG
Senior Cruncher
Romania
Joined: Dec 20, 2010
Post Count: 421
Status: Offline
Re: No Tasks Available - All Projects

Crystal clear. Many thanks for your reply and patience!

Cheers! peace
[Dec 10, 2013 3:07:23 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: No Tasks Available - All Projects [RESOLVED]

Whilst I was away fetching an i7-4770 DT 3.4-3.9 GHz to replace the ol' Q6600 2.4 GHz, WCG came back in my absence. The first thing the clients did was load up on Beta, though I'd set the profiles back for each device to not seek them out in preference. Great. Seems that after initially receiving only FAAH, the FAHV are flowing again too.

applause
[Dec 10, 2013 3:50:01 PM]
Steve W
Advanced Cruncher
Joined: Dec 9, 2005
Post Count: 110
Status: Offline
Re: No Tasks Available - All Projects [RESOLVED]

Just looking at my machine logs and getting the "Project has no tasks available" again.

I'm hoping that it's just down to someone snaffling all the existing WUs and not the feeder dying again.
[Dec 10, 2013 5:24:19 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: No Tasks Available - All Projects [RESOLVED]

It's hard to quantify the impact of these events, but we know that whoever had work managed to report it and met with a wingman if there was one. Just no buffer backfill. For now the data says the Tuesday morning validations were 290 years' worth [a record], and the afternoon 225, a differential of -65. Once the machines started receiving work again, when asking again [one of mine having hit a 15 hour back-off], any task completed would have a higher chance of finding a wingman in the initial 12-24 hours... drained cache machines are in sync on the first jobs they do, so it will be interesting to see what Wednesday morning will bring... more / less? For the moment my PV exploded... from 86 yesterday to 108 now. We'll know in a little while. Place your bets.
[Dec 11, 2013 12:27:31 PM]