Thread Status: Active
Total posts in this thread: 2
This topic has been viewed 1178 times and has 1 reply
sk..
Master Cruncher
Joined: Mar 22, 2007
Post Count: 2324
Suggestion to deal with continuous task failures

Suggestion for a failure-control mechanism based on Result Status tallies kept on a per-project basis.

I quite often see posts about what I describe as run-away failures, where tasks from one project continuously fail on a system. These are often the result of non-WCG processes, but nonetheless affect some projects more than others. My suggestion is that the servers keep an Error Tally list by system and, when the error count for a project gets too high, the server sends an email to the cruncher to advise of the problem and sends only other task types.

The email could notify the cruncher that their project selection has been changed (if they only selected one project) to run different tasks because that project’s fail rate on that system was too high, and that they should restart their system (which often resolves the situation). If the cruncher has more than one project selected, that system would simply stop receiving tasks for the failing project until the cruncher selects it again (following a system restart, hopefully).
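To make the idea concrete, here is a rough Python sketch of how such a per-system, per-project tally might behave. Everything in it is an assumption for illustration: the `ErrorTally` class, the window of 10 results, the limit of 8 failures, and the placeholder project names are all hypothetical, and none of it reflects how the WCG or BOINC servers are actually written.

```python
from collections import defaultdict, deque

WINDOW = 10        # assumed: recent results remembered per (host, project)
MAX_FAILURES = 8   # assumed: failures within the window that trigger a block


class ErrorTally:
    """Hypothetical server-side tally of recent task results per (host, project)."""

    def __init__(self):
        self.recent = defaultdict(lambda: deque(maxlen=WINDOW))
        self.blocked = set()

    def record(self, host, project, success):
        """Store one result; return True if this result newly blocks the project."""
        key = (host, project)
        self.recent[key].append(success)
        failures = sum(1 for ok in self.recent[key] if not ok)
        if failures >= MAX_FAILURES and key not in self.blocked:
            self.blocked.add(key)
            return True  # the caller would email the cruncher at this point
        return False

    def eligible_projects(self, host, selected, all_projects):
        """Projects the scheduler may still send to this host."""
        allowed = [p for p in selected if (host, p) not in self.blocked]
        if allowed:
            return allowed
        # Every selected project is blocked: fall back to the other projects.
        return [p for p in all_projects if (host, p) not in self.blocked]

    def reselect(self, host, project):
        """The cruncher re-selects the project, e.g. after a restart."""
        self.blocked.discard((host, project))
        self.recent.pop((host, project), None)
```

A cruncher who selected only project "A" would, after eight failures in a row, be switched to whatever else is available until they call `reselect`; a cruncher with "A" and "B" selected would just keep receiving "B".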
[Nov 30, 2010 2:27:44 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: Suggestion to deal with continuous task failures

The book of knreed dreams has a feature listed to automatically stop sending a specific science to a client if it always fails, accompanied by a red message being logged in the client. With client 6.12 that could be given form through the popup feature of the new Notices window of BOINC. WCG would then send a trial task of that constantly failing science app once in a while to check whether the problem has been resolved.
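The trial-task part could look something like the Python sketch below. It is only an illustration under stated assumptions: the `BlockedApp` record, the helper names, and the one-day `TRIAL_INTERVAL` are hypothetical, and this is not the actual scheduler logic of BOINC or WCG.

```python
from dataclasses import dataclass
from typing import Optional

TRIAL_INTERVAL = 24 * 3600  # assumed: seconds to wait between trial tasks


@dataclass
class BlockedApp:
    """Hypothetical record for a science app blocked on one client."""
    blocked_at: float
    last_trial: Optional[float] = None


def trial_due(state, now):
    """True when enough time has passed since the block or the last trial."""
    reference = state.blocked_at if state.last_trial is None else state.last_trial
    return now - reference >= TRIAL_INTERVAL


def send_trial(state, now):
    """Note that a single trial task went out at time `now`."""
    state.last_trial = now


def handle_trial_result(blocked, key, success):
    """Lift the block if the trial task succeeded; otherwise keep it in place."""
    if success:
        blocked.pop(key, None)
        return True
    return False
```

Sending just one task per interval keeps the cost of a still-broken client low, while a single success is enough to resume normal work for that science.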

--//--
[Nov 30, 2010 3:58:58 PM]