Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Errors that affect reliable computers

I believe the server cannot differentiate between the causes of errors; is that correct? I note that a reliable computer must have its last 15 results valid. We have had big problems where lots of errors were returned for a project for a while, so every computer returning an error would be removed from the Reliable computer list until it returned at least another 15 valid results.
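
To make the rule I am describing concrete, here is a minimal sketch of it as I understand it. The `Host` class and its method names are made up for illustration; only the figure of 15 valid results comes from what I have read, and the real server logic may well differ.

```python
from collections import deque

# From the rule as described above: the last 15 results must all be valid.
REQUIRED_VALID = 15


class Host:
    """Hypothetical model of one computer's recent result history."""

    def __init__(self):
        # Keep only the most recent REQUIRED_VALID results; older ones drop off.
        self.recent = deque(maxlen=REQUIRED_VALID)

    def report(self, valid: bool) -> None:
        """Record one returned result (True = valid, False = error)."""
        self.recent.append(valid)

    def is_reliable(self) -> bool:
        """Reliable only if the window is full and contains no error."""
        return len(self.recent) == REQUIRED_VALID and all(self.recent)
```

Under this model a single error keeps the host off the list until 15 further valid results have pushed that error out of the window, which is why one bad batch can hit many computers at once.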

The recommendation to uncheck HPF2 until the problems were resolved is extremely important. As long as the system still has enough Reliable computers to handle the Rush jobs, I guess there is no problem. Theoretically, the number of Reliable computers could drop dramatically, since they commonly have short connect times, small buffers and multiple cores, all of which increase the negative impact of a single returned error.
[Sep 25, 2008 12:43:16 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Errors that affect reliable computers

The importance that individuals attach to being "reliable" in order to get "rush" jobs is severely overrated. No special credits are awarded for processing them. The share of failures in the daily total of ~235,000 tasks distributed is so small that WCG has no trouble at all finding devices to process "rush/repair" jobs, even with the 21-hour rule that has been standing for quite some time.

That said, yes, WCG does know which kind of error produced a given result. Yesterday "Aborted" was added as a condition on the Result Status page (you see it only if you have aborted or cancelled jobs; aborts are currently included in the "error" filter). Aborts happen for various reasons, such as a project detach/attach, or WCG remotely instructing the client, upon client-initiated contact, to drop a job that has become redundant, so this is taken into consideration when rating devices.
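
As a rough illustration of what telling outcomes apart could look like, here is a small sketch. It assumes, purely for the sake of example, that aborted results are skipped when a device is rated and that the window is the last 15 counted results; the `Outcome` names and the `is_reliable` function are hypothetical and say nothing about how the WCG servers are actually implemented.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical result states; not the server's real status codes."""
    VALID = "valid"
    ERROR = "error"
    ABORTED = "aborted"


def is_reliable(history, required=15):
    """Rate a device from its result history, oldest result first.

    Assumption: aborted results are ignored rather than counted as errors.
    """
    counted = [o for o in history if o is not Outcome.ABORTED]
    window = counted[-required:]
    return len(window) == required and all(o is Outcome.VALID for o in window)
```

With this assumption, a device with 15 valid results followed by an abort stays reliable, while the same history followed by a genuine error does not.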

Eventually, the techs do reset device ratings if found incorrect due to 'exceptional' events and circumstances, so all in all nothing to worry about.

cheers
[Edit 2 times, last edit by Sekerob at Sep 25, 2008 3:30:58 PM]
[Sep 25, 2008 3:25:34 PM]