| World Community Grid Forums
Thread Status: Active Total posts in this thread: 12
Ingleside
Veteran Cruncher Norway Joined: Nov 19, 2005 Post Count: 974 Status: Offline Project Badges:
|
Exacting when the Repair job was received, maybe someone can proof how this could happen so quickly with the rules Ingleside presented. There were 14 valids returned from the last error (marked roman I-XIV, and 4 stuck in PV before the repair Job assigned [which got autostarted, with help of a 4300 minute switch time and a > 2 day cache, at that time)

It would have been easier to follow the list if it was sorted the other way around, but it's fairly simple to check, and even simpler for you to do it: on the web page, choose to view the result status, filter on the "Invalid" result status, and post that new list of tasks...

Hang on, going by your list, you don't have any "Invalid" results at all, meaning you don't have any validation errors, and therefore host.error_rate wasn't increased at all.

Now, I didn't post the rules earlier in this thread, but for a computer to be "reliable" there are 3 fairly easy rules to follow:

1: host.error_rate is less than 0.002.
2: Average turnaround time is less than 2 days (except uncommon platform).
3: Current host quota = max_daily_quota.

#1 is only updated by the Validator. #2 is also updated by the Validator, except if you miss a deadline, in which case it's updated by the Transitioner. #3 is updated by the Transitioner and not the Validator.

So, in practice this happens:

a: Start: host.error_rate = 0.001; quota = 80 per cpu.

b: Reports this result as "Error":
E201966_ 426_ C.23.C17H10N2OSSeSi.00560402.4.set1d06_ 1-- 1479931 Error 4/28/11 15:51:26 4/30/11 06:42:06 0.51 9.3 / 0.0
The report triggers the Transitioner, leading to quota = 80 - 1 = 79. Also, since it's an "Error", the Transitioner obviously doesn't set the NEED_VALIDATE flag, meaning host.error_rate stays at 0.001.

c: Reports up to 39 more "Error" or "User Aborted" results. Each report triggers the Transitioner, leading to quota = 79 - 39 = 40, while host.error_rate = 0.001.

d: Reports your I-result:
E201966_ 446_ C.24.C17H8N4OSSe.00539718.3.set1d06_ 0-- 1479931 Valid 4/28/11 15:52:12 4/30/11 15:25:15 6.67 121.3 / 121.9 (I)
This was reported as a "Success", and this triggers the Transitioner. The Transitioner sees that this is a "Success" and sets quota = 2 * 40 = 80. Also, if the Transitioner sees that the wu now has at least min_quorum "Success" results, it sets the NEED_VALIDATE flag, which triggers validation. The Validator sees the NEED_VALIDATE flag, checks the wu, and since it validates, host.error_rate = 0.95 * 0.001 = 0.00095. The Validator also updates avg.turnaround = avg.turnaround * 0.7 + 0.3 * (reported_time - sent_time).

Bottom line: up to 40 results reported as "Error" or "User Aborted" only affect the quota, and this effect is therefore cancelled by a single "Success" report. A validation error, meaning a result marked as "Invalid", on the other hand needs at least 19 "Valid" results to recover from before the computer can be "reliable" again.

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might." |
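For anyone who wants to play with the numbers, here is a rough Python sketch of steps a to d and the three "reliable" rules. It is only a toy model built from the description in this post, not the actual BOINC server code: the `Host` class, its method names, the starting turnaround of 1.0 day and the cap of the quota at its maximum are all assumptions made for illustration. What happens to host.error_rate on an "Invalid" result is left out, since the exact increment isn't given here.

```python
from dataclasses import dataclass

MAX_DAILY_QUOTA = 80        # per-CPU quota used in the example (step a)
MAX_ERROR_RATE = 0.002      # rule 1 threshold
MAX_TURNAROUND_DAYS = 2.0   # rule 2 threshold


@dataclass
class Host:
    """Hypothetical stand-in for the per-host fields discussed in the post."""
    error_rate: float = 0.001
    quota: int = MAX_DAILY_QUOTA
    avg_turnaround_days: float = 1.0  # assumed starting value, not from the post

    def report_error(self) -> None:
        # Transitioner: an "Error" or "User Aborted" report lowers the quota by 1.
        # error_rate is untouched because no validation is triggered.
        self.quota -= 1

    def report_success(self) -> None:
        # Transitioner: a "Success" report doubles the quota.
        # Capping at the maximum is an assumption of this sketch.
        self.quota = min(MAX_DAILY_QUOTA, 2 * self.quota)

    def mark_valid(self, turnaround_days: float) -> None:
        # Validator: a "Valid" result shrinks error_rate by 5% and folds the
        # new turnaround time into the running average (0.7 old + 0.3 new).
        self.error_rate *= 0.95
        self.avg_turnaround_days = (
            0.7 * self.avg_turnaround_days + 0.3 * turnaround_days
        )

    def is_reliable(self) -> bool:
        # All three rules must hold at the same time.
        return (
            self.error_rate < MAX_ERROR_RATE
            and self.avg_turnaround_days < MAX_TURNAROUND_DAYS
            and self.quota == MAX_DAILY_QUOTA
        )


host = Host()                      # step a
for _ in range(40):                # steps b and c: 40 error/abort reports
    host.report_error()
print(host.quota, host.is_reliable())

host.report_success()              # step d: the I-result is reported
host.mark_valid(turnaround_days=1.98)  # about 47.5 hours between sent and returned
print(host.quota, host.error_rate, host.is_reliable())
```

In this toy model the host drops to a quota of 40 and is not "reliable" after the 40 error reports, and a single valid "Success" brings the quota back to 80 and makes it "reliable" again, which is the point of the bottom line above.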
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
LoL, Ingleside
Frankly, the fact that reliability is affected differently by "Invalid" versus "Error" and "Aborts" (of any kind) is the one giant misunderstanding on my part, and I'm having a hard time comprehending the logic of that. At any rate, I did not have a single "Invalid"... either the machine completes them to the end and they're good, or they don't go to the end. Maybe knreed would like to comment on this. --//-- |