World Community Grid Forums
Thread Status: Active Total posts in this thread: 10
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The outlier detection algorithm, which has been so successfully implemented on the other projects, does not appear to be operating correctly on the Genome Comparison project, e.g.
10000318-10000593_ Valid 12/28/2006 19:34:36 12/29/2006 02:03:25 2.43 17 / 25
10000318-10000593_ Valid 12/21/2006 19:16:20 12/24/2006 09:56:51 3.17 17 / 25
10000318-10000593_ Valid 12/21/2006 19:16:15 12/21/2006 21:48:40 2.43 40 / 25
10000318-10000593_ Valid 12/21/2006 19:15:37 12/29/2006 07:47:33 2.34 16 / 25
and
10000546-10000604_ Valid 12/30/2006 03:51:29 01/02/2007 14:35:43 3.85 21 / 28
10000546-10000604_ Valid 12/30/2006 03:50:15 12/30/2006 18:12:35 4.55 32 / 28
10000546-10000604_ Valid 12/30/2006 03:49:50 01/01/2007 13:44:14 6.15 32 / 28
Could a Tech please look at this before the eruption starts?
Cheers. ozylynx |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
For one of the projects (HDC) it was disabled due to the 'restarts' that continued to accumulate CPU time while losing the % progress, so those results would become outliers. Do not remember a message indicating that the problem was resolved and the rule suspension reversed. Not aware it was disabled for GC. Will bring it to the attention of the techs.
----------------------------------------
http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=10243
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Thanks Sekerob
I don't run HDC so hadn't seen that. Cheers. ozylynx |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I'm not sure those are outliers.
Remember that the algorithm only takes into account the first three past the post (the first three results returned). So in your first example, if the first three had been 17, 17, 16 then they would have been *very* tightly grouped, and 40 would be a clear outlier (late returns simply get the precalculated credit - they are never penalised).
But the first three were 17, 40, 17. This gives an average of about 25, and none are sufficiently far from the average to cause a problem. Similarly with the second example.
I could crunch the numbers using the internal algorithm details, but I see no reason to worry here. If you want to work out the averages and deviations yourself, please go ahead. Just bear in mind that WCG don't publish the actual cutoff values. |
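For anyone who wants to check the averages by hand, here is a minimal sketch of the "first three past the post" idea applied to the first quorum quoted above. It assumes the second timestamp in each row is the return time and the number before the slash is the claimed credit; the helper name is made up, and this is not WCG's actual algorithm, whose cutoff values are unpublished.

```python
from datetime import datetime

# (return time, claimed credit) for the first quorum quoted in this thread.
# Assumption: the second timestamp in each row is the return time.
results = [
    ("12/29/2006 02:03:25", 17),
    ("12/24/2006 09:56:51", 17),
    ("12/21/2006 21:48:40", 40),
    ("12/29/2006 07:47:33", 16),
]

def first_three_average(rows):
    """Average the claims of the three earliest-returned results (hypothetical helper)."""
    ordered = sorted(rows, key=lambda r: datetime.strptime(r[0], "%m/%d/%Y %H:%M:%S"))
    claims = [claim for _, claim in ordered[:3]]
    return claims, sum(claims) / len(claims)

claims, avg = first_three_average(results)
print(claims, round(avg))  # the late 16 is not part of the average
```

Rounded, the average of those three claims matches the 25 shown as granted credit in the quoted results.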
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hi Didy
I see your point. If the suspect result is allowed to influence the average, the percentages will swing in that direction. If, on the other hand, we accept that I know one of the machines in each of these quorums is consistently 'on the money', and compare the suspect machine to the others rather than to the average, the first instance shows a claim of more than 200% of theirs (40 against 17): surely an outlier, even if not a far outlier??? This is merely given greater weight by the presence of the fourth result.
Perhaps the small number of points involved is misleading. So for the exercise, let us look at UD points. Now we have 119, 280, 119, (102). The 280 kinda stands out from where I sit. Note that the high claiming machine was neither significantly faster nor slower than the others in the quorum. That is an important issue.
The second quorum, likewise, shows 2 machines claiming 50%+ more than the third, and although this is probably marginal, imho it could be considered a low outlier and thus should not be used in the calculation of the quorum average score. UD again makes it a little clearer, and low outliers are obviously never as dramatic....147, 224, 224.
BTW both of these examples are results taken from the same machine, and the averages reflect differences in time taken to complete the task. Note again that the low claim machine is significantly faster than the quorum average and is therefore obviously one of the 'new breed' of machines which are seemingly always low claimers.
It would be worth noting that in the vast majority of cases with GC project results, the spread from High claim to Low claim within the same quorum rarely exceeds 3 points.
Cheers. ozylynx |
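The comparison argued for here, each claim measured against the other claims in the quorum rather than against an average that includes it, can be sketched roughly as below. This only illustrates the arithmetic in this post; `ratio_to_others` is a hypothetical helper and no cutoff is applied, since WCG's real thresholds are not public.

```python
def ratio_to_others(claims):
    """For each claim, return claim / mean(the other claims in the quorum)."""
    ratios = []
    for i, claim in enumerate(claims):
        others = claims[:i] + claims[i + 1:]
        ratios.append(claim / (sum(others) / len(others)))
    return ratios

first_quorum = [17, 40, 17]   # first three results returned, first example
second_quorum = [21, 32, 32]  # second example

for quorum in (first_quorum, second_quorum):
    # Print each claim as a percentage of the mean of the other claims
    print(quorum, [f"{r:.0%}" for r in ratio_to_others(quorum)])
```

Measured this way the 40 comes out at roughly 235% of the other two claims, and the 21 at roughly 66% of the 32s.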
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Knreed has confirmed my diagnosis, and shared the current algorithm with the CAs. It's too complicated to go into in detail but is broadly as I described.
Statistical analysis isn't really a simple subject, and it's easy to find a result that "doesn't look right". Also, hindsight would let us distribute points a little more accurately, but the thing about hindsight is it comes too late. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
10000069-10000731_ -- Valid 01/09/2007 10:29:52 01/09/2007 20:13:34 2.17 26 / 16
10000069-10000731_ -- Valid 01/09/2007 10:29:07 01/13/2007 10:01:51 3.33 10 / 16
10000069-10000731_ -- Valid 01/09/2007 10:28:54 01/12/2007 03:25:11 1.05 11 / 16
Frankly I care not why. Only that it IS when it should be not.
Cheers. ozylynx |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Yesterday had a quorum as follows on a FAAH:
109
85 < Moi
74
The quorum was 79..... 109 was considered an outlier @ about 25%, not extreme, thus getting 100% of the remaining median....79. The GC result Ozylynx sampled, 26, is 150% off the median, which was even closer on the remaining 2.
[Edit 1 times, last edit by Sekerob at Jan 15, 2007 9:45:46 AM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
That'd be 250% Sek.
Cheers. ozylynx |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
My inglese is not so good, Ozy... I wrote '150% off' where it was intended to say '150% more'.
mea massima culpa :D
[Edit 1 times, last edit by Sekerob at Jan 15, 2007 1:12:42 PM] |
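The "150% more" and "250%" figures describe the same gap measured two ways. A small sketch of the arithmetic, assuming the comparison is against the median of the two remaining claims (10 and 11) in the GC quorum posted above; the function name is made up for illustration.

```python
from statistics import median

def compare_to_rest(suspect, rest):
    """Return (suspect as % of median(rest), % by which it exceeds that median)."""
    base = median(rest)
    percent_of = 100 * suspect / base
    return percent_of, percent_of - 100

# GC quorum from ozylynx's post: 26 against the remaining claims 10 and 11
percent_of, percent_more = compare_to_rest(26, [10, 11])
print(f"{percent_of:.0f}% of the median, {percent_more:.0f}% more than it")
```

So a claim of 26 is about 250% of the remaining median, which is the same thing as about 150% more than it.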
||
|