Thread Status: Active
Total posts in this thread: 10
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Outlier Detection not working on Genome Comparison Project

The outlier detection algorithm, which has been implemented so successfully on the other projects, does not appear to be operating correctly on the Genome Comparison project. For example:

Result Name         Status  Sent Time            Return Time          CPU Time (hours)  Claimed / Granted Credit
10000318-10000593_  Valid   12/28/2006 19:34:36  12/29/2006 02:03:25  2.43              17 / 25
10000318-10000593_  Valid   12/21/2006 19:16:20  12/24/2006 09:56:51  3.17              17 / 25
10000318-10000593_  Valid   12/21/2006 19:16:15  12/21/2006 21:48:40  2.43              40 / 25
10000318-10000593_  Valid   12/21/2006 19:15:37  12/29/2006 07:47:33  2.34              16 / 25

and

Result Name         Status  Sent Time            Return Time          CPU Time (hours)  Claimed / Granted Credit
10000546-10000604_  Valid   12/30/2006 03:51:29  01/02/2007 14:35:43  3.85              21 / 28
10000546-10000604_  Valid   12/30/2006 03:50:15  12/30/2006 18:12:35  4.55              32 / 28
10000546-10000604_  Valid   12/30/2006 03:49:50  01/01/2007 13:44:14  6.15              32 / 28

Could a Tech please look at this before the eruption starts?

Cheers. ozylynx smile
[Jan 3, 2007 4:45:25 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Outlier Detection not working on Genome Comparison Project

For one of the projects (HDC) it was disabled because of the 'restarts', which continued to accumulate CPU time while losing the % progress, so those results would become outliers. I do not remember a message indicating that the problem was resolved and the rule suspension reversed. I am not aware that it was disabled for GC. Will bring it to the attention of the techs.

http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=10243
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Jan 3, 2007 4:55:22 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Outlier Detection not working on Genome Comparison Project

Thanks Sekerob

I don't run HDC so hadn't seen that.

Cheers. ozylynx smile
[Jan 3, 2007 5:09:29 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Outlier Detection not working on Genome Comparison Project

I'm not sure those are outliers.

Remember that the algorithm only takes into account the first three past the post - so in your first example, if the first three had been 17, 17, 16 then they would have been *very* tightly grouped, and 40 would be a clear outlier (late returns simply get the precalculated credit - they are never penalised). But the first three were 17, 40, 17. This gives an average of 25, and none are sufficiently far from the average to cause a problem.
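
As a rough check of that arithmetic, here is a small Python sketch that averages the first three claims returned and shows how far each one sits from the average. The function name `deviations_from_mean` is made up for illustration, and no cutoff is applied because the actual values are not published.

```python
from statistics import mean

def deviations_from_mean(claims):
    """Return (average, [relative deviation of each claim from the average])."""
    avg = mean(claims)
    return avg, [(c - avg) / avg for c in claims]

# Claimed credit of the first three results returned for the first work unit.
avg, devs = deviations_from_mean([17, 40, 17])
print(round(avg))                    # 25, matching the granted credit
print([round(d, 2) for d in devs])   # [-0.31, 0.62, -0.31]
```

Because the 40 is part of the average, it ends up only about 62% above it, which is the effect described above.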

Similarly with the second example. I could crunch the numbers using the internal algorithm details, but I see no reason to worry here.

If you want to work out the averages and deviations yourself, please go ahead. Just bear in mind that WCG don't publish the actual cutoff values.
[Jan 3, 2007 10:50:33 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Outlier Detection not working on Genome Comparison Project

Hi Didy

I see your point. If the suspect result is allowed to influence the average, the percentages will swing in that direction. If, on the other hand, the suspect machine is compared to the others and not to the average (and I obviously know that one of the machines in each of these quorums is consistently 'on the money'), the first instance shows a claim of more than 200% of theirs, surely an outlier even if not a far outlier? This is merely given greater weight by the presence of the fourth result.

Perhaps the small number of points involved is misleading. So for the exercise, let us look at UD points. Now we have 119, 280, 119, (102). The 280 kinda stands out from where I sit. Note that the high claiming machine was neither significantly faster nor slower than the others in the quorum. That is an important issue.

The second quorum, likewise, shows 2 machines claiming 50%+ more than the third. Although this is probably marginal, imho the third could be considered a low outlier and thus should not be used in the calculation of the quorum average score. UD again makes it a little clearer, and low outliers are obviously never as dramatic: 147, 224, 224. BTW both of these examples are results taken from the same machine, and the averages reflect differences in time taken to complete the task. Note again that the low claim machine is significantly faster than the quorum average and is therefore obviously one of the 'new breed' of machines which are seemingly always low claimers.

It would be worth noting that in the vast majority of cases with GC project results, the gap between the high claim and the low claim within the same quorum rarely exceeds 3 points.
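
To put numbers on the comparison with the other machines rather than with the quorum average, here is a minimal Python sketch. The helper `ratio_to_rest` is hypothetical and is not the WCG algorithm; it only divides the suspect claim by the mean of the other claims.

```python
from statistics import mean

def ratio_to_rest(claims, suspect_index):
    """Ratio of one claim to the mean of the other claims in the quorum."""
    rest = [c for i, c in enumerate(claims) if i != suspect_index]
    return claims[suspect_index] / mean(rest)

# First quorum (first three returned): listed points, then UD points.
print(round(ratio_to_rest([17, 40, 17], 1), 2))     # 2.35
print(round(ratio_to_rest([119, 280, 119], 1), 2))  # 2.35
# Second quorum: the low claim against the two matching higher claims.
print(round(ratio_to_rest([21, 32, 32], 0), 2))     # 0.66
print(round(ratio_to_rest([147, 224, 224], 0), 2))  # 0.66
```

The ratios come out the same in either points scale, so switching to UD points changes how big the gap looks but not its proportion.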

Cheers. ozylynx smile
[Jan 4, 2007 3:28:23 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Outlier Detection not working on Genome Comparison Project

Knreed has confirmed my diagnosis, and shared the current algorithm with the CAs. It's too complicated to go into in detail but is broadly as I described.

Statistical analysis isn't really a simple subject, and it's easy to find a result that "doesn't look right". Also, hindsight would let us distribute points a little more accurately, but the thing about hindsight is it comes too late.
[Jan 4, 2007 3:51:41 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Outlier Detection not working on Genome Comparison Project

Result Name            Status  Sent Time            Return Time          CPU Time (hours)  Claimed / Granted Credit
10000069-10000731_ --  Valid   01/09/2007 10:29:52  01/09/2007 20:13:34  2.17              26 / 16
10000069-10000731_ --  Valid   01/09/2007 10:29:07  01/13/2007 10:01:51  3.33              10 / 16
10000069-10000731_ --  Valid   01/09/2007 10:28:54  01/12/2007 03:25:11  1.05              11 / 16

Frankly I care not why. Only that it IS when it should be not.

Cheers. ozylynx smile
[Jan 15, 2007 3:36:04 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Outlier Detection not working on Genome Comparison Project

Yesterday I had a quorum as follows on a FAAH result:

109
85 < Moi
74

The quorum credit was 79. The 109 was considered an outlier at about 25%, not extreme, so it got 100% of the median of the remaining two: 79. Ozylynx's sampled GC claim of 26 is 150% off the median of the remaining 2, which were even closer together.
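
For anyone who wants to redo these sums, here is a hedged Python sketch. The function `compare_to_remaining` is only an illustration of "drop the suspect, take the median of what is left" and is not the published WCG rule.

```python
from statistics import median

def compare_to_remaining(claims, suspect):
    """Return the median of the other claims and the suspect's ratio to it."""
    rest = list(claims)
    rest.remove(suspect)            # drop one occurrence of the suspect claim
    mid = median(rest)
    return mid, suspect / mid

mid, ratio = compare_to_remaining([109, 85, 74], 109)
print(mid)                          # 79.5 (the FAAH quorum)

mid, ratio = compare_to_remaining([26, 10, 11], 26)
print(mid, round(ratio * 100))      # 10.5 248 -> claim is about 248% of the median
print(round((ratio - 1) * 100))     # 148 -> or about 148% more than the median
```

The last two lines show the two ways of quoting the same GC gap: as a percentage of the median, or as a percentage above it.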
[Edit 1 times, last edit by Sekerob at Jan 15, 2007 9:45:46 AM]
[Jan 15, 2007 9:40:03 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Outlier Detection not working on Genome Comparison Project

That'd be 250% Sek.

Cheers. ozylynx smile
[Jan 15, 2007 12:51:12 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Outlier Detection not working on Genome Comparison Project

My inglese is not so good Ozy... 150% off where it was intended to say 'more'.

mea massima culpa

:D
[Edit 1 times, last edit by Sekerob at Jan 15, 2007 1:12:42 PM]
[Jan 15, 2007 12:54:51 PM]