World Community Grid Forums
Category: Completed Research
Forum: Help Cure Muscular Dystrophy - Phase 2
Thread: Invalid results in new version
Former Member (Cruncher)
The latest beta resulted in a significant (10%-ish) proportion of WUs marked invalid for me. Now that the new version (613) is in operation, I've seen my first invalid non-beta HCMD2 WU.
(Columns: result name, app version, status, sent, returned, CPU hours, claimed / granted credit.)

CMD2_0002-MYH6.clustersOccur-TBAKA.clustersOccur_320_2  613  Valid    27/05/09 06:33:04  27/05/09 15:25:48  1.63  27.9 / 27.9
CMD2_0002-MYH6.clustersOccur-TBAKA.clustersOccur_320_1  613  Invalid  26/05/09 21:15:55  27/05/09 02:13:52  2.32  17.0 / 8.5
CMD2_0002-MYH6.clustersOccur-TBAKA.clustersOccur_320_0  613  Valid    26/05/09 21:15:46  27/05/09 06:30:54  1.78  16.6 / 27.9

Result Log said:

<core_client_version>5.10.45</core_client_version>
<![CDATA[
<stderr_txt>
INFO: Initializing Platform.
INFO: No state to restore. Start from the beginning.
called boinc_finish
</stderr_txt>
]]>

All other result logs were identical, except for the client version number.
knreed (Former World Community Grid Tech)
Kremmen,
I only see one invalid result for your computers (and I see all of your older Linux machines returning good results with the new version for Linux). I also see lots of valid results for HCMD2 for the computer with the invalid result. The beta testing did not show any issue with a high invalidation rate on any platform. I am not seeing a high invalid rate for you on HCMD2 (I did not look at your beta results). Are you continuing to see issues with HCMD2 in production?

Thanks,
Kevin
Former Member (Cruncher)
knreed wrote:
"The beta testing did not show any issue with a high invalidation rate on any platform. I am not seeing a high invalid rate for you on HCMD2 (I did not look at your beta results). Are you continuing to see issues with HCMD2 in production?"

My machines are all running at standard (or lower) clock speeds. I would consider 1 invalid WU per month to be most unusual, so several in a day is unusually high by many orders of magnitude. My 613 betas were invalid about 10% of the time in the last day or so of the beta run, but almost all of them have fallen off the results list now.

I have only completed 10 version 613 WUs: 9 valid and 1 invalid. That continues the 10% invalid proportion from the beta.

[Edited once, last edit by Former Member at May 28, 2009 2:27:58 PM]
Sekerob (Ace Cruncher)
Really, seriously, 1 invalid in the first 10 is not statistically significant. 10 in 100 is.
Former Member (Cruncher)
Sekerob wrote:
"Really, seriously, 1 invalid in the first 10 is not statistically significant. 10 in 100 is."

Let's assume we have two data sets. Set A (HCMD2 611) has 0/90 members invalid. Set B (HCMD2 beta 613) has 9/90 invalid. When another set of observations comes along and 1/10 members are invalid, which group would you say it more closely resembles, A or B? Sure, it's early days yet, but I'm guessing B.

(Since my fastest machine is now 16 hours into a single position calculation, it might be a while before I have new data.)
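To make that comparison concrete, here is a minimal sketch in Python of the likelihood argument; it is purely illustrative, the 0/90 and 9/90 rates come from the post above, and the small non-zero stand-in for the 0/90 rate is an assumption so the comparison is not trivially zero.

from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k invalid results out of n, at invalid rate p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Rates from the two data sets above: set A (611) roughly 0/90, set B (613 beta) 9/90.
p_a = 0.5 / 90   # assumed small non-zero stand-in for "0 invalid in 90"
p_b = 9 / 90     # 10% invalid rate seen in the 613 beta

like_a = binom_pmf(1, 10, p_a)
like_b = binom_pmf(1, 10, p_b)
print(f"P(1 invalid in 10 | set A rate) = {like_a:.3f}")          # ~0.053
print(f"P(1 invalid in 10 | set B rate) = {like_b:.3f}")          # ~0.387
print(f"How much better B explains it: {like_b / like_a:.1f}x")   # ~7x

On these numbers a single 1-in-10 observation does lean towards B, but only weakly; a few dozen more results would settle it either way.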
knreed (Former World Community Grid Tech)
Actual stats for HCMD2 using version 6.13:
[Per-platform stats table: Platform, Success, Error, Valid, Invalid, Pending, Inconclusive, % Potential Invalid; the row figures are not preserved in this copy of the thread.]

% potential invalid = (invalid + inconclusive/2) / (valid + invalid + inconclusive)

A number of the higher invalid rates on Linux are due to 1 machine (for Linux AMD) and 1 user (for Linux Intel) that are returning garbage results, and because it appears that results returned with 611 do not match 613 (likely due to floating-point differences after changing the compiler options). Due to the much higher than normal limits on the number of workunits, it is taking longer than normal to limit the number of workunits the troublesome machines get.

[Edited twice, last edit by knreed at May 29, 2009 3:53:16 PM]
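For anyone checking their own results page against that definition, here is a minimal sketch of the calculation; the function name is mine, and the example counts are the 17 valid / 3 invalid / 3 inconclusive reported further down the thread.

def potential_invalid_pct(valid, invalid, inconclusive):
    """Percent potential invalid = (invalid + inconclusive/2) / (valid + invalid + inconclusive)."""
    total = valid + invalid + inconclusive
    if total == 0:
        return 0.0
    return 100.0 * (invalid + inconclusive / 2) / total

# Counts reported further down the thread: 17 valid, 3 invalid, 3 inconclusive.
print(f"{potential_invalid_pct(17, 3, 3):.1f}% potential invalid")  # -> 19.6%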
Former Member (Cruncher)
I'm now running 15% invalid on the new version (17 valid, 3 invalid ... plus 3 inconclusive). That works out to 19.6% "Potential Invalid" under 613, i.e. (3 + 3/2) / 23, versus 0% Potential Invalid under 611.

The 2 new failures are:

CMD2_0002-RADIA.clustersOccur-RADIA.clustersOccur_8174_2  613  Valid    29/05/09 18:19:10  30/05/09 00:30:50   5.18  105.6 / 101.3
CMD2_0002-RADIA.clustersOccur-RADIA.clustersOccur_8174_1  613  Valid    28/05/09 03:16:08  29/05/09 07:36:31   6.73   97.1 / 101.3
CMD2_0002-RADIA.clustersOccur-RADIA.clustersOccur_8174_0  613  Invalid  28/05/09 03:15:02  28/05/09 15:34:24  10.70   68.3 / 34.2
CMD2_0002-RADIA.clustersOccur-RADIA.clustersOccur_8206_2  613  Valid    29/05/09 18:19:48  29/05/09 19:39:36   1.32   15.3 / 13.6
CMD2_0002-RADIA.clustersOccur-RADIA.clustersOccur_8206_0  613  Valid    28/05/09 03:16:26  28/05/09 16:17:01   2.40   11.9 / 13.6
CMD2_0002-RADIA.clustersOccur-RADIA.clustersOccur_8206_1  613  Invalid  28/05/09 03:15:02  29/05/09 02:30:28   4.98   31.8 / 6.8

P.S. Maybe it would be easier to go back to the short beta units and work out why so many of those were invalid? e.g.:

BETA_CMD2_0001-TNR1AA.clustersOccur-WWP1A.clustersOccur_38_47900_48242_3  613  Valid     26/05/09 02:57:38  26/05/09 03:31:36  0.44  4.6 / 6.6
BETA_CMD2_0001-TNR1AA.clustersOccur-WWP1A.clustersOccur_38_47900_48242_2  -    No Reply  24/05/09 13:17:28  26/05/09 01:17:28  0.00  0.0 / 0.0
BETA_CMD2_0001-TNR1AA.clustersOccur-WWP1A.clustersOccur_38_47900_48242_1  613  Valid     24/05/09 13:17:05  24/05/09 13:50:25  0.27  6.6 / 6.6
BETA_CMD2_0001-TNR1AA.clustersOccur-WWP1A.clustersOccur_38_47900_48242_0  613  Invalid   24/05/09 13:16:31  24/05/09 14:56:11  1.64  6.6 / 6.6

[Edited once, last edit by Former Member at May 31, 2009 3:29:28 PM]
Former Member (Cruncher)
knreed wrote:
"it appears that results returned with 611 do not match 613 (likely due to floating point differences after changing the compiler options)"

That's rather ugly for the pile of 611 PV results I have. Excluding those, my stats are now:

30/5: 23% invalid, 25% "Potential Invalid"
31/5: 33% invalid, 36% "Potential Invalid"
1/6:  28% invalid, 32% "Potential Invalid"
2/6:  43% invalid, 44% "Potential Invalid"

This is happening on every Intel (Celeron/P3/P4) machine I'm running, but in different proportions (faster machines get fewer invalid markers). My single Athlon machine appears to be unaffected.

[Edited 3 times, last edit by Former Member at Jun 2, 2009 2:43:22 AM]
ziegenmelker (Cruncher)
I get invalid results too.
The host is running Linux with no problems on other projects.

52 results total:
13 in progress
30 valid
6 pending
3 invalid

cu,
Michael
Former Member (Cruncher)
All of the CMD2 6.11 WUs that were processed on 3 normally very reliable Linux/AMD systems are being invalidated by 6.13 wingmen when the original 6.11 wingman fails to return the WU. An example:

CMD2_0001-2K2R_A.clustersOccur-2O72_A.clustersOccur_729_4  613  Valid    5/28/09 22:43:27  5/29/09 06:04:31  1.27  12.1 / 16.5
CMD2_0001-2K2R_A.clustersOccur-2O72_A.clustersOccur_729_3  613  Error    5/28/09 18:32:41  5/28/09 22:18:22  0.00   0.0 / 0.0
CMD2_0001-2K2R_A.clustersOccur-2O72_A.clustersOccur_729_2  613  Valid    5/28/09 01:24:55  5/28/09 18:24:33  1.29  16.5 / 16.5
CMD2_0001-2K2R_A.clustersOccur-2O72_A.clustersOccur_729_1  611  Invalid  5/13/09 19:26:30  5/14/09 15:45:26  0.78  11.6 / 5.8   <=== Mine
CMD2_0001-2K2R_A.clustersOccur-2O72_A.clustersOccur_729_0  611  Aborted  5/13/09 19:26:15  5/28/09 18:34:29  0.00   0.0 / 0.0

To date my results page shows 18 invalidated as described, with a potential 44 additional invalids now pending. Conversely, several WUs with 6.11 validations have invalidated 6.13 wingmen on the same machines. Another example:

CMD2_0001-GPDAA.clustersOccur-MYH6.clustersOccur_1614_2  613  Invalid  5/26/09 21:04:29  5/27/09 03:46:57  0.83  11.8 / 5.9
CMD2_0001-GPDAA.clustersOccur-MYH6.clustersOccur_1614_1  611  Valid    5/12/09 17:22:07  5/13/09 07:53:46  0.86  12.7 / 12.2  <=== Mine
CMD2_0001-GPDAA.clustersOccur-MYH6.clustersOccur_1614_0  611  Valid    5/12/09 17:18:22  5/28/09 18:34:06  0.92  11.8 / 12.2

My W2k/Intel system validated all CMD2 WUs regardless of app version. Unfortunately, my Linux/Intel box did not receive any 6.11 WUs, so all WUs from it have validated fine with 6.13. I realize that 6.11 is no longer used and that 6.13 will soon be replaced with a newer version, but does this indicate there might have been a small AppVer/OS/CPU compatibility issue? It will be a shame if all 44 of those remaining pending 6.11 WUs go invalid.
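To illustrate why a lone 6.11 result can be outvoted once its original 6.11 wingman drops out and 6.13 hosts fill the quorum, here is a minimal sketch of majority-style validation; the tolerance, numbers, and logic are illustrative assumptions, not the actual BOINC/WCG validator.

def validate_quorum(results, tolerance=1e-6):
    """Mark each result Valid/Invalid by majority agreement within a tolerance.

    `results` maps a result name to a single number it reported; the real
    validator compares whole output files, so this is only a stand-in.
    """
    names = list(results)
    verdicts = {}
    for name in names:
        agreeing = sum(
            1 for other in names
            if other != name and abs(results[other] - results[name]) <= tolerance
        )
        # A result survives if it plus its agreeing partners form a majority.
        verdicts[name] = "Valid" if agreeing + 1 > len(names) / 2 else "Invalid"
    return verdicts

# Hypothetical numbers: the two 6.13 results agree closely; the 6.11 result
# differs slightly (e.g. from different compiler/floating-point behaviour).
print(validate_quorum({
    "result_613_a": 1.0000012,
    "result_613_b": 1.0000011,
    "result_611_mine": 1.0000987,
}))
# {'result_613_a': 'Valid', 'result_613_b': 'Valid', 'result_611_mine': 'Invalid'}

Under that kind of rule the 6.11 result gets outvoted even if it is numerically close enough for the science, which matches the pattern in the two examples above.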