World Community Grid Forums
Thread Status: Active Total posts in this thread: 117
ca05065
Senior Cruncher Joined: Dec 4, 2007 Post Count: 325 Status: Offline |
This problem of pVal -> pVer -> invalid is not consistent.
After overnight shutdown on Win 7 64 bit, i7-2600k, I noticed the following taking the above route:
21/11/2013: 7 out of 7
22/11/2013: 0 out of 7
23/11/2013: 0 out of 7
24/11/2013: 0 out of 7
25/11/2013: 7 out of 7
26/11/2013: at least 4 out of 7 |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
All my tasks from the last 4-5 days go to Pending Verification and then to Invalid.
BOINC manager 7.2.28, xp and cpu intel...
----------------------------------------
[Edit 1 times, last edit by Former Member at Nov 26, 2013 11:25:11 AM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Yes, there's more cooking under the hood. Without feedback from the techs we're groping in the dark. Not had any since the 20th... since keeping LAIM on. Under v7.24 we may have had some duplication, but if v7.26 leads to results within a distributed task not arriving at the same outcome [Sgt. Joe saw 5, all computing a different Result.out], the new random number generator may not be laying out the same paths. [gross speculation]
Has anyone seen an Invalid / all Too Late quorum where more than 1 invalid had a single wcg_learn_limit = 500000 entry? Anyway, the ball is in the techs' court. More reporting does not seem to add new insight at this time. |
||
|
johncmacalister2010@gmail.com
Veteran Cruncher Canada Joined: Nov 16, 2010 Post Count: 799 Status: Offline |
Yes, there's more cooking under the hood. Without feedback from techs we're tapping in the dark. Not had any since the 20th... since keeping LAIM on. Under v7.24 we may have had some duplication, but if v7.26 leads to results within a distributed task not seeking for same outcome [Sgt. Joe saw 5 and all computing a different result.out], the new random number generator may not be laying out the same paths. [gross speculation]
Anyone seen an Invalid/ all Too Late quorum where more than 1 invalid had a single wcg_learn_limit = 500000 entry? Anyway, the ball is in tech court. More reporting does not seem to add new insight at this time.

Hi, Rob:
I understand very little of the technical information in the Forum. My question is: Should I keep my 4 core PC processing MCM work 24/7 - will the results (P Ver, PVal, Inval) be used for the project?
Thanks,
JCM
crunching, crunching, crunching. AMD Ryzen 5 2600 6-core Processor with Windows 11 64 Pro. AMD Ryzen 7 3700X 8-Core Processor with Windows 11 64 Pro (part time)
[Edit 1 times, last edit by John C MacAlister at Nov 26, 2013 3:33:04 PM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The techs have not hit the red button yet; I'm guessing the Thanksgiving turkey is not overdone yet. [If the 'Invalid' share is less than x% of total computed time, there won't be any red button pushing.] I will carry on and consider the slop as still within bounds, with the bulk validating properly with just 2 copies.
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Finally caught 2 sets in the act, each with all copies having the same result.out value, one run non-stop, the other(s) with a restart, and passing to PVer:

MCM1_ 0000303_ 7782_ 2-- - In Progress 11/26/13 19:00:45 12/6/13 19:00:45 0.00 0.0 / 0.0
MCM1_ 0000303_ 7782_ 0-- 726 Pending Verification 11/26/13 03:55:44 11/26/13 19:00:12 4.25 100.7 / 0.0
MCM1_ 0000303_ 7782_ 1-- 726 Pending Verification 11/26/13 03:55:28 11/26/13 16:12:51 5.26 131.8 / 0.0

Result Name: MCM1_ 0000303_ 7782_ 1--
<core_client_version>7.2.7</core_client_version>
<![CDATA[
<stderr_txt>
Commandline = ../../projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.26_x86_64-pc-linux-gnu -SettingsFile MCM1_0000303_7782.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Initializing wcg_learn_limit = 500000 Running
Result.out = 3319202.000000
Run complete, CPU time: 18918.634345
14:06:14 (14473): called boinc_finish
</stderr_txt>
]]>

Result Name: MCM1_ 0000303_ 7782_ 0--
<core_client_version>7.2.33</core_client_version>
<![CDATA[
<stderr_txt>
Commandline = ../../projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.26_x86_64-pc-linux-gnu -SettingsFile MCM1_0000303_7782.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Initializing wcg_learn_limit = 500000 Running
Commandline = ../../projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.26_x86_64-pc-linux-gnu -SettingsFile MCM1_0000303_7782.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Initializing wcg_learn_limit = 500000 Running
Result.out = 3319202.000000
Run complete, CPU time: 15303.252096
19:59:34 (32125): called boinc_finish
</stderr_txt>
]]>

And the second set, in fact all PVer copies with the same result.out value of Result.out = 1610452.000000. After last came back, with same Result.out, all were marked Too Late.

MCM1_ 0000284_ 4511_ 4-- 726 Too Late 11/26/13 18:43:04 11/27/13 13:23:10 3.03 60.6 / 0.0
MCM1_ 0000284_ 4511_ 3-- 726 Too Late 11/26/13 04:13:00 11/26/13 18:42:50 3.49 82.0 / 0.0
MCM1_ 0000284_ 4511_ 2-- - Detached 11/26/13 03:11:11 11/26/13 04:12:14 0.00 0.0 / 0.0
MCM1_ 0000284_ 4511_ 0-- 726 Too Late 11/25/13 06:07:24 11/25/13 21:14:26 3.31 67.0 / 0.0
MCM1_ 0000284_ 4511_ 1-- 726 Too Late 11/25/13 06:07:14 11/26/13 03:10:35 4.38 87.9 / 0.0

One more copy and it will [likely] convert to Too Late. So what's cooking under that bonnet... Mind Games?

edit: Adding a 3rd set with identical Result.out = 3575659.000000, 1 restart, the other not.

MCM1_ 0000305_ 5820_ 2-- - In Progress 11/26/13 22:34:07 12/6/13 22:34:07 0.00 0.0 / 0.0
MCM1_ 0000305_ 5820_ 0-- 726 Pending Verification 11/26/13 05:41:49 11/26/13 22:33:47 4.94 118.0 / 0.0
MCM1_ 0000305_ 5820_ 1-- 726 Pending Verification 11/26/13 05:41:29 11/26/13 14:19:05 5.06 134.1 / 0.0

edit2: Adding a 4th set with identical Result.out = 3273739.000000, 1 restart, the other clean runthrough.

MCM1_ 0000304_ 7118_ 2-- - In Progress 11/27/13 15:39:11 12/7/13 15:39:11 0.00 0.0 / 0.0
MCM1_ 0000304_ 7118_ 1-- 726 Pending Verification 11/26/13 04:09:51 11/27/13 15:38:31 2.36 175.5 / 0.0
MCM1_ 0000304_ 7118_ 0-- 726 Pending Verification 11/26/13 04:09:50 11/26/13 19:09:50 4.24 100.6 / 0.0
----------------------------------------
[Edit 2 times, last edit by Former Member at Nov 27, 2013 6:05:26 PM] |
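For anyone wanting to run the same comparison on their own copies, here is a rough Python sketch. It assumes the stderr text has been pasted from the result log by hand, and that every (re)start of the app writes a fresh "Commandline =" entry, as in the logs above. The function names `summarize_stderr` and `copies_agree` are made up for illustration, and the sample logs are abbreviated, with the command-line arguments left out.

```python
import re

def summarize_stderr(stderr_text):
    """Summarize one pasted stderr log: number of starts, restarts, Result.out."""
    # Assumption: each start or restart of the app prints one "Commandline =" entry.
    starts = len(re.findall(r"Commandline\s*=", stderr_text))
    match = re.search(r"Result\.out\s*=\s*([0-9.]+)", stderr_text)
    result_out = float(match.group(1)) if match else None
    return {
        "starts": starts,
        "restarts": max(starts - 1, 0),
        "result_out": result_out,
    }

def copies_agree(summaries):
    """True when every copy reported a Result.out and all values are equal."""
    values = {s["result_out"] for s in summaries}
    return len(values) == 1 and None not in values

# Abbreviated sample logs (arguments omitted), modelled on the pair above.
clean_run = (
    "Commandline = (arguments omitted)\n"
    "Initializing wcg_learn_limit = 500000 Running\n"
    "Result.out = 3319202.000000\n"
)
restarted_run = (
    "Commandline = (arguments omitted)\n"
    "Initializing wcg_learn_limit = 500000 Running\n"
    "Commandline = (arguments omitted)\n"
    "Initializing wcg_learn_limit = 500000 Running\n"
    "Result.out = 3319202.000000\n"
)

summaries = [summarize_stderr(clean_run), summarize_stderr(restarted_run)]
for s in summaries:
    print(s)
print("copies agree:", copies_agree(summaries))
```

This only tells you whether the copies agree on Result.out and which of them restarted. It says nothing about what the server-side validator actually compares, which is still unknown here.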
||
|
marvey11
Advanced Cruncher Germany Joined: Apr 2, 2011 Post Count: 89 Status: Offline |
As far as I can see, all my Invalids come from results that were restarted because the computers they were running on are turned off during the night. This does not seem to be limited to a certain OS, either. But not all restarted results go to Invalid via PVer; some go Valid instead, so there must be some other reason apart from the restart...
On the other hand, all the computers running 24/7, all with LAIM on, produce only Valids. |
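To check this pattern against your own Results Status pages, a small tally like the sketch below may help. The `tally_by_restart` helper and the sample rows are hypothetical: you would fill in one (restarted, status) pair per returned result by hand.

```python
from collections import Counter

def tally_by_restart(results):
    """Count final statuses separately for restarted and non-restarted results."""
    counts = Counter()
    for restarted, status in results:
        group = "restarted" if restarted else "clean"
        counts[(group, status)] += 1
    return counts

# Hypothetical sample rows, not real data: (was the result restarted?, final status)
sample = [
    (True, "Invalid"),
    (True, "Valid"),
    (False, "Valid"),
    (False, "Valid"),
]

for (group, status), n in sorted(tally_by_restart(sample).items()):
    print(group, status, n)
```

If restarts alone were the cause, the restarted/Valid count would stay at zero. A non-zero count there points to some additional factor, as suspected above.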
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Having now seen all possible permutations of how Too Late and Invalid come about, my coin is on a validation rule or on files not concatenating properly on restart. I can't remember seeing an Invalid come about when 2 copies have not restarted and both have the same Result.out.
For the moment, I've moved off MCM. Across my devices, well above 5% of results are getting the thumbs-down mark [counting every power cycle, which is frequent here], Too Late endings included. As it's Thanksgiving in the USA, I'm not expecting action until next Monday at the earliest.
----------------------------------------
[Edit 1 times, last edit by Former Member at Nov 28, 2013 3:07:13 PM] |
||
|
kffitzgerald
Senior Cruncher USA Joined: Jan 29, 2011 Post Count: 222 Status: Offline |
Everything looks normal here: 0 errors, 0 invalids, and 3 pages each of Pending Verification and Pending Validation. Running all MCMs on two I7-2500's on WS2008r2, NOT overclocking.
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I'm with Rob on this one, though I do think something has changed since somewhere around the 25th.
I have a pwned machine in the family that's not usually on for more than a few hours at a time, maybe two or three times a week. It's had 7 valid results since the MCM1 project started, including two on 25/Nov, that I can no longer see in Results Status. On 26/Nov it returned a result (MCM1_ 0000295_ 9860_ 1) which included a cold restart and it was declared invalid, even though the Result.out value matched both wingmen whose results were declared valid. It returned another on the same day, also with a restart, which is still in PVal. It will be interesting to see what happens to that one. |
||