World Community Grid Forums
Category: Completed Research Forum: FightAIDS@Home Phase 2 Thread: Recent Invalids |
Thread Status: Active Total posts in this thread: 17
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Something strange?
I've just got a new machine and I'm running Ubuntu for the first time. It's been running Zika just fine, but it ran a couple of FAH2 WUs for the first time yesterday and I see this morning that they're both flagged Invalid. Looking at the wingmen it seems they got the same thing, though the logs look normal. Do we have some bad WUs out there or is there a problem with the validator, or is my new machine not up to scratch? Here they are:

Project Name: FightAIDS@Home - Phase 2
Created: 12/08/2016 20:47:29
Name: FAH2_000819_zinc00078906_000001_0001_001
Minimum Quorum: 1
Replication: 3

Result Name                                   OS type  OS version                  App Version Number  Status       Sent Time          Time Due / Return Time  CPU Time / Elapsed Time (hours)  Claimed / Granted BOINC Credit
FAH2_000819_zinc00078906_000001_0001_001_4--  Linux    3.10.0-327.36.3.el7.x86_64  -                   In Progress  09/12/16 16:56:16  13/12/16 16:56:16       0.00                             0.0 / 0.0
FAH2_000819_zinc00078906_000001_0001_001_3--  Linux    3.10.60-std441-amd64        -                   Detached     09/12/16 16:50:45  09/12/16 16:56:13       0.00                             0.0 / 0.0
FAH2_000819_zinc00078906_000001_0001_001_2--  Linux    3.16.0-4-amd64              -                   Detached     09/12/16 16:48:35  09/12/16 16:50:41       0.00                             0.0 / 0.0
FAH2_000819_zinc00078906_000001_0001_001_1--  Linux    4.4.0-31-generic            715                 Invalid      09/12/16 16:48:27  10/12/16 07:12:48       12.45                            403.8 / 0.0
FAH2_000819_zinc00078906_000001_0001_001_0--  Linux    3.10.0-327.36.3.el7.x86_64  715                 Invalid      08/12/16 20:47:45  09/12/16 16:48:09       19.32                            445.5 / 0.0

Project Name: FightAIDS@Home - Phase 2
Created: 12/08/2016 20:47:28
Name: FAH2_000819_zinc00078906_000007_0006_001
Minimum Quorum: 1
Replication: 3

Result Name                                   OS type  OS version                  App Version Number  Status       Sent Time          Time Due / Return Time  CPU Time / Elapsed Time (hours)  Claimed / Granted BOINC Credit
FAH2_000819_zinc00078906_000007_0006_001_4--  Linux    3.10.0-327.36.3.el7.x86_64  -                   In Progress  10/12/16 05:05:48  14/12/16 05:05:48       0.00                             0.0 / 0.0
FAH2_000819_zinc00078906_000007_0006_001_3--  Linux    3.16.0-4-amd64              -                   In Progress  09/12/16 15:22:16  13/12/16 15:22:16       0.00                             0.0 / 0.0
FAH2_000819_zinc00078906_000007_0006_001_2--  Linux    4.8.0-1-amd64               715                 Error        09/12/16 15:19:20  09/12/16 15:22:11       0.00                             0.0 / 0.0
FAH2_000819_zinc00078906_000007_0006_001_1--  Linux    4.4.0-31-generic            715                 Invalid      09/12/16 15:19:00  10/12/16 05:05:43       11.69                            379.2 / 0.0
FAH2_000819_zinc00078906_000007_0006_001_0--  Linux    3.10.0-327.36.3.el7.x86_64  715                 Invalid      08/12/16 20:47:45  09/12/16 15:18:07       17.81                            411.7 / 0.0
|
||
|
PMH_UK
Veteran Cruncher UK Joined: Apr 26, 2007 Post Count: 766 Status: Offline Project Badges: |
Unlikely to be anything wrong with your PC.
The few units I have had recently have been invalid for all.
Paul.
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Thanks, Paul.
Sounds like it's one for the techs to look at on Monday. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
You can apply a Boolean 'True or False' test to the observation... from my hunting tool, with just the raw 'show all fahb' filter applied, I found 6, all 6 valid.

ResultName                                  App Name  Status  Claimed  CpuTime  Time        ElapsedTime  Granted  Crd/Hr  Efficiency  Outcome  ReceivedTime       ServerState  ValidateState  Last Mod           Stats_Period
FAH2_000358_avx628080-ls_000067_0003_036_2  fahb      Valid   312.05   21.8060  0:21:48:22  21.8721      312.05   14.31   99.70%      1        09-12-16 04:27:44  5            1              09-12-16 04:27:57  16-12-09 12:06
FAH2_000357_avx628073-ls_000007_0003_038_2  fahb      Valid   306.06   21.7020  0:21:42:07  21.7717      306.06   14.10   99.68%      1        09-12-16 04:20:39  5            1              09-12-16 04:20:52  16-12-09 12:06
FAH2_000330_avx387872-ls_000068_0000_039_2  fahb      Valid   340.44   24.4145  1:00:24:52  24.4762      340.44   13.94   99.75%      1        08-12-16 21:29:56  5            1              08-12-16 21:30:06  16-12-09 00:00
FAH2_000320_avx387847-ls_000062_0000_040_1  fahb      Valid   311.64   22.6464  0:22:38:47  22.7168      311.64   13.76   99.69%      1        08-12-16 00:38:33  5            1              08-12-16 00:38:46  16-12-08 12:06
FAH2_000365_gl5243842-ls_000004_0000_040_2  fahb      Valid   305.43   22.5329  0:22:31:59  22.5964      305.43   13.56   99.72%      1        07-12-16 21:46:32  5            1              07-12-16 21:46:42  16-12-08 00:00
FAH2_000367_pc2a030-ls_000061_0003_040_2    fahb      Valid   295.81   22.1776  0:22:10:39  22.2361      295.81   13.34   99.74%      1        07-12-16 21:18:28  5            1              07-12-16 21:18:39  16-12-08 00:00

As in, there are devices able to complete them successfully, and until one does, the maximum of 5 copies tried applies. W8/W10 |
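For anyone wanting to sanity-check the listing above: the Crd/Hr and Efficiency columns look as if they are derived from the others. The sketch below assumes Crd/Hr is Granted divided by CpuTime and Efficiency is CpuTime divided by ElapsedTime. That is only an inference from the numbers shown, not something confirmed by the tool, and `derived_columns` is a made-up helper name.

```python
def derived_columns(granted, cpu_hours, elapsed_hours):
    """Return (credit per CPU hour, CPU efficiency in percent).

    Assumption: these formulas are inferred from the listing,
    not taken from the reporting tool itself.
    """
    credit_per_hour = granted / cpu_hours
    efficiency_pct = 100.0 * cpu_hours / elapsed_hours
    return round(credit_per_hour, 2), round(efficiency_pct, 2)


# First row of the listing: Granted 312.05, CpuTime 21.8060, ElapsedTime 21.8721
print(derived_columns(312.05, 21.8060, 21.8721))  # (14.31, 99.7)
```

Under those assumptions the first two rows come out at 14.31 / 99.70% and 14.10 / 99.68%, matching the figures in the listing.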
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
So it seems the problem is Linux-only. Perhaps even Linux 64-bit only.
Anyone else with comments? |
||
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 881 Status: Offline Project Badges: |
FAH2_000819 tasks (and higher) are actually new tasks which were presumably going to be held up until the "asynchronous replica exchange" version of FAH2 went live, but some have been allowed out... Perhaps there's something wrong with the data files for some of these?
It's mentioned on the status page at Temple. (And that looks as if there'll be plenty of work when they do restart properly!) By the way, I haven't seen an FAH2 task for quite a while, and certainly not one from any experiments numbered 819 or higher, so I've had no first-hand experience of these errors. |
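If you want to check whether a task on your own results page belongs to the newer experiments, a rough sketch follows. It assumes the second underscore-separated field of the result name is the experiment number (as in FAH2_000819_...) and that 819 is the first of the new ones, both taken from this thread rather than from any official naming spec. The function names are hypothetical.

```python
def experiment_number(result_name):
    """Extract the experiment number from an FAH2 result name.

    Assumption: names look like FAH2_<experiment>_<ligand>_..., so the
    second underscore-separated field is the experiment number.
    """
    parts = result_name.split("_")
    if len(parts) < 2 or parts[0] != "FAH2" or not parts[1].isdigit():
        raise ValueError("not an FAH2 result name: " + result_name)
    return int(parts[1])


def is_new_experiment(result_name, first_new=819):
    """True if the task comes from experiment `first_new` or later."""
    return experiment_number(result_name) >= first_new


print(is_new_experiment("FAH2_000819_zinc00078906_000001_0001_001_1"))  # True
print(is_new_experiment("FAH2_000358_avx628080-ls_000067_0003_036_2"))  # False
```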
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Thanks for highlighting the experiment addition through batch 1318. It does say
"Note that due to the upcoming release of asynchronous replica exchange, these simulations may be cut short and rerun with the new protocols." i.e. there must have been second thoughts about releasing them under the old crunching regimen. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Sounds like a beta that wasn't. At least I can stop worrying about my new machine (for the moment, anyway).
Thanks for the info everyone. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
819's on Windows, all invalid!
|
||
|
armstrdj
Former World Community Grid Tech Joined: Oct 21, 2004 Post Count: 695 Status: Offline Project Badges: |
The invalids are being caused by a workunit issue. We have stopped validation for now so no new copies are sent out. We will have to clean things up on the server then get them going again. We will update this thread as we progress.
Thanks, armstrdj |
||
|
|