World Community Grid Forums
Category: Completed Research
Forum: Discovering Dengue Drugs - Together - Phase 2 Forum
Thread: frequent Inconclusives and Invalids in this batch of WUs?
Thread Status: Active Total posts in this thread: 18
kateiacy
Veteran Cruncher USA Joined: Jan 23, 2010 Post Count: 1027
This batch of WUs seems to be producing more disagreements between wingmen's results than I usually see in DDDT2 and other projects. Are others seeing the same thing?
Here's an example:

Result Name | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit
ts02_ a137_ pr78a1_ 2-- | - | In Progress | 1/20/11 13:10:37 | 1/24/11 13:10:37 | 0.00 | 0.0 / 0.0
ts02_ a137_ pr78a1_ 1-- | 617 | Inconclusive | 1/19/11 13:36:12 | 1/19/11 21:19:15 | 1.40 | 25.9 / 0.0
ts02_ a137_ pr78a1_ 0-- | 617 | Inconclusive | 1/19/11 13:36:08 | 1/20/11 13:09:08 | 3.62 | 96.6 / 0.0

Here's one that started out that way and was resolved in my favor, even though it looks as if my machine exited early while my wingman finished:

Result Name | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit
ts02_ a107_ pr89a0_ 2-- | 617 | Valid | 1/19/11 21:56:21 | 1/20/11 10:44:10 | 0.32 | 5.8 / 6.7
ts02_ a107_ pr89a0_ 0-- | 617 | Invalid | 1/19/11 06:02:02 | 1/19/11 17:30:41 | 3.39 | 106.2 / 3.4 <- wingman
ts02_ a107_ pr89a0_ 1-- | 617 | Valid | 1/19/11 06:02:02 | 1/19/11 21:47:53 | 0.29 | 7.6 / 6.7 <- me

So far I've noticed disagreements only on "pr" and "pca" WUs.
----------------------------------------
sk..
Master Cruncher Joined: Mar 22, 2007 Post Count: 2324
I had 1 Error from 111 tasks, that I can see.
----------------------------------------
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0
Kate,
You're all Linux, I think, so here's a filter... I've not seen a single one passing through Inconclusive. Thanks for watching it closely, except that BOINC is bashful ;>)

ttyl

PS: kernel 2.6.32.28 from the 10.04 LTS distro.

ts02_ a012_ pqb007_ 1-- | 1292373 | Valid | 18-1-11 06:43:05 | 18-1-11 14:28:49 | 5.31 | 96.0 / 96.8
ts02_ a018_ pr78b0_ 1-- | 1292373 | Valid | 18-1-11 07:27:51 | 20-1-11 12:41:16 | 5.54 | 100.3 / 107.9
ts02_ a016_ pr91b0_ 1-- | 1292373 | Valid | 18-1-11 07:20:27 | 20-1-11 12:19:58 | 5.18 | 93.8 / 90.6
ts02_ a016_ pr91b1_ 0-- | 1292373 | Valid | 18-1-11 07:20:27 | 20-1-11 05:21:48 | 5.20 | 94.1 / 84.0
ts02_ a016_ pr89a1_ 0-- | 1292373 | Valid | 18-1-11 07:20:06 | 20-1-11 05:18:03 | 5.22 | 94.6 / 94.6
ts02_ a016_ pr91a0_ 1-- | 1292373 | Valid | 18-1-11 07:20:07 | 20-1-11 00:04:34 | 5.25 | 95.0 / 95.0
ts02_ a016_ pr91a1_ 0-- | 1292373 | Valid | 18-1-11 07:20:07 | 20-1-11 00:01:13 | 5.20 | 94.2 / 87.3
ts02_ a016_ pr45a0_ 0-- | 1292373 | Valid | 18-1-11 07:19:08 | 19-1-11 18:25:55 | 5.27 | 95.3 / 101.3
ts02_ a016_ pr56a0_ 0-- | 1292373 | Valid | 18-1-11 07:19:08 | 19-1-11 18:25:03 | 5.25 | 95.0 / 95.0
ts02_ a015_ pr89b1_ 1-- | 1292373 | Valid | 18-1-11 07:17:11 | 19-1-11 13:06:34 | 5.04 | 91.3 / 91.3
ts02_ a015_ pr56b0_ 0-- | 1292373 | Valid | 18-1-11 07:15:33 | 19-1-11 13:06:13 | 5.06 | 91.6 / 91.6
ts02_ a014_ pqb001_ 0-- | 1292373 | Pending Validation | 18-1-11 06:46:24 | 19-1-11 02:45:14 | 5.57 | 100.9 / 0.0
ts02_ a017_ pqa003_ 0-- | 1292373 | Valid | 18-1-11 06:51:46 | 19-1-11 01:38:39 | 5.11 | 92.6 / 92.1
ts02_ a014_ pqa007_ 1-- | 1292373 | Valid | 18-1-11 06:46:05 | 18-1-11 22:10:31 | 5.59 | 101.1 / 93.7
ts02_ a014_ pqa008_ 0-- | 1292373 | Valid | 18-1-11 06:46:05 | 18-1-11 20:04:32 | 5.59 | 101.1 / 102.5
ts02_ a012_ pqb007_ 1-- | 1292373 | Valid | 18-1-11 06:43:05 | 18-1-11 14:28:49 | 5.31 | 96.0 / 96.8
ts02_ a011_ pqa007_ 1-- | 1292373 | Valid | 18-1-11 06:39:46 | 18-1-11 14:22:45 | 5.17 | 93.6 / 96.9

[Edit 1 times, last edit by Former Member at Jan 20, 2011 2:03:36 PM]
----------------------------------------
kateiacy
SekeRob --
You have me pegged on two counts: yes, I'm all Linux and yes, I have trouble giving bashful BOINC its privacy!

I've had 3 inconclusives out of 90 returned results: 1 declared valid, 1 invalid, and 1 still inconclusive. There may have been more that turned valid before I ever saw them.

I'm also running under Ubuntu 10.04.1 LTS. I am still running BOINC 6.10.17, which may be different from your setup. I have added a new crunching machine in the last 10 days (hurray!!) -- an AMD Phenom II X4 910e, so that's another variable.

Kate
----------------------------------------
kateiacy
Two more inconclusives this morning, one on each of two different machines. And it's not always my machine that ends up with the "invalid."
----------------------------------------
kateiacy
Here's another inconclusive on yet a 3rd machine. This batch of DDDT2s definitely is giving more disagreements between wingmen than I usually see.
Result Name | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit
ts02_ a346_ pr34a0_ 2-- | 617 | Inconclusive | 1/21/11 19:38:00 | 1/23/11 15:15:19 | 3.98 | 114.9 / 0.0 <- me
ts02_ a346_ pr34a0_ 0-- | 617 | Inconclusive | 1/21/11 19:27:48 | 1/22/11 02:27:30 | 3.70 | 80.2 / 0.0
ts02_ a346_ pr34a0_ 1-- | 617 | Error | 1/21/11 18:56:05 | 1/21/11 18:57:20 | 0.00 | 0.0 / 0.0
ts02_ a346_ pr34a0_ 3-- | - | Waiting to be sent | - | - | 0.00 | 0.0 / 0.0
----------------------------------------
Former Member
Kate --
Got the first one today, and the logs are distinctly different. Mine is clean, a spitting image of all the other logs on Linux, and the wing[wo]man's is different but normal for Windows. I've seen this before on a retreat where the platform suddenly changed on a delayed copy, and I'm sort of fearful it's something with the platform homogeneity... Windows mixed up with Linux. Then the 3rd copy will force the hand, a coin on its side going either way.

A Standard Linux Log:

Result Name: ts02_ a047_ pla008_ 1--
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
Calling gridPlatform.init()
INFO: No state to restore. Start from the beginning.
Calling gridPlatform.init()
Copying wcgrestart.rst
called boinc_finish
</stderr_txt>
]]>

A Standard Windows log:

Result Name: ts02_ a047_ pla008_ 0--
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
Calling gridPlatform.init()
INFO: No state to restore. Start from the beginning.
called boinc_finish
</stderr_txt>
]]>

Could the techs please come out of weekend standby mode and confirm or dismiss this? I've tested a dozen on Windows from my Duo and half a dozen on Linux, and the log composition definitely correlates with the platform.

cheers

edit: sent mail to techs in case they're off.

[Edit 2 times, last edit by Former Member at Jan 23, 2011 4:48:30 PM]
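For anyone who wants to compare two stderr logs without eyeballing them, here is a minimal sketch using Python's standard `difflib`. The log text is copied from the two logs quoted above; the helper name `extra_lines` is made up for illustration and is not part of BOINC or any WCG tool.

```python
import difflib

# stderr text copied from the two logs quoted in the post above
LINUX_LOG = """Calling gridPlatform.init()
INFO: No state to restore. Start from the beginning.
Calling gridPlatform.init()
Copying wcgrestart.rst
called boinc_finish"""

WINDOWS_LOG = """Calling gridPlatform.init()
INFO: No state to restore. Start from the beginning.
called boinc_finish"""


def extra_lines(log_a: str, log_b: str) -> list[str]:
    """Hypothetical helper: lines that appear in log_a but not in log_b, in order."""
    diff = difflib.ndiff(log_b.splitlines(), log_a.splitlines())
    # ndiff marks lines found only in its second argument with "+ "
    return [line[2:] for line in diff if line.startswith("+ ")]


# Lines the first log has that the second one lacks
print(extra_lines(LINUX_LOG, WINDOWS_LOG))
```

Run on these two logs, it reports only the second `Calling gridPlatform.init()` and the `Copying wcgrestart.rst` line as the difference; everything else is identical.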
----------------------------------------
kateiacy
SekeRob, thanks for looking into this yourself and for calling the techs' attention to it!
The possibility of Linux/Windows log mismatches hadn't occurred to me, and of course since I run only Linux, I wouldn't have had any way of investigating even if I had thought of it. The logs on my inconclusives (including the 3 that ended up being called invalid) also look exactly like the logs on all the valid DDDT2 WUs.

I got frustrated when my new crunching machine got bumped off "reliable hosts" over these unexplained invalids. For the time being, I've unchecked DDDT2 and will go back to crunching other projects.
----------------------------------------
anhhai
Veteran Cruncher Joined: Mar 22, 2005 Post Count: 839
This is a very interesting problem. I know this may be slightly off topic, but does anyone know how it will affect the WU if I am running Windows and run virtual Linux? I mean, will the results of the WU be for a Linux machine or a Windows machine? I know others ran this setup for CEP2 and had no problems, but maybe that was because for CEP2 running the WU on either OS would give the same results.

What would happen if I am on Linux and virtualize Windows?
----------------------------------------
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952
> Kate --
> Got the first one today, and the logs are distinctly different. Mine is clean, a spitting image of all the other logs on Linux, and the wing[wo]man's is different but normal for Windows. I've seen this before on a retreat where the platform suddenly changed on a delayed copy, and I'm sort of fearful it's something with the platform homogeneity... Windows mixed up with Linux. Then the 3rd copy will force the hand, a coin on its side going either way.
>
> A Standard Linux Log:
>
> Result Name: ts02_ a047_ pla008_ 1--
> <core_client_version>6.10.58</core_client_version>
> <![CDATA[
> <stderr_txt>
> Calling gridPlatform.init()
> INFO: No state to restore. Start from the beginning.
> Calling gridPlatform.init()
> Copying wcgrestart.rst
> called boinc_finish
> </stderr_txt>
> ]]>
>
> A Standard Windows log:
>
> Result Name: ts02_ a047_ pla008_ 0--
> <core_client_version>6.10.58</core_client_version>
> <![CDATA[
> <stderr_txt>
> Calling gridPlatform.init()
> INFO: No state to restore. Start from the beginning.
> called boinc_finish
> </stderr_txt>
> ]]>
>
> Could the techs please come out of weekend standby mode and confirm or dismiss this? I've tested a dozen on Windows from my Duo and half a dozen on Linux, and the log composition definitely correlates with the platform.
>
> cheers
>
> edit: sent mail to techs in case they're off.

Sek,

The logs you posted are about the same. The only difference is that the one you claim as Windows had a restart called, so the work unit was restarted. I checked the hr_class on this work unit and all went to the Linux class.

What you are probably seeing with the inconclusives is that when a work unit is restarted, the state of the science system is not 100% exact. We have calculations to make sure that the energies of the system are within range. What is probably happening is that the energy of the system is small, which does not allow for much difference in the energies.

Work unit homogeneity is working properly.

-Uplinger
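To illustrate the point about small energies leaving little room for difference, here is a rough sketch of a relative-tolerance comparison. It is not the actual validator: the function name `energies_agree`, the 1% tolerance, and the energy values are all assumptions made up for the example. It only shows why the same absolute drift after a restart could pass when the energy is large in magnitude and fail when it is small.

```python
def energies_agree(e1: float, e2: float, rel_tol: float = 0.01) -> bool:
    """Hypothetical check: accept two energies if they differ by at most
    rel_tol times the larger of their magnitudes."""
    allowed = rel_tol * max(abs(e1), abs(e2))
    return abs(e1 - e2) <= allowed


# Same absolute drift (made-up units) introduced by a hypothetical restart
drift = 0.5

# Large-magnitude energy: allowed difference is 10.0, so 0.5 is within range
print(energies_agree(-1000.0, -1000.0 + drift))

# Small-magnitude energy: allowed difference is only 0.1, so 0.5 is out of range
print(energies_agree(-10.0, -10.0 + drift))
```

Under these assumed numbers the first pair agrees and the second does not, which would match the pattern of a restarted copy ending up inconclusive only on some work units.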