| World Community Grid Forums
Thread Status: Active Total posts in this thread: 8
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Just a heads up for the grid techs: The recent hb0xx series of WUs are having a lot of problems with errors where the job aborts immediately upon startup. One of mine died with this error:
<core_client_version>5.4.9</core_client_version>
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
length mismatch in frag file total_residue: 0 length 2
</stderr_txt>
but it's not just me. I'm seeing WUs in the log with 4 or 5 "Error" statuses returned, and 0 time claimed for each (e.g. hb052_01, hb062_03). None of these show any Valid returns, just Error or In Progress, so the In Progress ones probably haven't tried to start yet.
But not all hb0xx units are borked; I'm just about to finish hb053_06_1 with no apparent problems. For what it's worth. |
||
|
|
olympic
Senior Cruncher Joined: Jun 12, 2005 Post Count: 156 Status: Offline |
Same here: 2 errors so far, and the WUs were re-issued several more times with more errors and zero CPU time. I have 3 more waiting in line; we'll see how those go.
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I wonder... if a work unit returns 3 or more errors, BOINC has a quorum result. I would guess that it marks it complete and kicks it out for analysis. The 4 or 5 results you see would be due to the normal BOINC retry policy - when it has 1 or 2 error results, it would send extra copies out in the hope of reaching a valid quorum.
Thanks for reporting it, but I really expect BOINC to have reported it automatically. |
||
|
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline Project Badges:
|
Thanks for letting us know - there are significantly more errors with the hb batch than any others we have sent out previously. We will take a look and let people know what is going on.
We have BOINC configured to mark a workunit as an error once 4 results have been returned as an error. At that point no new work is issued for the workunit and we take a look.
Kevin
----------------------------------------
[Edit 1 times, last edit by knreed at May 31, 2006 2:00:10 PM] |
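The rule Kevin describes can be sketched roughly as follows. This is only an illustration of the policy as stated in the post (4 error results, then no new work for that workunit); the class, function, and constant names are hypothetical and are not taken from BOINC or from WCG's actual configuration.

```python
from dataclasses import dataclass, field

# Assumption: threshold taken from the post above (4 results returned as errors).
MAX_ERROR_RESULTS = 4


@dataclass
class WorkUnit:
    """Hypothetical stand-in for a workunit and the results returned for it."""
    name: str
    outcomes: list = field(default_factory=list)  # e.g. "Error", "Valid", "In Progress"

    def error_count(self) -> int:
        # Count only results that came back with an Error status.
        return sum(1 for outcome in self.outcomes if outcome == "Error")

    def is_errored(self) -> bool:
        # The workunit as a whole is marked as an error at the threshold.
        return self.error_count() >= MAX_ERROR_RESULTS

    def may_issue_more(self) -> bool:
        # Once the workunit is marked as an error, no new copies are sent out.
        return not self.is_errored()


def record_result(wu: WorkUnit, outcome: str) -> bool:
    """Record one returned result; return True if more copies may still be issued."""
    wu.outcomes.append(outcome)
    return wu.may_issue_more()


if __name__ == "__main__":
    wu = WorkUnit("hb052_01")
    for _ in range(4):
        still_open = record_result(wu, "Error")
    print(wu.name, "errors:", wu.error_count(), "issue more:", still_open)
```

Under this reading, a workunit showing 3 errors plus a couple of In Progress copies (like ha008_04 further down the thread) has not yet hit the limit, which would explain why copies were still going out.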
||
|
|
Goku
Advanced Cruncher France - Caen (Calvados / Normandie) Joined: Nov 30, 2004 Post Count: 84 Status: Offline Project Badges:
|
Same problem for hb089_06
hb089_06  Other        05/31/2006 11:57:52  05/31/2006 17:23:48  4.29  23 / 0
hb089_06  Error        05/31/2006 07:49:29  05/31/2006 14:19:12  3.94  26 / 0
hb089_06  Error        05/31/2006 07:38:50  05/31/2006 11:53:29  1.97  13 / 0
hb089_06  Other        05/31/2006 07:37:22  05/31/2006 12:52:46  2.11  25 / 0
and for ha008_04 ?
ha008_04  Error        05/31/2006 17:42:27  05/31/2006 18:11:35  0.00  0 / 0
ha008_04  Error        05/31/2006 17:23:49  05/31/2006 17:33:59  0.00  0 / 0
ha008_04  Error        05/31/2006 17:08:12  05/31/2006 17:18:27  0.00  0 / 0
ha008_04  In Progress  05/31/2006 17:07:02  06/07/2006 17:07:02  0.00  0 / 0
ha008_04  In Progress  05/31/2006 17:00:14  06/07/2006 17:00:14  0.00  0 / 0
ha008_04  Other        01/01/1970 00:00:00  01/01/1970 00:00:00  0.00  0 / 0
----------------------------------------
[Edit 1 times, last edit by MaitreYoda at May 31, 2006 7:26:18 PM] |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Guys, knreed wrote:
"We have seen a large number of errors with the hbXXX series of workunits for Human Proteome Folding on BOINC. We are looking into the problem now and will let you know what we find."
That means you can stop anything starting with hb and try others, or try Faah to help find a cure for HIV.
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The latest on this is here: http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=7442
No more HPF units are being sent out at the moment - switch to FA@H until it's sorted. |
||
|
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline Project Badges:
|
We have re-enabled the Human Proteome Folding project and are running workunits from batch 'ex'.
|
||
|
|
|