World Community Grid Forums
Thread Status: Active Total posts in this thread: 72
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Got one too.
erlc_a193_pr34b0
<file_name>erlc_a193_pr34b0_0_2</file_name> <error_code>-161</error_code>
All wingmen have the same problem. It's up to wingman nr 7 now.
[Edit 1 times, last edit by Former Member at Mar 27, 2010 1:07:06 PM]
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline
Reading Uplinger's posts, I don't think we need to report the -161 errors any further. The issue has been identified and will be fixed programmatically.
Credit, as you will have noticed, is awarded for faults known to be caused by the science app. Thanks for your patience and understanding.
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All!
mdparkhill
Advanced Cruncher Joined: May 2, 2007 Post Count: 60 Status: Offline
Sekerob,
I have found one more of the 7-8 minute batch failures.
27-Mar-2010 09:53:28 [World Community Grid] Computation for task erlc_e104_pdb000_2 finished
27-Mar-2010 09:53:28 [World Community Grid] Output file erlc_e104_pdb000_2_2 for task erlc_e104_pdb000_2 absent
After it failed I tried to find it in my results on the grid, but couldn't find it. The two lines above are all I found after looking through the stdout*.* files.
Regards, Mike
JmBoullier
Former Community Advisor Normandy - France Joined: Jan 26, 2007 Post Count: 3716 Status: Offline
mdparkhill wrote: "After it failed i tried to find it in my results on the grid, but couldn't find it."
You should find it. You probably looked for it before the client had reported it. If you want to try again, set the Status filter to Error on your Results Status page. If you have lots of results, that will make the search easier. Once you find it, click on the Error link and you should see a Result Log like this:
Result Name: erlc_ e059_ pr02a1_ 4--
<core_client_version>6.6.40</core_client_version>
<![CDATA[
<stderr_txt>
Calling gridPlatform.init()
INFO: No state to restore. Start from the beginning.
called boinc_finish
</stderr_txt>
<message>
<file_xfer_error>
<file_name>erlc_e059_pr02a1_4_2</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
]]>
As Sekerob said, this is a normal case, which the techs will rework so that the exit code is not used to report it. Everything is OK, and you should get a few credits (about 4) later on, once the specialized script has caught it.
Cheers. Jean.
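For anyone who would rather pick the failing file and its error code out of a Result Log automatically, here is a minimal sketch in Python. It assumes the log text has been copied from the Results Status page into a string. The names `RESULT_LOG` and `find_file_xfer_errors` are made up for illustration and are not part of BOINC or the World Community Grid site. A regular expression is used instead of an XML parser because the log fragment has no single root element.

```python
import re

# Sample Result Log, copied from the post above.
RESULT_LOG = """<core_client_version>6.6.40</core_client_version>
<![CDATA[
<stderr_txt>
Calling gridPlatform.init()
INFO: No state to restore. Start from the beginning.
called boinc_finish
</stderr_txt>
<message>
<file_xfer_error>
<file_name>erlc_e059_pr02a1_4_2</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
]]>"""

# Matches one <file_xfer_error> element and captures its file name and code.
FILE_XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)</error_code>\s*"
    r"</file_xfer_error>"
)


def find_file_xfer_errors(log_text):
    """Return a list of (file_name, error_code) pairs found in the log."""
    return [
        (match.group("name"), int(match.group("code")))
        for match in FILE_XFER_ERROR.finditer(log_text)
    ]


print(find_file_xfer_errors(RESULT_LOG))
```

A log without a `<file_xfer_error>` element, like the later ones in this thread, simply yields an empty list.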
Randzo
Senior Cruncher Slovakia Joined: Jan 10, 2008 Post Count: 339 Status: Offline
erlc_c014_pqa005
Completed with no error, but 2 wingmen already errored. I'm good :-) Just kidding. Hope the next one will complete it.
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline
Question for the techs on this -161 error.
This morning the client received a repair job that failed after 0.9 hours. It was listed with the -161 error. Just now received a repair job, copy 4, where 1/2 failed with -161 too, number 3 not yet returned [actually it did after hitting F5].
The odd thing is that the failed result of this morning has disappeared from the Results Status (RS) pages. Anyway, it smells like the validator is now able to catch them, since the one just now, exiting at 0.2 hours, has gone into Pending Validation (PV) state with a message log that does not show this failure. A bit strange, as the version number is still 6.17, presuming the science app writes the result log rather than the server filtering out the undesirable line.
Result Name              Version  Status              Sent              Due / Returned    CPU (h)  Credit claimed / granted
erlc_ e104_ pqa002_ 4--  -        In Progress         30-3-10 21:33:30  3-4-10 21:33:30   0.00     0.0 / 0.0
erlc_ e104_ pqa002_ 3--  617      Pending Validation  30-3-10 20:54:59  30-3-10 21:32:36  0.20     3.6 / 0.0  < moi
erlc_ e104_ pqa002_ 2--  617      Error               30-3-10 12:08:03  30-3-10 21:33:28  0.20     3.0 / 0.0
erlc_ e104_ pqa002_ 1--  617      Error               27-3-10 04:52:47  30-3-10 12:07:58  0.14     4.4 / 0.0
erlc_ e104_ pqa002_ 0--  617      Error               27-3-10 04:52:45  30-3-10 20:52:15  0.12     2.5 / 0.0
Result Log
Result Name: erlc_ e104_ pqa002_ 3--
<core_client_version>6.10.36</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
called boinc_finish
</stderr_txt>
]]>
Anyway, if this is proof of concept, Hip Hip Hurray
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline
Well, it looks like the techs silently managed to modify something in the science app or the validator, or both, to handle the -161 error. The fastest valid in town at 0.12 hours... thank you, wingman, for helping to confirm it so quickly in the production string:
Result Name              Version  Status  Sent              Due / Returned    CPU (h)  Credit claimed / granted
erlc_ e104_ pqa002_ 4--  617      Valid   30-3-10 21:33:30  31-3-10 02:30:23  0.12     2.7 / 3.2  < Wingman
erlc_ e104_ pqa002_ 3--  617      Valid   30-3-10 20:54:59  30-3-10 21:32:36  0.20     3.6 / 3.2  < Moi
erlc_ e104_ pqa002_ 2--  617      Error   30-3-10 12:08:03  30-3-10 21:33:28  0.20     3.0 / 0.0
erlc_ e104_ pqa002_ 1--  617      Error   27-3-10 04:52:47  30-3-10 12:07:58  0.14     4.4 / 0.0
erlc_ e104_ pqa002_ 0--  617      Error   27-3-10 04:52:45  30-3-10 20:52:15  0.12     2.5 / 0.0
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline
Yes - we released the fix to the -161 issue late yesterday (the file that isn't produced will not cause an error if it doesn't exist and the validator can check this condition properly). I apologize for not posting about it here. There are two other categories of errors that we are looking into. However, those occur with less frequency. These are the errors that are listed as '193', '-1073741819' or '29'. 193 and -1073741819 appear to be the same error (193 occurs on Linux while -1073741819 occurs on Windows). We are continuing to investigate these.
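A side note on the Windows number: -1073741819 is the signed 32-bit reading of 0xC0000005, the Windows status code for an access violation, which is easier to recognize in hexadecimal. The sketch below shows the conversion in Python. The helper names `to_unsigned32` and `describe` are invented for this example. Whether 193 on Linux corresponds to BOINC's "process got a signal" exit code is my recollection and worth checking against the BOINC error number list; the post above only says the two appear to be the same error.

```python
def to_unsigned32(code):
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF


def describe(code):
    """Format an exit code next to its 32-bit hexadecimal form."""
    return f"{code} -> 0x{to_unsigned32(code):08X}"


# Exit codes mentioned in the post above.
for exit_code in (-1073741819, 193):
    print(describe(exit_code))
# -1073741819 -> 0xC0000005
# 193 -> 0x000000C1
```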
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
knreed wrote: "Yes - we released the fix to the -161 issue late yesterday (the file that isn't produced will not cause an error if it doesn't exist and the validator can check this condition properly). I apologize for not posting about it here. There are two other categories of errors that we are looking into. However, those occur with less frequency. These are the errors that are listed as '193', '-1073741819' or '29'. 193 and -1073741819 appear to be the same error (193 occurs on Linux while -1073741819 occurs on Windows). We are continuing to investigate these."
Could it be that the validator now sees those WUs as valid ones? I have at least one WU (erlc_e059_pr56b1) where all tasks had error -161. But the last two replacement copies were valid, so they get points, while all other copies are counted as real errors and get no points at all. Maybe they already used the new version...
Result Name              Version  Status  Sent               Due / Returned     CPU (h)  Credit claimed / granted
erlc_ e059_ pr56b1_ 6--  617      Valid   01.04.10 07:14:02  01.04.10 08:22:01  0.16     2.5 / 2.3
erlc_ e059_ pr56b1_ 5--  617      Valid   30.03.10 15:26:27  31.03.10 17:48:13  0.28     2.1 / 2.3
erlc_ e059_ pr56b1_ 4--  617      Error   30.03.10 04:50:23  30.03.10 14:28:20  0.11     2.2 / 0.0
erlc_ e059_ pr56b1_ 3--  617      Error   29.03.10 12:34:25  30.03.10 02:37:39  0.10     2.1 / 0.0
erlc_ e059_ pr56b1_ 2--  617      Error   29.03.10 00:55:55  29.03.10 12:34:24  0.24     2.5 / 0.0
erlc_ e059_ pr56b1_ 1--  617      Error   27.03.10 01:40:34  29.03.10 00:55:53  0.09     2.2 / 0.0
erlc_ e059_ pr56b1_ 0--  617      Error   27.03.10 01:40:33  01.04.10 07:14:00  0.18     3.6 / 0.0
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline
mweisensee, that's what I demonstrated with the post I made: the shorter list of errors, then 2 valids. That was the object, to not have them rejected and cycling through the error sequence until reaching 7.