Total posts in this thread: 72
Posts: 72   Pages: 8   [ Previous Page | 1 2 3 4 5 6 7 8 | Next Page ]
This topic has been viewed 324745 times and has 71 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: problematic batches?

Got one too.

erlc_a193_pr34b0

<file_name>erlc_a193_pr34b0_0_2</file_name>
<error_code>-161</error_code>

All wingmen have the same problem. It's up to wingman nr 7 now.
----------------------------------------
[Edit 1 times, last edit by Former Member at Mar 27, 2010 1:07:06 PM]
[Mar 27, 2010 1:06:30 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: problematic batches? <error_code>-161</error_code>

Reading Uplinger's posts, I don't think we need to report the -161 errors any further. The issue has been identified and will be fixed programmatically.

As you will have noticed, credit is awarded for faults known to be caused by the science app.

Thanks for your patience and understanding.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Mar 27, 2010 1:11:00 PM]
mdparkhill
Advanced Cruncher
Joined: May 2, 2007
Post Count: 60
Status: Offline
Re: problematic batches? <error_code>-161</error_code>

Sekerob
I have found one more of the 7-8 minute batch failures.

27-Mar-2010 09:53:28 [World Community Grid] Computation for task erlc_e104_pdb000_2 finished
27-Mar-2010 09:53:28 [World Community Grid] Output file erlc_e104_pdb000_2_2 for task erlc_e104_pdb000_2 absent

After it failed I tried to find it in my results on the grid, but couldn't find it.
This is all I found after looking through the stdout*.* file.
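For anyone who wants to scan a saved client message log for this symptom, here is a minimal sketch. It assumes the log lines have exactly the shape of the two lines quoted above (timestamp, project name in square brackets, message); the function name `find_absent_outputs` is made up for illustration and is not part of BOINC.

```python
import re

# Pattern modelled on the quoted message-log line; the layout is an
# assumption based only on the two lines shown in this post.
ABSENT = re.compile(
    r"^(?P<stamp>\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<project>[^\]]+)\] "
    r"Output file (?P<file>\S+) for task (?P<task>\S+) absent$"
)

def find_absent_outputs(log_lines):
    """Return (task, missing_file) pairs for every 'absent' message."""
    hits = []
    for line in log_lines:
        match = ABSENT.match(line.strip())
        if match:
            hits.append((match.group("task"), match.group("file")))
    return hits

sample = [
    "27-Mar-2010 09:53:28 [World Community Grid] Computation for task erlc_e104_pdb000_2 finished",
    "27-Mar-2010 09:53:28 [World Community Grid] Output file erlc_e104_pdb000_2_2 for task erlc_e104_pdb000_2 absent",
]
print(find_absent_outputs(sample))
```

The task name it reports is the one to look for on the Results Status page once the client has reported the result.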

Regards,
Mike
----------------------------------------

[Mar 27, 2010 3:04:21 PM]
JmBoullier
Former Community Advisor
Normandy - France
Joined: Jan 26, 2007
Post Count: 3716
Status: Offline
Re: problematic batches? <error_code>-161</error_code>

"After it failed I tried to find it in my results on the grid, but couldn't find it."
You should find it. You probably looked for it before it had been reported by the client.
If you want to try again, set the Status filter to Error on your Results Status page. If you have lots of results, that will make the search easier.
Once you find it, click on the Error link and you should see a Result Log like this:
Result Name: erlc_ e059_ pr02a1_ 4--
<core_client_version>6.6.40</core_client_version>
<![CDATA[
<stderr_txt>
Calling gridPlatform.init()
INFO: No state to restore. Start from the beginning.
called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
<file_name>erlc_e059_pr02a1_4_2</file_name>
<error_code>-161</error_code>
</file_xfer_error>

</message>
]]>

As Sekerob said, this is a normal case, which the techs will rework so that the exit code is no longer used to report it.
Everything is OK and you should get a few credits (about 4) later on, once the specialized script has caught it.
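If you have saved several Result Logs and want to pick out the file transfer errors, something along these lines could work. It is only a sketch: it assumes the log text looks like the one pasted above, and `find_transfer_errors` is a hypothetical helper name. A regular expression is used rather than an XML parser because the pasted log as a whole is not a well-formed XML document.

```python
import re

# Matches one <file_xfer_error> block as it appears in the Result Log above.
FILE_XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)</error_code>\s*"
    r"</file_xfer_error>"
)

def find_transfer_errors(log_text):
    """Return (file_name, error_code) pairs found in a Result Log."""
    return [
        (match.group("name"), int(match.group("code")))
        for match in FILE_XFER_ERROR.finditer(log_text)
    ]

sample_log = """\
<stderr_txt>
Calling gridPlatform.init()
INFO: No state to restore. Start from the beginning.
called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
<file_name>erlc_e059_pr02a1_4_2</file_name>
<error_code>-161</error_code>
</file_xfer_error>

</message>
"""
print(find_transfer_errors(sample_log))
```

A log with no `<file_xfer_error>` block simply yields an empty list.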

Cheers. Jean.
----------------------------------------
Team--> Decrypthon -->Statistics/Join -->Thread
[Mar 27, 2010 5:41:53 PM]
Randzo
Senior Cruncher
Slovakia
Joined: Jan 10, 2008
Post Count: 339
Status: Offline
Re: problematic batches? <error_code>-161</error_code>

erlc_c014_pqa005
Completed with no error, but 2 wingmen already errored.
I'm good :-) just kidding
Hope the next one will complete it.
[Mar 27, 2010 11:24:08 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: problematic batches? <error_code>-161</error_code>

Question for the techs on this -161 error.

This morning the client received a repair job that failed after 0.9 hours and was listed with the -161 error. Just now it received another repair job, copy 4, where 1/2 failed with -161 too and number 3 had not yet been returned [actually it had, after I hit F5]. The odd thing is that the failed result from this morning has disappeared from the RS (Results Status) pages.

Anyway, it smells like the validator is now able to catch them, since the one that just exited at 0.2 hours has gone into PV (Pending Validation) state with a log that does not show this failure. That is a bit strange, as the version number is still 6.17, presuming it is the science app that writes the result log and not the server filtering out the undesirable line.

erlc_ e104_ pqa002_ 4-- - In Progress 30-3-10 21:33:30 3-4-10 21:33:30 0.00 0.0 / 0.0
erlc_ e104_ pqa002_ 3-- 617 Pending Validation 30-3-10 20:54:59 30-3-10 21:32:36 0.20 3.6 / 0.0 < moi
erlc_ e104_ pqa002_ 2-- 617 Error 30-3-10 12:08:03 30-3-10 21:33:28 0.20 3.0 / 0.0
erlc_ e104_ pqa002_ 1-- 617 Error 27-3-10 04:52:47 30-3-10 12:07:58 0.14 4.4 / 0.0
erlc_ e104_ pqa002_ 0-- 617 Error 27-3-10 04:52:45 30-3-10 20:52:15 0.12 2.5 / 0.0


Result Log

Result Name: erlc_ e104_ pqa002_ 3--
<core_client_version>6.10.36</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
called boinc_finish

</stderr_txt>
]]>

Anyway, if this is proof of concept, Hip Hip Hurray
[Mar 30, 2010 9:40:32 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: problematic batches? <error_code>-161</error_code>

Well, it looks like the techs silently managed to modify something in the science app, the validator, or both to handle the -161 error. The fastest valid in town at 0.12 hours... thank you, wingman, for helping to confirm it so quickly in the production string:

erlc_ e104_ pqa002_ 4-- 617 Valid 30-3-10 21:33:30 31-3-10 02:30:23 0.12 2.7 / 3.2 < Wingman
erlc_ e104_ pqa002_ 3-- 617 Valid 30-3-10 20:54:59 30-3-10 21:32:36 0.20 3.6 / 3.2 < Moi
erlc_ e104_ pqa002_ 2-- 617 Error 30-3-10 12:08:03 30-3-10 21:33:28 0.20 3.0 / 0.0
erlc_ e104_ pqa002_ 1-- 617 Error 27-3-10 04:52:47 30-3-10 12:07:58 0.14 4.4 / 0.0
erlc_ e104_ pqa002_ 0-- 617 Error 27-3-10 04:52:45 30-3-10 20:52:15 0.12 2.5 / 0.0
[Mar 31, 2010 6:21:35 AM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: problematic batches? <error_code>-161</error_code>

Yes - we released the fix to the -161 issue late yesterday (the file that isn't produced will not cause an error if it doesn't exist and the validator can check this condition properly). I apologize for not posting about it here. There are two other categories of errors that we are looking into. However, those occur with less frequency. These are the errors that are listed as '193', '-1073741819' or '29'. 193 and -1073741819 appear to be the same error (193 occurs on Linux while -1073741819 occurs on Windows). We are continuing to investigate these.
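As a side note on reading the Windows code: exit codes there are 32-bit values that are often printed as signed integers, so it can help to reinterpret -1073741819 as unsigned and view it in hex. The sketch below does only that conversion; the helper name `to_unsigned32` is made up for illustration. The result, 0xC0000005, is the Windows NTSTATUS value for an access violation. Whether that is the actual root cause of these failures is not stated in this thread.

```python
def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF

windows_code = -1073741819
print(hex(to_unsigned32(windows_code)))  # 0xc0000005
```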
[Mar 31, 2010 1:37:02 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: problematic batches? <error_code>-161</error_code>

Yes - we released the fix to the -161 issue late yesterday (the file that isn't produced will not cause an error if it doesn't exist and the validator can check this condition properly). I apologize for not posting about it here. There are two other categories of errors that we are looking into. However, those occur with less frequency. These are the errors that are listed as '193', '-1073741819' or '29'. 193 and -1073741819 appear to be the same error (193 occurs on Linux while -1073741819 occurs on Windows). We are continuing to investigate these.


Could it be that the validator now sees those WUs as valid ones? I have at least one WU (erlc_e059_pr56b1) where all tasks had error -161. But the last two replacement copies were valid so they get points while all other copies are counted as real errors and get no points at all. Maybe they already used the new version...

erlc_ e059_ pr56b1_ 6-- 617 Valid 01.04.10 07:14:02 01.04.10 08:22:01 0.16 2.5 / 2.3
erlc_ e059_ pr56b1_ 5-- 617 Valid 30.03.10 15:26:27 31.03.10 17:48:13 0.28 2.1 / 2.3
erlc_ e059_ pr56b1_ 4-- 617 Error 30.03.10 04:50:23 30.03.10 14:28:20 0.11 2.2 / 0.0
erlc_ e059_ pr56b1_ 3-- 617 Error 29.03.10 12:34:25 30.03.10 02:37:39 0.10 2.1 / 0.0
erlc_ e059_ pr56b1_ 2-- 617 Error 29.03.10 00:55:55 29.03.10 12:34:24 0.24 2.5 / 0.0
erlc_ e059_ pr56b1_ 1-- 617 Error 27.03.10 01:40:34 29.03.10 00:55:53 0.09 2.2 / 0.0
erlc_ e059_ pr56b1_ 0-- 617 Error 27.03.10 01:40:33 01.04.10 07:14:00 0.18 3.6 / 0.0
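To tally a list like this without counting by hand, a rough sketch follows. It assumes rows copied from the Results Status page in exactly the layout shown above (result name ending in `--`, app version, status, two date/time pairs, CPU hours, claimed / granted credit); the column meanings are inferred from the rows in this thread, and `parse_rows` and `tally_statuses` are hypothetical helper names.

```python
import re
from collections import Counter

# Row layout is inferred from the pasted status rows, not from any official format.
ROW = re.compile(
    r"^(?P<name>.+?--)\s+(?P<version>\S+)\s+(?P<status>[A-Za-z ]+?)\s+"
    r"(?P<sent>\d\S*\s\S+)\s+(?P<returned>\d\S*\s\S+)\s+"
    r"(?P<hours>\d+\.\d+)\s+(?P<claimed>\d+\.\d+)\s*/\s*(?P<granted>\d+\.\d+)"
)

def parse_rows(text):
    """Turn pasted status rows into dictionaries, skipping lines that do not match."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows.append(match.groupdict())
    return rows

def tally_statuses(rows):
    """Count how many copies ended in each status."""
    return Counter(row["status"] for row in rows)

table = """\
erlc_ e059_ pr56b1_ 6-- 617 Valid 01.04.10 07:14:02 01.04.10 08:22:01 0.16 2.5 / 2.3
erlc_ e059_ pr56b1_ 5-- 617 Valid 30.03.10 15:26:27 31.03.10 17:48:13 0.28 2.1 / 2.3
erlc_ e059_ pr56b1_ 4-- 617 Error 30.03.10 04:50:23 30.03.10 14:28:20 0.11 2.2 / 0.0
erlc_ e059_ pr56b1_ 3-- 617 Error 29.03.10 12:34:25 30.03.10 02:37:39 0.10 2.1 / 0.0
erlc_ e059_ pr56b1_ 2-- 617 Error 29.03.10 00:55:55 29.03.10 12:34:24 0.24 2.5 / 0.0
erlc_ e059_ pr56b1_ 1-- 617 Error 27.03.10 01:40:34 29.03.10 00:55:53 0.09 2.2 / 0.0
erlc_ e059_ pr56b1_ 0-- 617 Error 27.03.10 01:40:33 01.04.10 07:14:00 0.18 3.6 / 0.0
"""
rows = parse_rows(table)
print(tally_statuses(rows))
print([row["granted"] for row in rows if row["status"] == "Error"])
```

For this work unit the tally is five errors and two valids, with no credit granted on any of the error copies.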
[Apr 1, 2010 3:27:23 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: problematic batches? <error_code>-161</error_code>

mweisensee, that's what I demonstrated in my earlier post, with the shorter list of errors followed by 2 valids. That was the objective: to stop them being rejected and cycling through the error sequence until reaching 7.
[Apr 1, 2010 4:00:19 PM]