| World Community Grid Forums
|
|
Thread Status: Active
Total posts in this thread: 8
|
|
| Author |
|
|
Dark Angel
Veteran Cruncher Australia Joined: Nov 11, 2005 Post Count: 728 Status: Offline Project Badges:
|
I'm also posting this on the BOINC alpha list.
----------------------------------------
Since updating my client to 7.0.18, all CEP2 work from WCG has consistently errored out. This is the result log from the last one.

Result Log
Result Name: E206127_ 465_ C.23.C18H10N2S2Se.02136584.3.set1d06_ 0--
<core_client_version>7.0.18</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[19:50:24] Number of jobs = 16
[19:50:24] Starting job 0,CPU time has been restored to 0.000000.
Application exited with RC = 0x6c00
[ERROR] Failed to open either source or destination files while copying C.23.C18H10N2S2Se.02136584.3.noopt.bp86.sto6g.n.sp/53.0 to C.23.C18H10N2S2Se.02136584.3.noopt.bp86.sto6g.n.sp.53.0. Error: 2
[ERROR] Failed to open either source or destination files while copying C.23.C18H10N2S2Se.02136584.3.noopt.bp86.sto6g.n.sp/stdout.txt to C.23.C18H10N2S2Se.02136584.3.noopt.bp86.sto6g.n.sp.out. Error: 2
[19:50:25] Finished Job #0
19:50:25 (12892): called boinc_finish
</stderr_txt>
]]>

OS is Ubuntu Oneiric i386 server, kernel 3.2.4 (local compile); hardware is dual-socket Xeon (Sossaman).

Other WCG projects appear to be running without issue, with the possible exception of HPF2, which I don't normally run, so I can't be sure.

Another unit, this time from a C2D E7400 based machine:

Result Log
Result Name: E206091_ 782_ C.23.C18H13N3SSi.02168098.2.set1d06_ 1--
<core_client_version>7.0.18</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[02:25:19] Number of jobs = 16
[02:25:19] Starting job 0,CPU time has been restored to 0.000000.
Application exited with RC = 0x6c00
[ERROR] Failed to open either source or destination files while copying C.23.C18H13N3SSi.02168098.2.noopt.bp86.sto6g.n.sp/53.0 to C.23.C18H13N3SSi.02168098.2.noopt.bp86.sto6g.n.sp.53.0. Error: 2
[ERROR] Failed to open either source or destination files while copying C.23.C18H13N3SSi.02168098.2.noopt.bp86.sto6g.n.sp/stdout.txt to C.23.C18H13N3SSi.02168098.2.noopt.bp86.sto6g.n.sp.out. Error: 2
[02:25:20] Finished Job #0
02:25:20 (3025): called boinc_finish
</stderr_txt>
]]>

OS is Ubuntu Precise x86_64 server (with 32-bit libs installed), kernel 3.2.0-17-generic #26-Ubuntu SMP.

I have others queued on two other Xeon machines (with Ubuntu 10.04 i386 desktop) and a Q6600 running Slackware 13.37 64-bit (with 32-bit multilib installed) that I can report on as they are processed, if that helps.

I don't know that this is necessarily the fault of the client, of course, but if this helps anyone it's worth finding out, even if it's my own fault somehow.
Currently being moderated under false pretences |
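For anyone reading the numbers in these logs, here is a minimal Python sketch of how they relate to each other. The first part is plain arithmetic: 195, 0xc3 and -61 are the same byte shown three ways. The rest rests on assumptions not confirmed anywhere in the thread: that "Error: 2" is a POSIX errno value, and that "RC = 0x6c00" is a raw wait()-style status with the exit code in the high byte. The helper name `to_signed_byte` is made up for illustration.

```python
import errno
import os


def to_signed_byte(code: int) -> int:
    """Reinterpret an unsigned 8-bit value (0..255) as a signed byte."""
    return code - 256 if code > 127 else code


exit_code = 195
# The client reports the same byte as decimal, hex and signed decimal.
print(hex(exit_code), to_signed_byte(exit_code))  # 0xc3 -61

# Assumption: "Error: 2" in the [ERROR] lines is a POSIX errno.
# On POSIX systems errno 2 is ENOENT (no such file or directory).
copy_errno = 2
print(errno.errorcode[copy_errno], "-", os.strerror(copy_errno))

# Assumption: "RC = 0x6c00" is a raw wait()-style status word, where
# the child's exit code sits in bits 8..15.
raw_rc = 0x6C00
child_exit_code = (raw_rc >> 8) & 0xFF
print(child_exit_code)  # 108
```

If those assumptions hold, the copy step fails because the source file never existed, which would fit the science application exiting before it wrote any output.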
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Let me reserve the right to comment on the use of the 7.0.18 alpha client. I'm slowly testing all sciences with this release and so far have HFCC/FAAH/DSFL/GFAM/SN2S/HPF2 downloading and processing correctly with this test client. I won't be testing HCMD2, as that has pretty much run out, but we know that it worked on 7.0.17.
----------------------------------------
--//--
N.B. I'm going to put together a BOINC test client FAQ [and sticky it in Start Here], as this is becoming too big an issue on the forums, with folks using this client and seeing many different problems. This client is simply not ready for general production use, or even for a beta level. Those not shying away are invited to report problems at the developers' forums, or to join the alpha mailing list; both are read by the developers. More hands-on testers are always welcome.

edit: HPF2 tested too.
[Edit 1 times, last edit by Former Member at Feb 25, 2012 11:49:31 AM] |
||
|
|
Dark Angel
Veteran Cruncher Australia Joined: Nov 11, 2005 Post Count: 728 Status: Offline Project Badges:
|
Oh, I acknowledge this is an alpha client. If using it costs me results but finding a fix helps development, then so be it. I accept the cost to my points etc. as a consequence of my fiddling, and should I find a fix, or at least isolate the issue, I will add it to the alpha list email for the benefit of all concerned.
----------------------------------------
Currently being moderated under false pretences |
||
|
|
Dark Angel
Veteran Cruncher Australia Joined: Nov 11, 2005 Post Count: 728 Status: Offline Project Badges:
|
Ok, one from the Slackware box.
----------------------------------------
Result Log
Result Name: E206219_ 441_ C.24.C18H9N3OSSe.01742986.1.set1d06_ 0--
<core_client_version>7.0.18</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[00:06:41] Number of jobs = 16
[00:06:41] Starting job 0,CPU time has been restored to 0.000000.
Application exited with RC = 0x6c00
[ERROR] Failed to open either source or destination files while copying C.24.C18H9N3OSSe.01742986.1.noopt.bp86.sto6g.n.sp/53.0 to C.24.C18H9N3OSSe.01742986.1.noopt.bp86.sto6g.n.sp.53.0. Error: 2
[ERROR] Failed to open either source or destination files while copying C.24.C18H9N3OSSe.01742986.1.noopt.bp86.sto6g.n.sp/stdout.txt to C.24.C18H9N3OSSe.01742986.1.noopt.bp86.sto6g.n.sp.out. Error: 2
[00:06:42] Finished Job #0
00:06:42 (19550): called boinc_finish
</stderr_txt>
]]>

As this has been the same error message on three machines with different OSes and versions, that pretty much pins it down to the only common variables: the client and/or the user.
Currently being moderated under false pretences |
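The "same error on every machine" claim can be checked mechanically rather than by eye. The Python sketch below strips the per-job molecule name from each copy-failure line and keeps only which file was missing and the error number, so logs from different jobs can be compared directly. The function name `failure_signature` is hypothetical, and the regular expression assumes the message wording stays exactly as it appears in the logs above.

```python
import re

# Matches the copy-failure lines exactly as they appear in the logs above.
COPY_ERROR = re.compile(
    r"\[ERROR\] Failed to open either source or destination files "
    r"while copying (\S+) to (\S+)\. Error: (\d+)"
)


def failure_signature(stderr_text: str) -> set:
    """Return {(missing_file, error_number)} with job-specific names removed."""
    # Collapse line wraps so a message split across lines still matches.
    text = " ".join(stderr_text.split())
    signature = set()
    for source, _dest, err in COPY_ERROR.findall(text):
        # Source looks like "<job directory>/<file>"; keep only the file.
        _job_dir, _, filename = source.partition("/")
        signature.add((filename, int(err)))
    return signature


ubuntu_log = (
    "[ERROR] Failed to open either source or destination files while copying "
    "C.23.C18H10N2S2Se.02136584.3.noopt.bp86.sto6g.n.sp/53.0 to "
    "C.23.C18H10N2S2Se.02136584.3.noopt.bp86.sto6g.n.sp.53.0. Error: 2 "
    "[ERROR] Failed to open either source or destination files while copying "
    "C.23.C18H10N2S2Se.02136584.3.noopt.bp86.sto6g.n.sp/stdout.txt to "
    "C.23.C18H10N2S2Se.02136584.3.noopt.bp86.sto6g.n.sp.out. Error: 2"
)
slackware_log = (
    "[ERROR] Failed to open either source or destination files while copying "
    "C.24.C18H9N3OSSe.01742986.1.noopt.bp86.sto6g.n.sp/53.0 to "
    "C.24.C18H9N3OSSe.01742986.1.noopt.bp86.sto6g.n.sp.53.0. Error: 2 "
    "[ERROR] Failed to open either source or destination files while copying "
    "C.24.C18H9N3OSSe.01742986.1.noopt.bp86.sto6g.n.sp/stdout.txt to "
    "C.24.C18H9N3OSSe.01742986.1.noopt.bp86.sto6g.n.sp.out. Error: 2"
)

same_failure = failure_signature(ubuntu_log) == failure_signature(slackware_log)
print(same_failure, sorted(failure_signature(ubuntu_log)))
```

Matching signatures only show that the jobs fail at the same step; they do not by themselves say whether the client or the local setup is responsible.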
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Saw you posting on the alpha list. All 9 generally available sciences, including CEP2, ran fine on Windows 7 64-bit with this .18 build. There is one previous report in the CEP2 forum regarding Linux, but in that case the failure happened on multiple sciences.
http://www.worldcommunitygrid.org/forums/wcg/...ead,32449_offset,0#360048
--//-- |
||
|
|
Dark Angel
Veteran Cruncher Australia Joined: Nov 11, 2005 Post Count: 728 Status: Offline Project Badges:
|
This isn't happening right on start-up. When I first loaded the client, the first units it grabbed were CEP2, which then tried to run before the science application files had finished downloading, so it made sense when they failed. These failures are happening after some time, a couple of days of run time at least, and with nothing in the local logs other than the normal unit started and unit finished messages, only they're under a minute apart.
----------------------------------------
Currently being moderated under false pretences |
||
|
|
Dark Angel
Veteran Cruncher Australia Joined: Nov 11, 2005 Post Count: 728 Status: Offline Project Badges:
|
I reset the project on this machine in case the CEP2 application files are corrupt, but I don't think that's very likely, given that the same thing happens on several machines. We'll see what happens anyway.
----------------------------------------
Currently being moderated under false pretences |
||
|
|
Dark Angel
Veteran Cruncher Australia Joined: Nov 11, 2005 Post Count: 728 Status: Offline Project Badges:
|
Ok, that didn't work.
----------------------------------------
Sun 26 Feb 2012 10:32:38 AM EST | World Community Grid | [task] Process for E206307_092_C.24.C19H12N2S2Si.00873940.2.set1d06_0 exited
Sun 26 Feb 2012 10:32:38 AM EST | World Community Grid | [task] process exited with status 195
Sun 26 Feb 2012 10:32:38 AM EST | World Community Grid | [task] task_state=EXITED for E206307_092_C.24.C19H12N2S2Si.00873940.2.set1d06_0 from handle_exited_app
Sun 26 Feb 2012 10:32:38 AM EST | World Community Grid | [sched_op] Deferring communication for 1 min 17 sec
Sun 26 Feb 2012 10:32:38 AM EST | World Community Grid | [sched_op] Reason: Unrecoverable error for task E206307_092_C.24.C19H12N2S2Si.00873940.2.set1d06_0 (process exited with code 195 (0xc3, -61))
Sun 26 Feb 2012 10:32:38 AM EST | World Community Grid | [task] result state=COMPUTE_ERROR for E206307_092_C.24.C19H12N2S2Si.00873940.2.set1d06_0 from CS::report_result_error
Sun 26 Feb 2012 10:32:38 AM EST | | [cpu_sched_debug] Request CPU reschedule: application exited

Result Name: E206307_ 092_ C.24.C19H12N2S2Si.00873940.2.set1d06_ 0--
<core_client_version>7.0.18</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[10:32:35] Number of jobs = 16
[10:32:35] Starting job 0,CPU time has been restored to 0.000000.
Application exited with RC = 0x6c00
[ERROR] Failed to open either source or destination files while copying C.24.C19H12N2S2Si.00873940.2.noopt.bp86.sto6g.n.sp/53.0 to C.24.C19H12N2S2Si.00873940.2.noopt.bp86.sto6g.n.sp.53.0. Error: 2
[ERROR] Failed to open either source or destination files while copying C.24.C19H12N2S2Si.00873940.2.noopt.bp86.sto6g.n.sp/stdout.txt to C.24.C19H12N2S2Si.00873940.2.noopt.bp86.sto6g.n.sp.out. Error: 2
[10:32:36] Finished Job #0
10:32:36 (22968): called boinc_finish
</stderr_txt>
]]>
Currently being moderated under false pretences |
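With debug flags on, the client log gets long quickly, so it can help to filter it down to the tasks that ended in COMPUTE_ERROR. The Python sketch below does that for lines shaped like the ones above ("timestamp | project | message"). The names `LogEntry`, `parse_line` and `failed_tasks` are hypothetical, and the pattern assumes the "[task] result state=... for ..." wording seen in this log; other client versions may word it differently.

```python
import re
from typing import List, NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: str
    project: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Split 'timestamp | project | message'; return None for other lines."""
    parts = [part.strip() for part in line.split("|", 2)]
    if len(parts) != 3:
        return None
    return LogEntry(*parts)


# Wording taken from the log above; treat it as an assumption elsewhere.
RESULT_STATE = re.compile(r"\[task\] result state=(\w+) for (\S+)")


def failed_tasks(lines: List[str]) -> List[str]:
    """Return names of tasks whose result state is COMPUTE_ERROR."""
    failed = []
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        match = RESULT_STATE.search(entry.message)
        if match and match.group(1) == "COMPUTE_ERROR":
            failed.append(match.group(2))
    return failed


sample = [
    "Sun 26 Feb 2012 10:32:38 AM EST | World Community Grid | "
    "[task] process exited with status 195",
    "Sun 26 Feb 2012 10:32:38 AM EST | World Community Grid | "
    "[task] result state=COMPUTE_ERROR for "
    "E206307_092_C.24.C19H12N2S2Si.00873940.2.set1d06_0 "
    "from CS::report_result_error",
    "Sun 26 Feb 2012 10:32:38 AM EST | | "
    "[cpu_sched_debug] Request CPU reschedule: application exited",
]

print(failed_tasks(sample))
```

Timestamps are kept as plain strings here, because parsing the "EST" zone abbreviation depends on the local platform.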
||
|
|
|