| World Community Grid Forums
Thread Status: Active Total posts in this thread: 10
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hi.
This one has failed four times on different rigs (workunit ID 167309754):
E200104_ 194_ A.19.C15H10N2S2.47.1.set1d06_ 4
<core_client_version>6.2.14</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[13:22:19] Number of jobs = 16
[13:22:19] Starting job 0,CPU time has been restored to 0.000000.
[13:22:27] Starting new Job
Application exited with RC = 0x100
[ERROR] Failed to open either source or destination files while copying A.20.C15H7N3OS.59.0.noopt.bp86.sto6g.n.sp/53.0 to A.20.C15H7N3OS.59.0.noopt.bp86.sto6g.n.sp.53.0. Error: 2
[13:22:28] Finished Job #0
called boinc_finish
Exiting 195
</stderr_txt>
]]> |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Error 195 is known to the techs and to the crunching public who have suffered it. See http://www.worldcommunitygrid.org/forums/wcg/...ead,29284_offset,0#284653 and https://secure.worldcommunitygrid.org/forums/...ad,29238_offset,50#284916
Many volunteers did impromptu installs of Ubuntu on USB sticks and the like to get into CEP2, which may not have created a stable condition for this science. Your setups may be the key.
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Sekerob.
quote// Your setups may be the key. //quote My Ubuntu is installed on the hard drive, always has been, and is very stable, thanks all the same. Maybe the task is at fault this time? |
||
|
|
Randzo
Senior Cruncher Slovakia Joined: Jan 10, 2008 Post Count: 339 Status: Offline
|
Has this error been discussed?
<core_client_version>6.10.17</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
SIGSEGV: segmentation violation
Stack trace (14 frames):
[0x86d160f] [0x873aeb0] [0xb7819400] [0x83a270f] [0x8301d57] [0x82e7528] [0x8211237] [0x81ef582] [0x81f5447] [0x86aaad8] [0x86ae11a] [0x804f4d3] [0x873cfaa] [0x8048131]
Exiting...
</stderr_txt>
]]> |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
-193 or 193. Whether or not the minus sign is present indicates whether it is BOINC or system related. SIGSEGV, a segmentation violation, reads as grave to me. http://en.wikipedia.org/wiki/SIGSEGV
-----------------------------------------193 per the BFS is found here: http://boincfaq.mundayweb.com/index.php?langu...41bf34d6d4d3d90c9fea5e089 and it is apparently no longer used by newer BOINC versions to signal an issue. I found about 30 cases where 193 was mentioned on the forums. The search term to use is code AND 193 AND (0xc1, AND -63), which results in: http://www.worldcommunitygrid.org/forums/wcg/...=0&sort=1&rows=20 Not a lot on
WCG
[Edit 1 times, last edit by Sekerob at Aug 22, 2010 7:41:36 AM] |
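The two failure messages quoted in this thread, "code 195 (0xc3, -61)" and "code 193 (0xc1, -63)", are arithmetically consistent with one byte printed three ways: decimal, hexadecimal, and reinterpreted as a signed 8-bit value. The sketch below only reproduces that arithmetic. The function name `describe_exit_code` is made up for illustration and is not part of BOINC, and nothing here says what either code means.

```python
def describe_exit_code(code: int) -> str:
    """Format an 8-bit exit status as 'decimal (hex, signed 8-bit)'.

    Hypothetical helper for illustration; not taken from the BOINC source.
    """
    if not 0 <= code <= 255:
        raise ValueError("expected an 8-bit exit status (0-255)")
    # Two's complement: values of 128 and above wrap around to negative numbers.
    signed = code - 256 if code >= 128 else code
    return f"{code} (0x{code:x}, {signed})"


print(describe_exit_code(195))  # 195 (0xc3, -61)
print(describe_exit_code(193))  # 193 (0xc1, -63)
```

Both outputs match the strings in the posted logs, which suggests the negative number in parentheses is the same exit status seen as a signed byte rather than a separate error code.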
||
|
|
seippel
Former World Community Grid Tech Joined: Apr 16, 2009 Post Count: 392 Status: Offline
|
PPL,
We have received a handful of these errors and are currently working with the research team to determine if there is anything that can be done to eliminate them in the future. Thank you for your continued contribution and patience.
Seippel |
||
|
|
3A4scLiRhJVcdT2K9q9kQNxzxYJ9
Advanced Cruncher Joined: Nov 16, 2009 Post Count: 72 Status: Offline |
Right now I have 48 pages of errors (out of 49) which are connected with CEP2 only. All of these errors are coming from my Linux machine. I have switched to C4CW and have only had 5 errors so far. I will leave the setting this way until I see some improvements on CEP2 for my Linux system. I use the 6.10.56 interface on that machine right now; I'll switch to 6.10.58 if it goes final before Sept 18th.
Error logs are quite different... ranging from
Application exited with RC = 0x9
[ERROR] Failed to open either source or destination files while copying A.23.C19H12N2OSi.52.0.noopt.bp86.sto6g.n.sp/53.0 to A.23.C19H12N2OSi.52.0.noopt.bp86.sto6g.n.sp.53.0. Error: 2
[ERROR] Failed to open either source or destination files while copying A.23.C19H12N2OSi.52.0.noopt.bp86.sto6g.n.sp/stdout.txt to A.23.C19H12N2OSi.52.0.noopt.bp86.sto6g.n.sp.out. Error: 2
[15:52:11] Finished Job #0
...to...
INFO: No state to restore. Start from the beginning.
[13:52:41] Number of jobs = 16
[13:52:41] Starting job 0,CPU time has been restored to 0.000000.
[13:52:56] Starting new Job
[13:52:58] Qink name = fldman
[13:53:09] Qink name = gesman
[13:53:09] Qink name = scfman
[13:54:43] Qink name = anlman
[13:54:45] End of Job
[13:54:48] Finished Job #0
[13:54:48] Starting job 1,CPU time has been restored to 76.784327.
[13:54:48] Starting new Job
[13:54:48] Qink name = fldman
[13:54:49] Qink name = gesman
[13:54:50] Qink name = scfman
</stderr_txt> |
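With dozens of pages of failed results, it can help to pull the key fields out of each log instead of reading them by eye. The following is a minimal sketch that extracts the exit code and the failed copy operations from a log in the format pasted in this thread. The names `summarize`, `EXIT_RE` and `COPY_RE` are made up for illustration, and reading "Error: 2" as the operating system's errno 2 (ENOENT, "no such file or directory") is an assumption that the thread does not confirm.

```python
import errno
import re

# Patterns written against the log excerpts quoted in this thread (assumed format).
EXIT_RE = re.compile(r"process exited with code (\d+)")
COPY_RE = re.compile(
    r"\[ERROR\] Failed to open either source or destination files "
    r"while copying (\S+) to (\S+?)\. Error: (\d+)"
)


def summarize(log: str) -> dict:
    """Return the exit code (or None) and every failed copy found in the log."""
    exit_match = EXIT_RE.search(log)
    copies = [
        {"source": src, "dest": dst, "error": int(num)}
        for src, dst, num in COPY_RE.findall(log)
    ]
    return {
        "exit_code": int(exit_match.group(1)) if exit_match else None,
        "failed_copies": copies,
    }


# Excerpt taken from the first post in this thread.
sample = (
    "<message> process exited with code 195 (0xc3, -61) </message> "
    "Application exited with RC = 0x100 "
    "[ERROR] Failed to open either source or destination files while copying "
    "A.20.C15H7N3OS.59.0.noopt.bp86.sto6g.n.sp/53.0 to "
    "A.20.C15H7N3OS.59.0.noopt.bp86.sto6g.n.sp.53.0. Error: 2 "
    "[13:22:28] Finished Job #0"
)

report = summarize(sample)
print(report["exit_code"])
for failure in report["failed_copies"]:
    # Assumption: the number after "Error:" is an OS errno value.
    name = errno.errorcode.get(failure["error"], "unknown")
    print(failure["source"], "->", failure["dest"], name)
```

If that errno reading holds, the copy failed because the source file was never written, which would point at the job step before the copy and not at the copy itself.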
||
|
|
sk..
Master Cruncher Joined: Mar 22, 2007 Post Count: 2324 Status: Offline
|
Do a restart and crunch more projects and you will get fewer errors.
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I have all but three projects sending to me. As we are alone so far on cep2, most of what I get is from that one. I have returned 344 WUs for just over 82 days' worth so far.
I do have one error listed and seem to remember another one that may have just disappeared due to age. I know the error I have listed happened when, like right now, I had four of the buggers running at once. Usually I can avoid that, and I have had valid returns from many that had multiple errors. Now, don't get me wrong here, I am not averse to folks that like getting badges. Not my thing, but I can see how it could be. Running just one project is, I think, a great way to have more errors. I bet if you add cep2 back on with the c4cw (no errors so far here, but only 18 returned) you would have fewer errors. Fewer errors means more good returns in the long run. |
||
|
|
JmBoullier
Former Community Advisor Normandy - France Joined: Jan 26, 2007 Post Count: 3716 Status: Offline
|
"Running just one project is, I think, a great way to have more errors."
That cannot be put as a general statement. I usually run a single project per machine and I have errors only when the WUs themselves are in error. However, I don't put just any project on just any machine. And it is true that running CEP2 alone on a machine which does not have as much memory as its threads would need is taking a serious risk.
In my quad with 2 GB it is possible for 4 CEP2 tasks to run fine, but at the beginning of this project I saw occasions where this triggered an awful memory shortage when what I was doing also needed a lot of RAM. This results in a serious drop in performance at best, and if RAM and HD are not in top condition it may well lead to errors. Each CEP2 WU encapsulates 16 distinct jobs of various sizes, and if the most demanding of these jobs happen to be running at the same time on a multicore then things can get bad.
Nowadays I never leave 4 CEP2 WUs running unattended on this quad. Also, I shall certainly not run them on my old P4HT or on my EeePC when the Windows version is released. For these smaller machines HCMD2, HCC and C4CW are certainly more appropriate. |
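The memory argument above comes down to simple arithmetic: the number of tasks that can safely run at once is limited by whichever runs out first, cores or RAM. Here is a rough sketch of that estimate. Only the quad core and the 2 GB come from the post; the per-task peak and the amount reserved for the operating system and other programs are hypothetical placeholders, not measured CEP2 figures, and the function name `max_concurrent_tasks` is made up for illustration.

```python
def max_concurrent_tasks(cores: int, total_ram_mb: int,
                         per_task_peak_mb: int, reserved_mb: int) -> int:
    """Estimate how many tasks fit at once without exhausting RAM.

    Hypothetical helper; it plans for every task hitting its peak at the same time.
    """
    if per_task_peak_mb <= 0:
        raise ValueError("per_task_peak_mb must be positive")
    # RAM left for crunching after the OS and other programs take their share.
    usable_mb = max(total_ram_mb - reserved_mb, 0)
    # Cannot run more tasks than cores, nor more than fit in usable RAM.
    return min(cores, usable_mb // per_task_peak_mb)


# Quad core with 2 GB as in the post; 500 MB peak and 512 MB reserved are made-up values.
print(max_concurrent_tasks(cores=4, total_ram_mb=2048,
                           per_task_peak_mb=500, reserved_mb=512))
```

Planning for the peak instead of the average matches the worst case described above, where the most demanding jobs of several WUs happen to run at the same moment.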
||
|
|
|