World Community Grid Forums
Category: Completed Research Forum: Human Proteome Folding - Phase 2 Thread: WU Errors |
Thread Status: Active Total posts in this thread: 16
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The qj800 batch is not being nice: all my systems are erroring out on these tasks.
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
When: start, middle, or end?
What system: Windows, Linux, Mac? Any messages in the client, or in the result log on the Result Status page?
Edit: Just fetched a qj805 and it is running fine so far, past the first checkpoint. I assume this is then limited to qj800, and not the whole series.
[Edit 1 times, last edit by Former Member at Dec 3, 2012 6:30:49 PM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hmm, running tasks from the series to try to see what choj01 sees [it's a pure guess], I had a qj808 go south at about the 4th checkpoint [Linux] with:
2047 World Community Grid 12/3/2012 9:21:09 PM [sched_op] Deferring communication for 1 min 42 sec
2048 World Community Grid 12/3/2012 9:21:09 PM [sched_op] Reason: Unrecoverable error for task qj808_00057_5
2049 World Community Grid 12/3/2012 9:21:09 PM Computation for task qj808_00057_5 finished
2050 World Community Grid 12/3/2012 9:21:09 PM Output file qj808_00057_5_0 for task qj808_00057_5 absent

BOINCTasks logged a 193:
6.40 hpf2 qj808_00057_5 00:20:21 (00:20:16) 12/3/2012 9:21:44 PM 12/3/2012 9:23:44 PM 99,59 Reported: Computation error (193,)

And the Result Log lists a SIGSEGV (a classic):
Result Log
Result Name: qj808_ 00057_ 5--
<core_client_version>7.0.39</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
SIGSEGV: segmentation violation
Stack trace (18 frames):
[0x8789e9f] [0x877cfa4] [0xf77a1400] [0x87dfb6f] [0x87badaa] [0x876d273] [0x822fc2f] [0x843b2da] [0x843c503] [0x870e50b] [0x85e9a87] [0x85eb7c5] [0x805cf24] [0x8331f6b] [0x83f3cdd] [0x83f3f5c] [0x87ed062] [0x8048131]
Exiting...
</stderr_txt>
]]>

The error list description reflects the absent output file problem:
EXIT_SIGNAL 193 The client, Manager and/or application will exit when getting the exit signal.

Really no idea if this is the same issue as seen by the OP. Got 805, 806, 808 and 809 running on this quad. HPF2 *was* known to be rock solid on Linux [not suffering the infamous /711 bug that Windows tasks occasionally display]. |
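The three numbers in the `<message>` line, 193 (0xc1, -63), look like one exit status shown three ways: decimal, hexadecimal, and the low byte read as a signed 8-bit value. The arithmetic can be sketched in Python as follows. The function name `describe_exit_code` is hypothetical, and this only reproduces the numbers; it is not taken from the BOINC client source.

```python
def describe_exit_code(code: int) -> str:
    """Format an exit status as decimal, hex, and signed 8-bit.

    Hypothetical helper for illustration only; it mirrors the layout
    of the log line "process exited with code 193 (0xc1, -63)".
    """
    low = code & 0xFF                            # keep only the low byte
    signed = low - 256 if low >= 128 else low    # two's-complement view of that byte
    return f"{code} (0x{low:x}, {signed})"


print(describe_exit_code(193))
```

For 193 the low byte is 0xc1, and 193 - 256 = -63, which matches the values in the result log above.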
||
|
[SG-FC] dingdong
Cruncher Joined: Nov 26, 2007 Post Count: 8 Status: Offline Project Badges: |
Lost 8 WUs (qj817_xxx) within the first seconds of starting.
Result Log
Result Name: qj817_ 00020_ 13--
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
Unzulässige Funktion. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
ERROR:: Exit at: .\nblist.cc line:711
</stderr_txt>
]]> |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Lost 8 WUs (qj817_xxx) within the first seconds of starting.
Result Log
Result Name: qj817_ 00020_ 13--
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
Unzulässige Funktion. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
ERROR:: Exit at: .\nblist.cc line:711
</stderr_txt>
]]>

This is the "infamous /711 bug" Rob was referring to in his last post, so probably NOT the one mentioned in the OP. |
||
|
ICG Studio
Cruncher Joined: Jan 17, 2011 Post Count: 12 Status: Offline Project Badges: |
Yeah, I got the same problem. Please check my ICG computer:
Windows 7 x64, AMD Phenom X6, 8 GB RAM, 2x GPU. It happens when I run something on OpenCL (Einstein@Home or POEM): when CPU utilization is close to 100%, something goes wrong with the HPF WU after 2 to 4 seconds, so I can say they all fail at the beginning. Second computer: a laptop with an AMD dual core. When I crunch CAL Collatz on it, the same error appears after a few seconds, only on HPF WUs. All other projects, from World Community Grid and outside WCG, work fine with 0 errors. |
||
|
ICG Studio
Cruncher Joined: Jan 17, 2011 Post Count: 12 Status: Offline Project Badges: |
Latest WUs:
qj902_ 00117_ 12-- - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0xA10E7210 read attempt to address 0xA10E7210
qj902_ 00116_ 8-- - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0xA10E7210 read attempt to address 0xA10E7210
All of them...

On the laptop: WU qj669_ 00081_ 7--
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
powell exceeding maximum iterations
ERROR:: Exit at: .\dock_structure.cc line:401
</stderr_txt>
]]> |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The line:401 error was one we saw quite a few of some years ago. When a fix was applied [very hard to replicate], the 711 appeared instead, but 10x less frequently. I had not thought there could still be the occasional line:401.
If clients start throwing serial errors out of nowhere, and there's no reporting by others within a short period, it's most often a system problem. Step 1: Restart system. |
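To tell the known failure signatures apart (nblist.cc line:711 versus dock_structure.cc line:401) when a machine throws many errors in a row, the `Exit at:` line of each stderr text can be tallied. The following is a minimal sketch, assuming the stderr texts have already been copied from the Result Status page into strings. The names `exit_signature` and `tally` are hypothetical and not part of any WCG or BOINC tool.

```python
import re
from collections import Counter

# Matches lines such as "ERROR:: Exit at: .\nblist.cc line:711"
EXIT_AT = re.compile(r"Exit at:\s*(\S+)\s+line:(\d+)")


def exit_signature(stderr_text):
    """Return (source file, line number) from an 'Exit at:' line, or None."""
    match = EXIT_AT.search(stderr_text)
    if match is None:
        return None  # e.g. a SIGSEGV or access violation with no 'Exit at:' line
    # Drop the leading ".\" (or "./") so only the file name remains.
    filename = match.group(1).replace("\\", "/").rsplit("/", 1)[-1]
    return filename, int(match.group(2))


def tally(stderr_texts):
    """Count how often each exit signature occurs in a batch of results."""
    return Counter(exit_signature(text) for text in stderr_texts)


samples = [
    r"ERROR:: Exit at: .\nblist.cc line:711",
    r"powell exceeding maximum iterations ERROR:: Exit at: .\dock_structure.cc line:401",
    "SIGSEGV: segmentation violation",
]
for signature, count in tally(samples).items():
    print(signature, count)
```

A batch where every result shares one signature points to a single cause, while results with no `Exit at:` line at all (counted under `None`) are a different kind of failure.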
||
|
ICG Studio
Cruncher Joined: Jan 17, 2011 Post Count: 12 Status: Offline Project Badges: |
Thank you for the advice. I will reboot later and then feed my machine HPF2 WUs again :)
Kind regards, Stef |
||
|
themoonscrescent
Veteran Cruncher UK Joined: Jul 1, 2006 Post Count: 1320 Status: Offline Project Badges: |
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
ERROR:: Exit at: .\nblist.cc line:711
</stderr_txt>
]]>

I am getting this as well, on 2 machines. I have tried rebooting both, but to no avail. 41 errors and counting, all reading as above?
[Edit 1 times, last edit by themoonscrescent at Dec 10, 2012 9:02:39 AM] |
||
|
|