| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 52
|
|
| Author |
|
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
Even for the devs, the moment the system moves these results off, the links go nowhere. Copy/Paste preserves the information!
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Result Log
Result Name: E236452_280_S.234.C18Ge1H14N2S2Se1.SHKRLNRXMPNCHC-UHFFFAOYSA-N.11_s1_14_1--
<core_client_version>7.6.22</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 195 (0xc3)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[11:42:17] Number of jobs = 8
[11:42:17] Starting job 0,CPU time has been restored to 0.000000.
Application exited with RC = 0x1
[11:42:18] Finished Job #0
11:42:23 (4764): called boinc_finish
</stderr_txt>
]]> |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
For the Wingman it is the same.
Result Log
Result Name: E236452_280_S.234.C18Ge1H14N2S2Se1.SHKRLNRXMPNCHC-UHFFFAOYSA-N.11_s1_14_0--
<core_client_version>7.6.22</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 195 (0xc3)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[06:39:11] Number of jobs = 8
[06:39:11] Starting job 0,CPU time has been restored to 0.000000.
Application exited with RC = 0x1
[06:39:16] Finished Job #0
06:39:22 (4696): called boinc_finish
</stderr_txt>
]]> |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Result Log
Result Name: E236459_564_S.254.C22Ge1H18O1S4.FOJQOZXGJBZDSI-UHFFFAOYSA-N.11_s1_14_3--
<core_client_version>7.6.22</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 195 (0xc3)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[21:44:50] Number of jobs = 8
[21:44:50] Starting job 0,CPU time has been restored to 0.000000.
Application exited with RC = 0x1
[21:44:51] Finished Job #0
21:44:57 (5800): called boinc_finish
</stderr_txt>
]]> |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Workunits containing Ge and Se are killing the algorithm, always or at least very often. Is a bug fix coming in the near future?
Result Log
Result Name: E236444_844_S.204.C22H14O1S1Se1.PCUMVGUKSLMUNL-UHFFFAOYSA-N.10_s1_14_3--
<core_client_version>7.4.42</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 195 (0xc3)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[18:02:03] Number of jobs = 8
[18:02:03] Starting job 0,CPU time has been restored to 0.000000.
Application exited with RC = 0x1
[08:08:25] Finished Job #0
08:08:26 (10704): called boinc_finish
</stderr_txt>
]]> |
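To check the Ge/Se claim against a batch of result names, something like the sketch below could be used. It assumes the dot-separated layout seen in the names posted in this thread, with the molecular formula as the third field; that layout is inferred from these examples only, not from any project documentation. The function names are made up for illustration.

```python
import re
from typing import Dict

def formula_elements(result_name: str) -> Dict[str, int]:
    """Return element counts from the formula field of a result name.

    Assumption: fields are separated by '.', and the third field is the
    molecular formula (e.g. 'C22H14O1S1Se1'), as in the names posted here.
    """
    # The forum display inserts spaces after underscores; drop them first.
    compact = result_name.replace(" ", "")
    formula = compact.split(".")[2]
    counts: Dict[str, int] = {}
    # An element symbol is one capital letter plus an optional lowercase
    # letter, followed by an optional count.
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts

def contains_ge_or_se(result_name: str) -> bool:
    """True if the formula field mentions germanium (Ge) or selenium (Se)."""
    elements = formula_elements(result_name)
    return any(symbol in elements for symbol in ("Ge", "Se"))
```

Running `contains_ge_or_se` over the names of failed and successful results would show whether the failures really cluster on Ge and Se molecules.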
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Result Log
Result Name: E236467_595_S.272.C26Ge1H20O2S3.YFWFHENQBDWKCP-UHFFFAOYSA-N.18_s1_14_3--
<core_client_version>7.6.22</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 195 (0xc3)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[14:34:29] Number of jobs = 8
[14:34:29] Starting job 0,CPU time has been restored to 0.000000.
Application exited with RC = 0x1
[14:34:30] Finished Job #0
14:34:35 (8880): called boinc_finish
</stderr_txt>
]]> |
||
|
|
Gatchaman
Cruncher Joined: Feb 29, 2012 Post Count: 49 Status: Offline |
Correct me if I'm wrong, but when a normal WU errors out, aren't about 10 results from that box double-checked or validated by a wingman? Have I got that wrong?
If not, does that mean a lot of the WUs we are crunching as I type are just validation jobs? They validate WU results that are actually "okay", but because of these Ge WUs we are being forced to pointlessly validate each other over and over again until the Ge WUs are either pulled or a fix is found.
----------------------------------------
"Sadly this project is turning into nonscience......" |
||
|
|
svincent
Advanced Cruncher Joined: Jan 3, 2009 Post Count: 53 Status: Offline |
I believe the figure of 10 used to refer to the maximum number of times a workunit might be sent out if it kept returning incorrect results. I think it has been reduced to 4 or 5 now. Even then, once two clients successfully complete the task, no more copies are sent out.
In any case, the Ge workunits are failing immediately, so no time is being wasted; it's the workunits that run for 15 hours and then fail that are the issue. |
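As a rough illustration of the rule described above, here is a small sketch of how many copies of a workunit would go out. It assumes a quorum of two successful results and a cap of five copies (the post says "4 or 5", so the cap is a guess), and it sends copies one at a time, whereas the real scheduler sends them in parallel. The function name and parameters are hypothetical.

```python
from typing import Iterable, Tuple

def copies_sent(outcomes: Iterable[bool], quorum: int = 2,
                max_copies: int = 5) -> Tuple[int, bool]:
    """Simulate sending copies of one workunit until it is settled.

    outcomes: result of each copy in the order sent (True = success).
    Returns (number of copies sent, whether the quorum was reached).
    Both defaults are assumptions taken loosely from the post above.
    """
    sent = 0
    successes = 0
    for ok in outcomes:
        # Stop as soon as enough clients succeeded or the cap is hit.
        if successes >= quorum or sent >= max_copies:
            break
        sent += 1
        if ok:
            successes += 1
    return sent, successes >= quorum
```

With these assumptions, a workunit that always fails stops after five copies, while one that succeeds twice in a row stops after two, which matches the point that two good results end the resends.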
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Gatchaman is talking about the reliability rating. After an error result, N sequential valid results are required before the device is allowed to run work alone again. Because of the latency of waiting for the wingman, the number you typically see is more than N.
|
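The reliability rule described above can be sketched as a simple streak counter. This is only an illustration: the class name, the default of 10 for N, and the choice to start a new host as trusted are all assumptions, not the project's actual implementation.

```python
class HostReliability:
    """Track whether a host may run workunits without a wingman.

    Hypothetical model: one error resets the streak, and the host is
    trusted again only after `required_valid` consecutive valid results.
    """

    def __init__(self, required_valid: int = 10):
        self.required_valid = required_valid  # the "N" from the post; 10 is a placeholder
        self.valid_streak = required_valid    # assumption: a host starts out trusted

    def record(self, valid: bool) -> None:
        """Record one validated result for this host."""
        if valid:
            self.valid_streak += 1
        else:
            self.valid_streak = 0  # an error wipes out the streak

    @property
    def trusted(self) -> bool:
        return self.valid_streak >= self.required_valid
```

Because `record` can only be called once a result has actually been validated, a host keeps receiving double-checked work while earlier results wait on a wingman, which is one way to read the "N+" remark.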
||
|
|
Gatchaman
Cruncher Joined: Feb 29, 2012 Post Count: 49 Status: Offline |
So basically we're stuck in an endless loop of validating each other until those Ge WUs are pulled.
Sigh.... Easter break is coming up. There's a good chance I'll power down the boxes until this situation has been cleared up and we can get back to crunching normal WUs. Any feedback from the science guys and girls would be greatly appreciated.
----------------------------------------
"Sadly this project is turning into nonscience......" |
||
|
|
|