| World Community Grid Forums
Thread Status: Active Total posts in this thread: 2
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
This is the name of the topic in the Start Here FAQ index, item 4-X presently, which has 3 sub-topics, link-marked (2) (3) (4), and Doneske is suffering too, seeing his WU in some sort of loop.
----------------------------------------
Why this new topic? Well, I'm testing a lot and saw the same thing last night in a CEP2 Result Log that came off my quad, tracked it back to the BOINC message log tab, and thought it needed a highlight.

14-Nov-2010 19:21:31 [CPDN] Task hadsm3fub_k2e2_006460276_6 exited with zero status but no 'finished' file
14-Nov-2010 19:21:31 [CPDN] If this happens repeatedly you may need to reset the project.
14-Nov-2010 19:21:31 [WCG] Task c4cw_target02_051025434_0 exited with zero status but no 'finished' file
14-Nov-2010 19:21:31 [WCG] If this happens repeatedly you may need to reset the project.
14-Nov-2010 19:21:31 [WCG] Task E200529_783_A.26.C19H10N2OS4.185.1.set1d06_1 exited with zero status but no 'finished' file
14-Nov-2010 19:21:31 [WCG] If this happens repeatedly you may need to reset the project.
14-Nov-2010 19:21:31 [WCG] Task c4cw_target02_051019134_0 exited with zero status but no 'finished' file
14-Nov-2010 19:21:31 [WCG] If this happens repeatedly you may need to reset the project.
14-Nov-2010 19:21:31 [WCG] Task E200529_784_A.26.C19H10N2OS4.99.1.set1d06_0 exited with zero status but no 'finished' file
14-Nov-2010 19:21:31 [WCG] If this happens repeatedly you may need to reset the project.

Did I RTFM?... Well, I wrote them FAQs, so instantly knew the very likely cure, as the result of me own incomplete actions: I had moved the BOINC data_dir into its own partition, logical drive L:\BOINC, all clean to itself, with larger-than-normal block sizes of 64K to minimize file fragmentation, and had forgotten to tell my Avast AntiVirus software to exclude that data area from scanning. In the previous install trial the data dir had been restored to the best-for-all C:\ProgramData\BOINC (Vista and W7).

Why?: Because AVs have to work very hard to keep scanning what BOINC is doing. In the CEP2 case that means some 6600+ intermediate task files, which during some checkpointing can take several minutes to read and update... the slower the disk, the longer it takes.
Thus: Anyone incurring these "zero status" and "No heartbeat, exiting" messages might want to do some RTFMs. It's not a guaranteed cure, but on busy systems it will certainly help to reduce these lost-computing-time events.

Learn and reap: See something bad in the Result Log >>>> track it back to the message log on or before the same timestamp, all stored for weeks in the stdoutdae.txt log file found in the BOINC data dir (path printed at start of client session). Support likes to hear of those too, to round out the picture. Good morning world.
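For anyone who would rather not eyeball weeks of log by hand, here is a minimal Python sketch of that track-back step. It assumes the stdoutdae.txt lines look like the message log excerpts quoted above (timestamp, project in square brackets, then the message); the function names and the example path are hypothetical, not part of BOINC.

```python
import re
from collections import Counter

# Line format assumed from the message log excerpts in this post, e.g.
# 14-Nov-2010 19:21:31 [WCG] Task <name> exited with zero status but no 'finished' file
ZERO_STATUS = re.compile(
    r"^(?P<stamp>\d{2}-\w{3}-\d{4} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<project>[^\]]+)\] "
    r"Task (?P<task>\S+) exited with zero status but no 'finished' file"
)


def find_zero_status_events(lines):
    """Return (timestamp, project, task) for every 'zero status' line."""
    events = []
    for line in lines:
        match = ZERO_STATUS.match(line.strip())
        if match:
            events.append((match["stamp"], match["project"], match["task"]))
    return events


def count_by_task(events):
    """Count how often each task was hit, to spot the repeat offenders."""
    return Counter(task for _, _, task in events)


def scan_log(path):
    """Read a log file (path is whatever your data dir is) and list the events."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_zero_status_events(handle)
```

Usage would be something like `scan_log(r"L:\BOINC\stdoutdae.txt")`, with the path swapped for your own data dir. Tasks that show up repeatedly in `count_by_task(...)` are the ones worth mentioning to Support.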
WCG
Please help to make the Forums an enjoyable experience for All! |
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I've had a few like that Sek.
Was wondering if I was maybe doing something else on the machine at the time? Dunno though, can't remember. It seems to get around it in the end though. :thumbsup:

Result Log
Result Name: E200535_702_A.23.C18H13NSSe2Si.6.2.set1d06_0--
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
17:52:32 (428): No heartbeat from core client for 30 sec - exiting
17:52:33 (428): No heartbeat from core client for 30 sec - exiting
17:52:34 (428): No heartbeat from core client for 30 sec - exiting
17:52:35 (428): No heartbeat from core client for 30 sec - exiting
17:52:36 (428): No heartbeat from core client for 30 sec - exiting
[17:52:37] Number of jobs = 16
[17:52:37] Starting job 0,CPU time has been restored to 0.000000.
No heartbeat: Exiting
[17:52:39] Number of jobs = 16
[17:52:39] Starting job 0,CPU time has been restored to 0.000000.
[17:54:25] Finished Job #0
[17:54:25] Starting job 1,CPU time has been restored to 92.484375.
[17:59:09] Finished Job #1
[17:59:09] Starting job 2,CPU time has been restored to 365.109375.
[20:17:36] Finished Job #2
[20:17:36] Starting job 3,CPU time has been restored to 8527.312500.
[20:22:48] Finished Job #3
[20:22:48] Starting job 4,CPU time has been restored to 8826.203125.
[20:26:25] Finished Job #4
[20:26:25] Starting job 5,CPU time has been restored to 9034.250000.
[20:30:05] Finished Job #5
[20:30:05] Starting job 6,CPU time has been restored to 9244.828125.
[20:33:36] Finished Job #6
[20:33:36] Starting job 7,CPU time has been restored to 9449.343750.
[20:37:36] Finished Job #7
[20:37:36] Starting job 8,CPU time has been restored to 9679.421875.
[20:41:03] Finished Job #8
[20:41:03] Starting job 9,CPU time has been restored to 9877.156250.
[20:44:48] Finished Job #9
[20:44:48] Starting job 10,CPU time has been restored to 10089.515625.
[20:52:14] Finished Job #10
[20:52:14] Starting job 11,CPU time has been restored to 10529.000000.
[20:56:46] Finished Job #11
[20:56:46] Starting job 12,CPU time has been restored to 10791.562500.
[21:20:22] Finished Job #12
[21:20:22] Starting job 13,CPU time has been restored to 12175.812500.
[22:05:29] Finished Job #13
[22:05:29] Starting job 14,CPU time has been restored to 14855.796875.
[22:46:15] Finished Job #14
[22:46:15] Starting job 15,CPU time has been restored to 17275.546875.
[23:34:17] Finished Job #15
23:34:23 (1392): called boinc_finish
</stderr_txt>
]]>

Will suck it and see and let them run.