World Community Grid Forums
Category: Completed Research Forum: Drug Search for Leishmaniasis Forum Thread: Why computation does no longer run ? |
Thread Status: Active Total posts in this thread: 15
|
Author |
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
This morning I found 2 WUs that were 'stalled' on the same PC: Task Manager showed their CPU usage as 0. I let the WUs 'run' until the Time Left went to zero (shown as a -), and then rebooted. Both tasks are running again with a valid Time Left and are using the CPU again. I will report back on what happens.
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
My suspicion is that the control process spawns a worker process after each job, and something goes awry there. Another possibility is that some piece of security software does not like one process spawning another, which is why I set exemptions on the BOINC data_dir.
I would be interested to know whether the hang occurs right at a checkpoint save. This can be recorded by adding the <checkpoint_debug>1</checkpoint_debug> flag to cc_config.xml and setting *Write to Disk* back to its default of 60 seconds if it is set differently. The job slot log would show that too, but having it in the client message log records it permanently, with a time-stamp, in the stdoutdae.txt file. --//-- |
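For anyone unsure where the flag goes, here is a minimal sketch of a cc_config.xml that carries only this one setting. It assumes the file sits in the BOINC data directory and has no other entries; if yours already has a <log_flags> section, add the flag inside it instead of replacing the file.

```xml
<cc_config>
  <log_flags>
    <!-- log a message each time a science app checkpoints -->
    <checkpoint_debug>1</checkpoint_debug>
  </log_flags>
</cc_config>
```

The client normally has to re-read its config files, or be restarted, before the new flag takes effect.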
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
BoincTasks shows a Checkpoint column, and there were non-zero times on display. I can't remember whether the Checkpoint times were also frozen; certainly the CPU times were frozen, while the Elapsed time and Time Left values were changing. I'll check when I next spot a stalled task.
Thanks for your comments and suggestions - much appreciated.
----------------------------------------
[Edit 1 times, last edit by Former Member at Sep 16, 2011 12:48:38 PM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Yes, BOINCTasks shows them but does not log them; it only tells when the last one was attempted by the science app [which one can also see in the task properties], and only for as long as BOINCTasks is up and connected to the client. I was hoping that logged data would tell when the freezing started: the gap between Elapsed and CPU time allows calculating back to the approximate matching time in the client log.
--//-- |
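The back-calculation mentioned above can be sketched roughly as follows. This is only an illustration: the function name and the sample times are hypothetical, and it assumes the task was running at close to 100% CPU efficiency before it stalled, so that Elapsed and CPU time grew together until then.

```python
from datetime import datetime, timedelta


def estimate_stall_start(observed_at, elapsed, cpu_time):
    """Estimate when a task stopped using the CPU.

    Assumption: before the stall, CPU time kept pace with elapsed time,
    so the whole gap between the two accumulated after the stall began.
    """
    gap = elapsed - cpu_time
    if gap < timedelta(0):
        raise ValueError("CPU time exceeds elapsed time")
    return observed_at - gap


# Hypothetical readings taken from the task list at the moment of observation.
observed_at = datetime(2011, 9, 16, 12, 30, 0)
elapsed = timedelta(hours=5, minutes=10)
cpu_time = timedelta(hours=3, minutes=40)

stall_start = estimate_stall_start(observed_at, elapsed, cpu_time)
print(stall_start)  # 2011-09-16 11:00:00
```

The resulting time is the place to look in stdoutdae.txt for the last checkpoint message before the freeze. If the task was already running below full CPU efficiency, the estimate will land somewhat too early.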
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Those who experience non-progressing / stuck tasks are kindly referred to this thread, https://secure.worldcommunitygrid.org/forums/wcg/viewthread_thread,31764 , where a WCG tech is asking for observational reports [CPU activity information].
--//-- |
||
|
|