Thread Status: Active
Total posts in this thread: 15
This topic has been viewed 10345 times and has 14 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Why computation does no longer run ?

This morning I found 2 WUs that were 'stalled' on the same PC; Task Manager showed CPU usage as 0. I let the WUs 'run' until the Time left went to zero (shown as a -), and then rebooted. Both tasks are running again with a valid Time left and are using the CPU again. I will report back on what happens.
[Sep 16, 2011 11:30:56 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
DSFL: Why computation does no longer run ?

My suspicion is that the control process spawns a worker process after each job. Something goes awry there, or possibly some piece of security software does not like one process spawning another, which is why I set exemptions on the BOINC data_dir.

I would be interested to know whether the hang occurs right at a checkpoint save. This can be recorded by adding the <checkpoint_debug>1</checkpoint_debug> flag to cc_config.xml and setting *Write to Disk* to the default of 60 seconds if it differs. The job slot log would show that too, but having it in the client message log records it permanently, with a timestamp, in the stdoutdae.txt file.
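
For anyone unsure where the flag goes: in cc_config.xml, log flags sit inside a <log_flags> element under the <cc_config> root. Below is a minimal Python sketch that adds the flag to the text of an existing file, or builds a minimal one if the text is empty. The function name is made up for illustration, and the sketch works on a string, so reading and writing the real file in your data_dir is left to you. The client has to re-read the file (for example, by restarting) before the flag takes effect.

```python
import xml.etree.ElementTree as ET


def enable_checkpoint_debug(config_xml: str) -> str:
    """Return cc_config.xml text with <checkpoint_debug>1</checkpoint_debug> set.

    Hypothetical helper: takes the current file content as a string
    (empty string if no cc_config.xml exists yet).
    """
    if config_xml.strip():
        root = ET.fromstring(config_xml)
    else:
        root = ET.Element("cc_config")

    # Log flags belong inside <log_flags>; create the section if missing.
    log_flags = root.find("log_flags")
    if log_flags is None:
        log_flags = ET.SubElement(root, "log_flags")

    # Reuse an existing flag element so it is not duplicated.
    flag = log_flags.find("checkpoint_debug")
    if flag is None:
        flag = ET.SubElement(log_flags, "checkpoint_debug")
    flag.text = "1"

    return ET.tostring(root, encoding="unicode")


print(enable_checkpoint_debug("<cc_config><log_flags></log_flags></cc_config>"))
# <cc_config><log_flags><checkpoint_debug>1</checkpoint_debug></log_flags></cc_config>
```

Any other options already in the file are kept as they are; only the one flag is added or switched on.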

--//--
[Sep 16, 2011 11:45:40 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: DSFL: Why computation does no longer run ?

BoincTasks shows a Checkpoint column, and there were non-zero times on display. I can't remember whether the Checkpoint times were also frozen; the CPU times certainly were, while the Elapsed time and Time Left values kept changing. I'll check when I next spot a stalled task.

Thanks for your comments and suggestions - much appreciated.
----------------------------------------
[Edit 1 times, last edit by Former Member at Sep 16, 2011 12:48:38 PM]
[Sep 16, 2011 12:48:04 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: DSFL: Why computation does no longer run ?

Yes, BOINCTasks shows them, but it does not log them. It only tells when the last one was attempted by the science app [which one can see in the task properties as well], and only for as long as BOINCTasks is up and connected to the client. I was hoping that logged data might tell when the freezing started. The gap between Elapsed and CPU time would allow calculating back to the approximate matching time in the client log.
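
The back-calculation could look roughly like the Python sketch below. It assumes that CPU time tracked Elapsed time closely before the stall (near 100% CPU efficiency) and that the task was not suspended in between; otherwise the estimate will be too early. The function name and the sample times are made up for illustration and are not readings from any real task.

```python
from datetime import datetime, timedelta


def estimate_stall_start(observed_at: datetime,
                         elapsed: timedelta,
                         cpu_time: timedelta) -> datetime:
    """Estimate when a task stopped using the CPU.

    Assumption: CPU time and Elapsed time advanced together until the
    stall, so the gap between them is roughly how long the task has
    been frozen.
    """
    gap = elapsed - cpu_time
    if gap < timedelta(0):
        raise ValueError("CPU time exceeds Elapsed time; cannot estimate")
    return observed_at - gap


# Hypothetical readings taken from the task list at 11:30.
observed = datetime(2011, 9, 16, 11, 30, 0)
start = estimate_stall_start(
    observed,
    elapsed=timedelta(hours=7, minutes=45),
    cpu_time=timedelta(hours=5, minutes=15),
)
print(start)  # 2011-09-16 09:00:00
```

The resulting time is where to start looking in stdoutdae.txt for the last checkpoint message before the freeze.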

--//--
[Sep 16, 2011 1:13:55 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: DSFL: Why computation does no longer run ?

Those who experience non-progressing / stuck tasks are kindly referred to this thread: https://secure.worldcommunitygrid.org/forums/wcg/viewthread_thread,31764 where a WCG tech is asking for observational reports [CPU activity information].

--//--
[Sep 19, 2011 4:04:25 PM]