Total posts in this thread: 14
This topic has been viewed 60881 times and has 13 replies
Rickjb
Veteran Cruncher
Australia
Joined: Sep 17, 2006
Post Count: 666
Status: Offline
Re: exited with zero status but no 'finished' file

Thanks, Sek. I think your statement "anything that virtually monopolizes the CPU time for > 30 seconds can cause this" is the key. I'm not doing any other work with the machine, but something is hogging the CPU occasionally. As I mentioned in the XS thread, just before one of the crashes, scrolling of BOINC Manager messages froze. I switched to Task Manager. The 4 x DDDT2 WUs were still at 25% each, but a couple of seconds later I saw the CPU usage dropping in the little Task Manager icon in the system tray. In the Performance tab graphs, CPU usage in all cores was fluctuating wildly. It recovered to about 100% very briefly, and then the DDDT2s crashed and restarted.

Now what is hogging the CPU?
Am running 2 x HFCC + 2 x DDDT2 to see whether the crash interval is affected. Also have Task Mgr Process tab sorted in reverse CPU usage order. Ability to record snapshots would be so-o-o-o useful.
----------------------------------------
[Edit 1 times, last edit by Rickjb at Aug 17, 2010 6:05:42 PM]
[Aug 17, 2010 5:51:23 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: exited with zero status but no 'finished' file

Set the BM to "Show active Tasks". It was added as a stop-gap measure, as refreshing that list takes considerable juice, particularly when caches are bigger. I always exit the BM, leaving the core client running, when I don't need to use the GUI. BM and core client talk to each other with RPC calls over the well-described port 31416 on IP 127.0.0.1. Those need to be well and truly uninhibited.
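As a rough way to check that nothing is blocking that local connection, here is a minimal sketch that only tests whether a TCP connection to 127.0.0.1:31416 can be opened. It does not speak the BOINC RPC protocol, and the function name `rpc_port_reachable` is made up for this illustration.

```python
import socket


def rpc_port_reachable(host="127.0.0.1", port=31416, timeout=2.0):
    """Return True if a plain TCP connection to host:port can be opened.

    This only checks that something is listening and that no local
    firewall rule drops the connection; it sends no RPC request.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    print("core client port reachable:", rpc_port_reachable())
```

If this prints False while the core client is running, a firewall or security suite getting in the way of the loopback connection would be one thing to look at.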

I need to add that the hogging is then dense enough that no interrupt can get through. BOINC sciences run at idle priority (nice 19), trying to sneak spare cycles, and for that the core client (CC) is always listening with one ear and telling the sciences it's okay. The sciences in turn listen to the CC, but when it's silent for 30 seconds... lemming action.
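The 30-second rule described above can be pictured as a simple watchdog. This is only an illustrative sketch, not the actual BOINC code: the class name, method names and the injectable clock are all invented for the example, and only the 30-second figure comes from the post.

```python
import time

HEARTBEAT_TIMEOUT = 30.0  # seconds of silence tolerated, per the post above


class HeartbeatWatchdog:
    """Tracks when the science application last heard from the core client."""

    def __init__(self, timeout=HEARTBEAT_TIMEOUT, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock              # injectable so the logic can be tested
        self.last_heard = clock()

    def heartbeat(self):
        """Call whenever a message from the core client arrives."""
        self.last_heard = self.clock()

    def should_exit(self):
        """True once the core client has been silent longer than the timeout."""
        return self.clock() - self.last_heard > self.timeout
```

A science application built this way would poll `should_exit()` in its main loop. If the whole machine stalls for more than 30 seconds, the core client cannot send its heartbeat in time, and every running task gives up at roughly the same moment, which matches the simultaneous crashes described earlier in the thread.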

That BFS link you put up and removed again is vague enough to leave one pulling hair. Either volunteer computing needs an IT degree, or we simply let it go as being part of it... fails happen... it's not going to stop me from using the system as I want to.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Aug 17, 2010 6:09:00 PM]
Rickjb
Veteran Cruncher
Australia
Joined: Sep 17, 2006
Post Count: 666
Status: Offline
Re: exited with zero status but no 'finished' file

> Sekerob said: Set the BM to "Show active Tasks"
?? I don't understand active Tasks - all I know in Boinc Manager (BM) is the (all) Tasks tab.

Nothing is actually hogging the CPU (!!!) As I mentioned in recent edits to my XS thread, I've managed to observe the failure events in Task Manager, and have saved some screenshots of the Performance tab: one leading into a crash (crash 1) and one showing pagefile usage through a crash (crash 2).
The CPU activity (%) falls erratically for 20-30 seconds before the WCG tasks die, and the "System Idle Process" (%) takes up the missing CPU activity (%). Something is causing the CPU to become inactive, then the BOINC processes do not respond, and you've described the rest.
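If CPU usage were logged once per second by some means (how the samples are collected is left open here), a sustained dip like the one described above could be picked out automatically. The sketch below is a hypothetical helper; the function name, the 50% threshold and the 20-sample minimum are assumptions chosen to match the "20-30 seconds" observation, not measured values.

```python
def find_dips(samples, threshold=50.0, min_length=20):
    """Find sustained dips in a list of CPU-usage percentages.

    Returns (start, end) index pairs, end exclusive, for every run of
    consecutive samples below `threshold` lasting at least `min_length`
    samples. With one sample per second, min_length is in seconds.
    """
    dips = []
    start = None
    for i, value in enumerate(samples):
        if value < threshold:
            if start is None:
                start = i               # a new low run begins here
        elif start is not None:
            if i - start >= min_length:
                dips.append((start, i))
            start = None                # run ended, reset
    # Handle a low run that continues to the end of the log.
    if start is not None and len(samples) - start >= min_length:
        dips.append((start, len(samples)))
    return dips
```

Lining up the reported index ranges with the timestamps of the task restarts in the BOINC message log would show whether every crash is preceded by such a dip.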

Thank you for the explanation of the "how". Now to try to find the "why".

My homebuilt Z80-CP/M system of 1980 didn't suffer this kind of @@#!! but I guess that's progress.

I withdrew the link to the BoincFaqService because I discovered that you have already included it in your FAQ (BOINC: "Zero Status" & "If...atedly...." Messages), vague or otherwise. It is the last link in your Further info section near the bottom of the page: Result '(result)' exited with zero st...o 'finished' file
[Aug 18, 2010 3:01:02 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: exited with zero status but no 'finished' file

Sorry, from client 6.6 or so the Tasks tab has a Show Active Tasks button that hides all the Ready to Start WUs, which considerably reduces the load of building up a refresh. One other reason I prefer BOINCTasks: it allows the filtering/contraction of the RtS jobs, so it will for instance say CEP2 12 tasks for 2:06 days. On its project tab it will then show e.g. 0.55 days per core (4), so you know pretty exactly what's queued without having to pound a calculator.
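For anyone who does want to pound the calculator, the per-core figure is just the queued time divided by the number of cores. The sketch below assumes that "2:06 days" means 2 days and 6 hours; that reading, and the function name, are assumptions for this example rather than documented BOINCTasks behaviour.

```python
def days_per_core(queued, cores):
    """Convert a 'days:hours' queue estimate into days of work per core."""
    days_text, hours_text = queued.split(":")
    total_days = int(days_text) + int(hours_text) / 24
    return total_days / cores


# 2 days 6 hours is 2.25 days; spread over 4 cores that is 0.5625 days each.
print(days_per_core("2:06", 4))
```

Under that assumption the result lands in the same ballpark as the 0.55 days per core quoted as an example.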

Blame disk I/O too. There's a reason I've yet to find a defrag function in Linux, my main crunching platform these days. The VM size at the moment is just 65MB. Under Windows that would be at least 3.5GB on a quad with 3GB RAM.
[Aug 18, 2010 3:13:20 PM]