Thread Status: Active
Total posts in this thread: 19
This topic has been viewed 11631 times and has 18 replies
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1950
Status: Recently Active
Never ending WUs blocking slots

I first complained about this probably more than a year ago, but recently it has been cropping up more often, in a couple of variations. I noticed one of them just today; the properties of the WU (before I aborted it) showed:
Application: Mapping Cancer Markers 7.41
Name: MCM1_0160058_6568
State: Running
Received: 3/3/2020 10:01:01 AM
Report deadline: 3/10/2020 11:01:02 AM
Estimated computation size: 47,923 GFLOPs
CPU time: 23:55:27
CPU time since checkpoint: 00:05:10
Elapsed time: 22:04:19
Estimated time remaining: ---
Fraction done: 100.000%
Virtual memory size: 6.39 MB
Working set size: 7.54 MB
Directory: slots/11
Process ID: 13360
Progress rate: 4.680% per hour
Executable: wcgrid_mcm1_map_7.41_windows_x86_64

The host in question is an idle 8-thread i7 with 8GB of RAM, running Windows 10/1903.
I have noticed this and similar variations on all kinds of hosts, including a brand spanking new 9th Gen i5-9700 (6 cores, 6 threads) with 16GB of RAM, running Windows 10/1909 and the latest BOINC client available for download from WCG.
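
For anyone who wants to check a pasted properties dump without reading it by eye, here is a minimal Python sketch. The "stuck" rule it uses (state Running, fraction done at 100%, no remaining-time estimate) is only an assumption drawn from the symptoms described in this post, not an official BOINC definition. The helper names `parse_properties`, `hms_to_seconds` and `looks_stuck` are hypothetical.

```python
# Labels copied from the properties dump in this post. Longer labels come
# first so "CPU time since checkpoint" is not mistaken for "CPU time".
LABELS = [
    "CPU time since checkpoint",
    "Estimated time remaining",
    "Fraction done",
    "Elapsed time",
    "CPU time",
    "State",
    "Name",
]

# A shortened copy of the dump quoted in this post.
SAMPLE = """\
Name MCM1_0160058_6568
State Running
CPU time 23:55:27
CPU time since checkpoint 00:05:10
Elapsed time 22:04:19
Estimated time remaining ---
Fraction done 100.000%
"""


def parse_properties(text):
    """Return a dict of the known fields found in a pasted properties dump."""
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        for label in LABELS:
            if line.startswith(label):
                # Accept both "Label value" and "Label: value".
                fields[label] = line[len(label):].lstrip(" :").strip()
                break
    return fields


def hms_to_seconds(value):
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def looks_stuck(fields):
    """Heuristic (an assumption, see above): running, 100% done, no estimate."""
    done = float(fields["Fraction done"].rstrip("%"))
    return (
        fields["State"] == "Running"
        and done >= 100.0
        and fields["Estimated time remaining"] == "---"
    )
```

Calling `looks_stuck(parse_properties(SAMPLE))` applies the heuristic to the task above, and `hms_to_seconds` makes it easy to compare CPU time against elapsed time for the same task.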

Ralf
----------------------------------------

[Mar 6, 2020 12:39:41 AM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7660
Status: Offline
Re: Never ending WUs blocking slots

Have you tried either the suspend/resume routine or the shutdown and/or reboot routine to see if that allows the work unit to continue normally?

Edit: I do not have any machines running Win 10, but have never experienced this behavior in either Win 7 or in Linux.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
----------------------------------------
[Edit 1 times, last edit by Sgt.Joe at Mar 6, 2020 2:23:31 AM]
[Mar 6, 2020 2:22:17 AM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1950
Status: Recently Active
Re: Never ending WUs blocking slots

> Have you tried either the suspend/resume routine or the shutdown and/or reboot routine to see if that allows the work unit to continue normally?
Yes, it doesn't help. Besides, there are a lot of hosts that I simply can't babysit all the time.
> Edit: I do not have any machines running Win 10, but have never experienced this behavior in either Win 7 or in Linux.
> Cheers
I have only one Linux machine running right now (Linux Mint 19.3, Core Duo with 3GB of RAM), where I don't think I have ever seen this problem.
I have seen this, IIRC, once on my MacBook Pro (i7 8T, 8GB RAM, High Sierra), but otherwise across a wide range of Windows hosts, including Windows 7, Windows 8.1 (the very machine I am typing this on) and Windows Server 2003, besides Windows 10. So I don't think this is a Windows 10 specific issue. I only mentioned the configuration initially because in the past some people have been harping on the idea that the cause might be an older CPU, an older Windows version, or not having the latest BOINC client. But there don't seem to be any obvious circumstances under which this happens, and I usually only notice it when I check the daily stats and see a larger than usual deviation in the points for a stats update.

Ralf
----------------------------------------

[Mar 6, 2020 7:36:32 AM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1950
Status: Recently Active
Re: Never ending WUs blocking slots

And here's another one, from a different host (i7 8T, 8GB RAM, Windows 10/1909):
Application: Mapping Cancer Markers 7.41
Name: MCM1_0160091_0363
State: Running
Received: 3/4/2020 5:29:33 PM
Report deadline: 3/11/2020 6:29:33 PM
Estimated computation size: 47,859 GFLOPs
CPU time: 04:24:58
CPU time since checkpoint: 00:07:06
Elapsed time: 13:07:20
Estimated time remaining: ---
Fraction done: 100.000%
Virtual memory size: 6.39 MB
Working set size: 10.47 MB
Directory: slots/3
Process ID: 11572
Progress rate: 7.560% per hour
Executable: wcgrid_mcm1_map_7.41_windows_x86_64


----------------------------------------

[Mar 6, 2020 4:16:02 PM]
Hanski
Senior Cruncher
Finland
Joined: Nov 14, 2005
Post Count: 157
Status: Offline
Re: Never ending WUs blocking slots

I have the same problem on one machine with Win10, but rebooting the machine has always fixed it. The hard disk in that machine is failing, and I think that is the reason.
----------------------------------------
<><
[Mar 6, 2020 4:27:43 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1950
Status: Recently Active
Re: Never ending WUs blocking slots

> I have the same problem on one machine with Win10, but rebooting the machine has always fixed it. The hard disk in that machine is failing, and I think that is the reason.
In all of my cases, rebooting has not helped. Either the WU keeps running and blocking a slot at 100%, or in some cases it terminates quickly with "Computation Error"...
And it happens even on brand spanking new PCs, within hours of taking them out of the shipping box and the initial setup, and ALL machines are workstations in everyday use (though sometimes idle for prolonged periods of time). And besides those seemingly random blocking WUs, they are crunching hundreds if not thousands of WUs at the same time just fine...

Ralf
----------------------------------------

[Mar 6, 2020 5:13:07 PM]
Jim1348
Veteran Cruncher
USA
Joined: Jul 13, 2009
Post Count: 1066
Status: Offline
Re: Never ending WUs blocking slots

> I have only one Linux machine running right now (Linux Mint 19.3, Core Duo with 3GB of RAM), where I don't think I have ever seen this problem.

Same here. On Windows, I always suspect the anti-virus of hanging up work. It may help to exclude the BOINC folders, but that may not stop the real-time monitoring processes from interfering as they inspect the work or the Internet communications.
[Mar 6, 2020 6:06:42 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1950
Status: Recently Active
Project Badges:
Reply to this Post  Reply with Quote 
Re: Never ending WUs blocking slots

>> I have only one Linux machine running right now (Linux Mint 19.3, Core Duo with 3GB of RAM), where I don't think I have ever seen this problem.

> Same here. On Windows, I always suspect the anti-virus for hanging up work. It may help to exclude the BOINC folders, but that may not stop the real-time monitoring processes from interfering as they inspect the work, or else the communications on the Internet.
But why would it affect only one in hundreds of WUs that run successfully on the same machines, even concurrently with the blocking WUs? When it occurs, it is usually just one of 4, 6 or 8 concurrent WUs (depending on the number of cores/threads of the host CPU), though a couple of times I have seen two WUs blocking in this manner at the same time, and the occurrence seems rather random...

Ralf
----------------------------------------

[Mar 6, 2020 6:36:49 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7660
Status: Offline
Re: Never ending WUs blocking slots

You say neither suspending nor rebooting has solved the problem. Have you tried a complete shutdown and then a cold start?
After the cold start, I would snooze BOINC and see if any unknown processes are running and/or using an inordinate amount of CPU cycles. Then I would suspend every running work unit except the affected one and see if it progresses normally as the only one running.
The only other idea that comes to mind is a memory problem which is causing a memory bottleneck somewhere.
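
The "suspend everything except the affected one" step could be scripted roughly as below. This is only a sketch: it assumes the `boinccmd` command-line tool that ships with BOINC and its `--task <project URL> <task name> suspend` form, and it only builds the command lines rather than running them. The project URL and task names have to be taken from your own client (for example from `boinccmd --get_tasks`); the function name `isolate_task_commands` is hypothetical.

```python
def isolate_task_commands(project_url, task_names, keep):
    """Build boinccmd invocations that suspend every task except `keep`.

    Nothing is executed here; the caller decides whether to run the
    returned argument lists (e.g. with subprocess.run).
    """
    if keep not in task_names:
        raise ValueError(f"task to keep is not in the task list: {keep}")
    return [
        ["boinccmd", "--task", project_url, name, "suspend"]
        for name in task_names
        if name != keep
    ]
```

Each returned list can be passed to `subprocess.run` once you have checked it; replacing `suspend` with `resume` in the same commands undoes the change afterwards.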
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Mar 6, 2020 10:36:07 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Never ending WUs blocking slots

It seems we had this problem (or one similar) where a work unit got to the very end and hung. I sent in the directory listing from the slot directory showing that it had finished the work unit but had not started compressing the results for transmission. Never received any response to the post.
EDIT: It was in the ARP project where a work unit was showing as running but no process was known to the OS. Looked like BOINC started the WU but never transferred control to the executable. A little different scenario...
----------------------------------------
[Edit 1 times, last edit by Doneske at Mar 6, 2020 11:17:23 PM]
[Mar 6, 2020 11:09:08 PM]