Thread Status: Active
Total posts in this thread: 4
This topic has been viewed 2255 times and has 3 replies
Xenu666
Cruncher
Joined: Sep 1, 2006
Post Count: 4
How do I force BOINC to use all CPUs on a 96 CPU machine?

Hi all
I would guess some of you have already encountered the problem I have...

I have access to a couple of HPE DL380 G10 servers, each with dual Xeon Platinum 8160 CPUs, running Windows Server 2019 Standard edition. That makes 2 x 24 cores, doubled by Hyper-Threading, for 96 logical CPUs per machine.

My problem is that BOINC doesn't seem to be able (by itself) to utilize all logical CPUs.
In Performance Monitor the physical CPUs are presented as NUMA node 0 and NUMA node 1, and each of them has CPUs 0 to 47.

BOINC recognizes that the machine has 96 logical CPUs and therefore starts 96 processes to run simultaneously. However, those processes don't get assigned to one logical CPU each; instead, the 96 processes are handled by a subset of the CPUs. Sometimes 85 logical CPUs handle all 96 processes, sometimes only 70 or 65. In all cases one NUMA node is fully occupied, with more processes than CPUs, while the other has several logical CPUs idle.

I would guess that the total computing power is somewhere around 75-80% utilized.
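As a rough check of that estimate, here is a small sketch of the arithmetic. It assumes two NUMA nodes with 48 logical CPUs each and that a CPU counts as busy as soon as one process runs on it; the splits shown (59/37, 74/22, 79/17) are the ones that would produce the 85, 70 and 65 busy CPUs mentioned above, not measured values.

```python
# Assumption: two NUMA nodes with 48 logical CPUs each, 96 BOINC processes total.
CPUS_PER_NODE = 48
TOTAL_PROCESSES = 96


def busy_cpus(procs_on_node0: int) -> int:
    """Busy logical CPUs when the processes are split unevenly across two nodes."""
    procs_on_node1 = TOTAL_PROCESSES - procs_on_node0
    # A node cannot keep more CPUs busy than it has, however many processes it holds.
    return min(procs_on_node0, CPUS_PER_NODE) + min(procs_on_node1, CPUS_PER_NODE)


def utilization(procs_on_node0: int) -> float:
    """Fraction of all logical CPUs that have at least one process to run."""
    return busy_cpus(procs_on_node0) / (2 * CPUS_PER_NODE)


for split in (48, 59, 74, 79):
    print(f"{split}/{TOTAL_PROCESSES - split} split: "
          f"{busy_cpus(split)} busy CPUs, {utilization(split):.0%} utilized")
```

Under these assumptions, 65 to 85 busy CPUs corresponds to roughly 68% to 89% utilization, which brackets the 75-80% guess.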

I can, however, manually assign each BOINC process to either NUMA node 0 or 1 via Task Manager -> Details -> right-click on the process -> "Set affinity", choosing group (NUMA node) 0 or 1, to make all logical CPUs work. Once a process is assigned to the less busy NUMA node, it is automagically given a free logical CPU.

So... now to the question:
Is there a way to make either BOINC or Windows handle this "Set affinity" step for me, since I can't babysit my machines 24/7 to make them work as hard as they should?
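To make the manual procedure concrete, the decision part of it (which processes to move, and to which group) can be sketched as below. This is only the planning logic under the assumption of two groups with 48 logical CPUs each; `plan_moves` and the process IDs are hypothetical names for illustration. Actually applying the moves would need a Windows API call (SetThreadGroupAffinity is one candidate), which is not shown here and has not been tested with BOINC.

```python
from collections import defaultdict

CPUS_PER_GROUP = 48  # assumption: two groups of 48 logical CPUs, as described above


def plan_moves(process_groups: dict[int, int], n_groups: int = 2,
               cpus_per_group: int = CPUS_PER_GROUP) -> dict[int, int]:
    """Return {pid: target_group} for processes that should leave an overloaded group.

    process_groups maps a (hypothetical) process ID to its current group number.
    """
    by_group = defaultdict(list)
    for pid, group in sorted(process_groups.items()):
        by_group[group].append(pid)

    moves = {}
    for group in range(n_groups):
        # Move the surplus of an overloaded group to whichever group is emptiest.
        while len(by_group[group]) > cpus_per_group:
            target = min(range(n_groups), key=lambda g: len(by_group[g]))
            if len(by_group[target]) >= cpus_per_group:
                break  # every group is already full; moving gains nothing
            pid = by_group[group].pop()
            by_group[target].append(pid)
            moves[pid] = target
    return moves


# Example: 59 processes ended up in group 0 and 37 in group 1.
current = {pid: (0 if pid < 59 else 1) for pid in range(96)}
print(len(plan_moves(current)), "processes would be moved to group 1")
```

A scheduled task could run such a check periodically, but whether that coexists well with how BOINC starts and stops tasks is an open question.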


PS
On some machines (at least on the HPE DL380) one can set the NUMA and memory settings in the BIOS to something like "Flat" so that the NUMA node settings are ignored. This lets processes be assigned to either NUMA node 0 or 1; I think Task Manager then says "All" (groups). However, there seems to be a limit of 64 logical CPUs per NUMA node, so on a machine with 96 of them that setting ends up with NUMA node 0 having 64 CPUs and NUMA node 1 having 32. This is therefore a working solution on machines with up to 64 CPUs, but not for a machine with 80, 96 or more CPUs.
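The 64/32 split can be reproduced with a few lines, assuming the groups are simply filled up to the 64-CPU limit one after another. Only the 96-CPU result is taken from the observation above; the result for 80 CPUs is an extrapolation under the same assumption, and `flat_groups` is a hypothetical helper name.

```python
MAX_CPUS_PER_GROUP = 64  # the limit mentioned above


def flat_groups(logical_cpus: int, limit: int = MAX_CPUS_PER_GROUP) -> list[int]:
    """Fill groups up to the limit one after another (assumed "Flat" behaviour)."""
    groups = []
    remaining = logical_cpus
    while remaining > 0:
        size = min(remaining, limit)
        groups.append(size)
        remaining -= size
    return groups


for n in (64, 80, 96):
    print(n, "logical CPUs ->", flat_groups(n))
```

With 64 CPUs or fewer everything fits in one group, which is why the "Flat" setting only helps on the smaller machines.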

Thanks in advance :)
[Aug 9, 2021 9:39:44 PM]
Stiwi
Advanced Cruncher
Joined: May 19, 2012
Post Count: 75
Re: How do I force BOINC to use all CPUs on a 96 CPU machine?

Running BOINC on Windows with more than 64 cores/threads is problematic.

https://github.com/BOINC/boinc/issues/1357
https://boinc.berkeley.edu/trac/wiki/WinMulticore

As far as I know there is no solution.
[Aug 9, 2021 10:02:24 PM]
Falconet
Master Cruncher
Portugal
Joined: Mar 9, 2009
Post Count: 3315
Re: How do I force BOINC to use all CPUs on a 96 CPU machine?

I think I've seen this workaround put forward for the issue you describe: https://boinc.berkeley.edu/dev/forum_thread.php?id=8430&postid=49475#49475
----------------------------------------


- AMD Ryzen 5 1600AF 6C/12T 3.2 GHz - 85W
- AMD Ryzen 5 2500U 4C/8T 2.0 GHz - 28W
- AMD Ryzen 7 7730U 8C/16T 3.0 GHz
[Aug 9, 2021 10:19:40 PM]
Xenu666
Cruncher
Joined: Sep 1, 2006
Post Count: 4
Re: How do I force BOINC to use all CPUs on a 96 CPU machine?

Thank you for your answers :)

@Stiwi - Yes, it seems to be problematic. I thought that using 'Set affinity' in Task Manager in some automated way would do the job, but on the second link you provided they state that they will not use that function:

"We're not going to use processor affinity. If a job runs in a processor group, its affinity mask is all the processors in that group."

I don't know why they think that's a bad idea, probably because I don't know that much about it. I thought it would solve the problem since it does when I do it manually...

@Falconet - I gave it a shot and for now it seems to work pretty well. What I did was create the cc_config.xml file with this content:

<cc_config>
  <options>
    <ncpus>128</ncpus>
  </options>
</cc_config>

This tricks BOINC into thinking the machine has 128 CPUs, so it starts 128 processes. From what I read in the links provided, and in some of the pages linked from there, I understand that Windows spawns all new processes round-robin over the existing NUMA groups. Round-robin is not that precise, I would guess; it seems to be more of a 'best effort' to spread the computing load.
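For anyone who wants to roll this out on several machines, the same file content can be generated with a short script. This is only a sketch: `build_cc_config` is a hypothetical helper, it produces the text only, and writing it into the BOINC data directory (and getting the client to re-read its config) is left out because the path differs per installation.

```python
import xml.etree.ElementTree as ET


def build_cc_config(ncpus: int) -> str:
    """Return cc_config.xml text that overrides the CPU count BOINC detects."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "ncpus").text = str(ncpus)
    ET.indent(root, space="  ")  # pretty-print; needs Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


# 128 is the value used above for a machine with 96 logical CPUs.
print(build_cc_config(128))
```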

My thought was that if I flood the system with more threads than there are CPUs, even a 'not that precise' round-robin approach should eventually hit all logical CPUs.
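The reasoning can be tried out with a toy simulation. The big assumption here is that each process lands on one of the two 48-CPU nodes by a coin flip; that is not how Windows actually places processes (the imbalance observed on the real machine was larger than a coin flip produces), so this only illustrates why oversubscribing helps, not by how much.

```python
import random

CPUS_PER_NODE = 48  # assumption: two NUMA nodes with 48 logical CPUs each


def mean_busy_cpus(n_processes: int, trials: int = 2000, seed: int = 1) -> float:
    """Average number of busy CPUs when each process lands on a random node."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        on_node0 = sum(rng.random() < 0.5 for _ in range(n_processes))
        on_node1 = n_processes - on_node0
        # A node keeps at most 48 CPUs busy, no matter how many processes it holds.
        total += min(on_node0, CPUS_PER_NODE) + min(on_node1, CPUS_PER_NODE)
    return total / trials


for n in (96, 128):
    print(n, "processes:", round(mean_busy_cpus(n), 1), "busy CPUs on average")
```

With 128 processes, both nodes almost always receive at least 48 of them, so the simulated machine stays close to fully busy even though the placement is sloppy.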

This will of course make each job take longer to complete, but hopefully, going from approx. 75% CPU load to 100%, I will get more work done overall even if I lose some efficiency by forcing Windows to switch between jobs.
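The trade-off can be put into numbers with a back-of-the-envelope formula. The overhead values below are made up for illustration; the real cost of switching between 128 jobs on 96 CPUs is unknown here and would have to be measured.

```python
def throughput_gain(load_before: float, load_after: float, overhead: float) -> float:
    """Work done after the change relative to before.

    overhead is the (hypothetical) fraction of CPU time lost to task switching.
    """
    return load_after * (1.0 - overhead) / load_before


for overhead in (0.0, 0.05, 0.10, 0.25):
    gain = throughput_gain(0.75, 1.0, overhead)
    print(f"overhead {overhead:.0%}: {gain:.2f}x the previous throughput")
```

Starting from 75% load, the change pays off as long as the switching overhead stays below 25%.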

I'll see in a day or so if the machine will get more work done after this change.

Hopefully :D
[Aug 9, 2021 11:46:39 PM]