| World Community Grid Forums
Thread Status: Active Total posts in this thread: 34
hchc
Veteran Cruncher USA Joined: Aug 15, 2006 Post Count: 865 Status: Offline Project Badges:
|
Doneske said:
"Maybe there needs to be a parameter that can be entered at start up that allows the client to cater to larger thread count systems. Such as --LARGE_THREAD_COUNT that would tell the client to use larger limits both on the server and client side. By using a parameter, they wouldn't have to maintain different clients. It would be off by default. Just thinking out loud."

I like this idea. Feel free to add it to the Issue I created or maybe show some support with a Like or something. I didn't know David Anderson was against it in the past. I hope with the new CPU landscape that he'll be less resistant in 2019.

I made the case that even with a 16-thread system, it quickly can go through 1000 work units in 6-8 hours. To buffer just an entire day's worth of work on that machine, it would need to have 3000-4000 or so. I like to buffer 1-2 days personally in case I lose Internet connectivity or there's a WCG outage.
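For anyone who wants to check the buffer arithmetic above, here is a minimal Python sketch. The function name `tasks_needed` is made up for illustration, the 1000 work units in 6-8 hours are the figures from the post, and it assumes throughput stays constant over the whole buffer period.

```python
def tasks_needed(tasks_done: int, hours_taken: int, buffer_hours: int) -> int:
    """Work units needed to cover buffer_hours at the observed rate.

    Uses integer ceiling division so the result is never rounded down.
    Assumes (hypothetically) that the machine keeps a constant throughput.
    """
    total = tasks_done * buffer_hours
    return -(-total // hours_taken)


# Figures from the post: a 16-thread machine finishes 1000 WUs in 6 to 8 hours.
one_day_fast = tasks_needed(1000, 6, 24)  # 4000
one_day_slow = tasks_needed(1000, 8, 24)  # 3000
two_days_fast = tasks_needed(1000, 6, 48)  # 8000

print(one_day_slow, one_day_fast, two_days_fast)
```

So a one-day buffer lands in the 3000-4000 range quoted in the post, and a two-day buffer would need up to roughly 8000.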
[Edit 1 times, last edit by hchc at Sep 21, 2019 1:17:01 AM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I updated the client and re-compiled on my largest machine. The 1000-per-client limit is gone now. The client just downloaded about 4700 WUs as expected. I set the value at 5000 to start with. The most I can get from WCG on the big machine is 8960 (70 x 128), so the 5000 is good for now. However, once the new dual-socket servers with the 7702 processors are available, that will increase to 17920 (70 x 256).
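The two limits in this post can be sketched in a few lines of Python. This is only an illustration: the names `PER_THREAD_LIMIT`, `server_cap`, and `effective_buffer` are hypothetical, the 70-per-thread figure is taken from the post's own arithmetic, and treating the usable buffer as the smaller of the client-side and server-side limits is an assumption, not something confirmed from the BOINC or WCG code.

```python
PER_THREAD_LIMIT = 70  # per-thread figure implied by the post (70 x 128 = 8960)


def server_cap(threads: int) -> int:
    """Most work units the server would hand out for a given thread count."""
    return PER_THREAD_LIMIT * threads


def effective_buffer(client_limit: int, threads: int) -> int:
    """Assumed model: whichever limit is smaller wins."""
    return min(client_limit, server_cap(threads))


print(server_cap(128))              # 8960
print(server_cap(256))              # 17920
print(effective_buffer(1000, 128))  # 1000, the stock client limit is the bottleneck
print(effective_buffer(5000, 128))  # 5000, the recompiled client value from the post
```

Under that assumption, the stock 1000 limit is the binding constraint on a 128-thread machine, and a client limit of 5000 still sits below the 8960 server-side cap.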
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
What you get, in highly amplified fashion, if some fleet owner blows up his configuration.

9300 World Community Grid 10/3/2019 5:20:02 PM Tasks are committed to other platforms

Zika ran dry on my sole machine (Windows)... it's idling now, and darned if I'll start jumping through hoops to mitigate.
[Edit 1 times, last edit by Former Member at Oct 3, 2019 3:25:27 PM] |
||
|
|
hchc
Veteran Cruncher USA Joined: Aug 15, 2006 Post Count: 865 Status: Offline Project Badges:
|
"I updated the client and re-compiled on my largest machine. The 1000 per client is gone now. The client just downloaded about 4700 WUs as expected. I set the value at 5000 to start with. The most I can get from WCG on the big machine is 8960 (70 x 128) so the 5000 is good for now. However, once the new dual socket servers with the 7702 processors are available, that will increase to 17920 (70 x 256 )"

Just curious, but what change to the code did you make before compiling? Did you just change the 1000 to 5000 in client_state.h?
[Edit 1 times, last edit by hchc at Oct 4, 2019 4:53:15 AM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Yep, that was it. A single one-line change.
|
||
|
|
hchc
Veteran Cruncher USA Joined: Aug 15, 2006 Post Count: 865 Status: Offline Project Badges:
|
@Doneske (Former User) said in 2019:
"The 1000 limit has been brought up a couple of times in the BOINC message boards and I think David Anderson didn't want to change it. I compiled a BOINC client on a CentOS 7 system when I couldn't find a distributed client that would work due to library differences or libraries missing entirely. The issue then becomes keeping it updated. Admittedly, the client probably doesn't need to be updated that much but it would once in a while and if you have a significant number of hosts, that becomes a chore. If I was more familiar with module mapping from the linker, it would be worth trying to find the constant in a binary module and just zapping it to a different value. It may be worth bringing it up again as AMD is changing the landscape with the high core count EPYC, Threadripper, and Ryzen. Intel isn't far behind. I'm just wondering if the BOINC ecosystem is becoming slightly tiered in the respect that there are still many, many systems under 32 cores but there are also a growing number of high core count systems entering the environment. Maybe there needs to be a parameter that can be entered at start up that allows the client to cater to larger thread count systems. Such as --LARGE_THREAD_COUNT that would tell the client to use larger limits both on the server and client side. By using a parameter, they wouldn't have to maintain different clients. It would be off by default. Just thinking out loud."

I'm bumping this thread in 2024 because the GitHub issue I opened (see the link a few posts above) finally got attention, then immediately got rejected and "closed as completed."

David Anderson is still against it in 2024, as you said in 2019. But it's even more relevant with the passage of time and increasingly powerful CPUs coming to market. Now the 16c/32t Ryzens are mainstream, which used to be Threadripper and EPYC territory. And modern Threadrippers and EPYCs are getting to 192c or 256c.

That can lead to a 512-thread machine in ONE system. With a damned 1000 limit hard-coded, it would be impossible for that one system to buffer even 0.5 days of work, and that risks long periods of running out of work and long periods of downtime. It's bull*** stubbornness.

Improve logic behind WF_MAX_RUNNABLE_JOBS = 1000
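A rough way to put a number on this, using only figures already posted in this thread (a 16-thread machine finishing 1000 work units in 6-8 hours). The function name `hours_of_buffer` is made up for illustration, and the sketch assumes throughput scales linearly with thread count and that each thread is about as fast as on the 16-thread machine, which will not hold exactly on real hardware.

```python
def hours_of_buffer(threads: int, base_threads: int = 16, base_hours: float = 6.0) -> float:
    """Hours that a 1000-task buffer lasts on a machine with `threads` threads.

    Scales the observed figure (1000 tasks lasting base_hours on base_threads)
    linearly with thread count. Linear scaling is an assumption.
    """
    return base_hours * base_threads / threads


fast = hours_of_buffer(512, base_hours=6.0)  # 0.1875 hours
slow = hours_of_buffer(512, base_hours=8.0)  # 0.25 hours

print(f"{fast * 60:.2f} to {slow * 60:.2f} minutes of work")
```

Under those assumptions, 1000 tasks would keep a 512-thread machine busy for only about 11 to 15 minutes, far short of the 12 hours that a 0.5-day buffer needs.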
[Edit 1 times, last edit by hchc at Mar 9, 2024 4:16:20 PM] |
||
|
|
thunder7
Senior Cruncher Netherlands Joined: Mar 6, 2013 Post Count: 238 Status: Offline Project Badges:
|
The limit is in the client source code. This code is open source. Fork it if you don't like it.
|
||
|
|
gj82854
Advanced Cruncher Joined: Sep 26, 2022 Post Count: 122 Status: Offline Project Badges:
|
Those of us who have done this in the past have found that this has become a non-trivial task for a lot of different reasons. Plus projects (PrimeGrid) are starting to blacklist "hacked" clients. Nonsensical for such a trivial change.
|
||
|
|
bluestang
Senior Cruncher USA Joined: Oct 1, 2010 Post Count: 274 Status: Offline Project Badges:
|
David Anderson is too busy pimping Science United on the BOINC site to be bothered with improving BOINC itself.

He's proven time and time again he is a stubborn, pompous arse.

And for those that are capable of compiling their own version with a larger limit... make it available for others to easily find and get. |
||
|
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1317 Status: Offline Project Badges:
|
Three simple questions for all those who want larger limits; whether the second and third questions are relevant depends on the answer to the first...
Cheers - Al.

P.S. I agree with the comments about D.A. :-) -- I remember the problems Richard Haselgrove and co. had trying to get acknowledgment and a fix for the problem of runaway client work fetches if max_concurrent was used. |
||
|
|
|