World Community Grid - View Thread - Is it time to revisit the 35 WU limit?

World Community Grid Forums

Category: Support

Forum: BOINC Agent Support

Thread: Is it time to revisit the 35 WU limit?


Thread Status: Active
Total posts in this thread: 34


This topic has been viewed 8297 times and has 33 replies

hchc
Veteran Cruncher
USA
Joined: Aug 15, 2006
Post Count: 865
Status: Offline
Project Badges:

45 day badge for Help Cure Muscular Dystrophy

20 year badge for Mapping Cancer Markers

1 year badge for Outsmart Ebola Together

90 day badge for FightAIDS@Home - Phase 2

5 year badge for Microbiome Immunity Project

2 year badge for Africa Rainfall Project

10 year badge for OpenPandemics - COVID-19


Re: Is it time to revisit the 35 WU limit?

Doneske said:

Maybe there needs to be a parameter that can be entered at start up that allows the client to cater to larger thread count systems. Such as --LARGE_THREAD_COUNT that would tell the client to use larger limits both on the server and client side. By using a parameter, they wouldn't have to maintain different clients. It would be off by default. Just thinking out loud.

I like this idea. Feel free to add it to the Issue I created or maybe show some support with a Like or something. I didn't know David Anderson was against it in the past. I hope with the new CPU landscape that he'll be less resistant in 2019.

I made the case that even a 16-thread system can quickly go through 1000 work units in 6-8 hours. To buffer just one full day's worth of work on that machine, it would need 3000-4000 or so. I like to buffer 1-2 days personally in case I lose Internet connectivity or there's a WCG outage.
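
A rough back-of-the-envelope sketch of the buffer arithmetic above, assuming the machine keeps draining work at a constant rate. The function name `tasks_needed` is made up for illustration and is not part of BOINC.

```python
def tasks_needed(tasks_done: int, hours_taken: float, buffer_days: float) -> int:
    """Estimate how many tasks must be cached to cover `buffer_days` of work,
    given that the machine finished `tasks_done` tasks in `hours_taken` hours."""
    tasks_per_hour = tasks_done / hours_taken
    return round(tasks_per_hour * 24 * buffer_days)


# The 16-thread example: 1000 work units consumed in 6 to 8 hours.
for hours in (6, 8):
    print(f"1000 WUs in {hours} h -> {tasks_needed(1000, hours, 1)} WUs for one day")
```

Passing `buffer_days=2` doubles the figure, which is why a 1-2 day buffer sits well beyond a 1000-task cap on this kind of machine.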

----------------------------------------

i5-7500 (Kaby Lake, 4C/4T) @ 3.4 GHz
i5-4590 (Haswell, 4C/4T) @ 3.3 GHz
i5-3570 (Ivy Bridge, 4C/4T) @ 3.4 GHz

----------------------------------------
[Edit 1 times, last edit by hchc at Sep 21, 2019 1:17:01 AM]

[Sep 21, 2019 1:06:34 AM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: Is it time to revisit the 35 WU limit?

I updated the client and recompiled it on my largest machine. The 1000-per-client limit is gone now. The client just downloaded about 4700 WUs, as expected. I set the value at 5000 to start with. The most I can get from WCG on the big machine is 8960 (70 x 128), so 5000 is good for now. However, once the new dual-socket servers with the 7702 processors are available, that will increase to 17920 (70 x 256).
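
A minimal sketch of how the two limits in this post interact, assuming the effective ceiling is simply the lower of the project's per-thread allowance (70 per thread, per the figures above) and the client's own cap. The function and parameter names are hypothetical and are not BOINC identifiers.

```python
def effective_cache_limit(threads: int, per_thread_limit: int, client_cap: int) -> int:
    """Return the most tasks a host could hold, assuming the smaller of the
    project-side allowance and the client-side cap wins."""
    project_side = per_thread_limit * threads
    return min(project_side, client_cap)


# 128 threads: the project side allows 70 x 128 = 8960, the patched client allows 5000.
print(effective_cache_limit(128, 70, 5000))
# Same machine with the stock 1000 cap.
print(effective_cache_limit(128, 70, 1000))
# 256 threads: the project side allows 70 x 256 = 17920, so a 5000 cap binds again.
print(effective_cache_limit(256, 70, 5000))
```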

[Oct 1, 2019 2:13:08 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: Is it time to revisit the 35 WU limit?

This is what you get, in highly amplified fashion, if some fleet owner blows up his configuration:

9300 World Community Grid 10/3/2019 5:20:02 PM Tasks are committed to other platforms

Zika ran dry on my sole machine (Windows)... it's idling now, and I'm darned if I'll start jumping through hoops to mitigate.

----------------------------------------
[Edit 1 times, last edit by Former Member at Oct 3, 2019 3:25:27 PM]

[Oct 3, 2019 3:24:49 PM]

hchc
Veteran Cruncher
USA
Joined: Aug 15, 2006
Post Count: 865
Status: Offline
Project Badges:


Re: Is it time to revisit the 35 WU limit?

Just curious, but what change to the code did you make before compiling? Did you just change the 1000 to 5000 in client_state.h?

----------------------------------------

i5-7500 (Kaby Lake, 4C/4T) @ 3.4 GHz
i5-4590 (Haswell, 4C/4T) @ 3.3 GHz
i5-3570 (Ivy Bridge, 4C/4T) @ 3.4 GHz

----------------------------------------
[Edit 1 times, last edit by hchc at Oct 4, 2019 4:53:15 AM]

[Oct 4, 2019 4:52:26 AM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: Is it time to revisit the 35 WU limit?

Yep, that was it. A single one-line change.

[Oct 4, 2019 4:07:22 PM]

hchc
Veteran Cruncher
USA
Joined: Aug 15, 2006
Post Count: 865
Status: Offline
Project Badges:


Re: Is it time to revisit the 35 WU limit?

@Doneske (Former User) said in 2019:

The 1000 limit has been brought up a couple of times in the BOINC message boards and I think David Anderson didn't want to change it. I compiled a BOINC client on a CentOS 7 system when I couldn't find a distributed client that would work due to library differences or libraries missing entirely. The issue then becomes keeping it updated. Admittedly, the client probably doesn't need to be updated that much but it would once in a while and if you have a significant number of hosts, that becomes a chore. If I was more familiar with module mapping from the linker, it would be worth trying to find the constant in a binary module and just zapping it to a different value. It may be worth bringing it up again as AMD is changing the landscape with the high core count EPYC, Threadripper, and Ryzen. Intel isn't far behind. I'm just wondering if the BOINC ecosystem is becoming slightly tiered in the respect that there are still many, many systems under 32 cores but there are also a growing number of high core count systems entering the environment. Maybe there needs to be a parameter that can be entered at start up that allows the client to cater to larger thread count systems. Such as --LARGE_THREAD_COUNT that would tell the client to use larger limits both on the server and client side. By using a parameter, they wouldn't have to maintain different clients. It would be off by default. Just thinking out loud.
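
To make the proposal in the quote concrete, here is a toy sketch of an opt-in start-up switch. It is not BOINC code and no such flag exists in the real client; the parser, the function names, and the 5000 figure (borrowed from the recompiled client mentioned earlier in the thread) are all assumptions for illustration.

```python
import argparse

DEFAULT_MAX_RUNNABLE_JOBS = 1000  # the hard-coded limit discussed in this thread
LARGE_MAX_RUNNABLE_JOBS = 5000    # hypothetical larger limit, not a real BOINC value


def build_parser() -> argparse.ArgumentParser:
    """Build a toy command-line parser standing in for a client start-up."""
    parser = argparse.ArgumentParser(description="Toy stand-in for a client start-up")
    parser.add_argument(
        "--LARGE_THREAD_COUNT",
        action="store_true",  # off unless the flag is given
        help="opt in to larger job limits",
    )
    return parser


def max_runnable_jobs(args: argparse.Namespace) -> int:
    """Pick the job limit based on whether the opt-in flag was passed."""
    return LARGE_MAX_RUNNABLE_JOBS if args.LARGE_THREAD_COUNT else DEFAULT_MAX_RUNNABLE_JOBS


# Without the flag the stock limit applies; with it, the larger one does.
print(max_runnable_jobs(build_parser().parse_args([])))
print(max_runnable_jobs(build_parser().parse_args(["--LARGE_THREAD_COUNT"])))
```

The point of the design is the one made in the quote: a single binary serves both small and large hosts, and nothing changes for anyone who does not pass the switch.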

I'm bumping this thread in 2024 because the GitHub issue I opened (see the link a few posts above) finally got attention, then immediately got rejected and "closed as completed."

David Anderson in 2024 is still against it, as you said in 2019. But it's even more relevant with the passage of time and increasingly powerful CPUs coming to market. Now the 16c/32t Ryzens are mainstream, which used to be Threadripper and EPYC territory. And modern Threadrippers and EPYCs are getting to 192c or 256c, so that can lead to a 512-thread machine in ONE system. With a damned 1000 limit hard-coded, it would be impossible for that one system to buffer even 0.5 days of work, and that risks long periods of running out of work and long periods of downtime. It's bull*** stubbornness.
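
A rough estimate of how long a 1000-task queue lasts as thread counts grow. This assumes every thread works at the same per-thread pace as the 16-thread example earlier in the thread (1000 WUs in 6 to 8 hours); real runtimes vary by project and CPU, and the function name is made up for illustration.

```python
def hours_of_work(cap: int, threads: int, tasks_per_thread_hour: float) -> float:
    """Hours until a queue of `cap` tasks runs dry when `threads` threads
    each finish `tasks_per_thread_hour` tasks per hour."""
    return cap / (threads * tasks_per_thread_hour)


# Per-thread pace inferred from the 16-thread example (assumed to carry over).
fast_rate = 1000 / (16 * 6)  # 1000 WUs in 6 hours
slow_rate = 1000 / (16 * 8)  # 1000 WUs in 8 hours

for threads in (16, 32, 128, 512):
    low = hours_of_work(1000, threads, fast_rate)
    high = hours_of_work(1000, threads, slow_rate)
    print(f"{threads:>3} threads: {low:.2f} to {high:.2f} hours of work in 1000 tasks")
```

Under that assumption a 512-thread host would hold well under an hour of work, far short of the 12 hours that half a day requires.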

Improve logic behind WF_MAX_RUNNABLE_JOBS = 1000

----------------------------------------

i5-7500 (Kaby Lake, 4C/4T) @ 3.4 GHz
i5-4590 (Haswell, 4C/4T) @ 3.3 GHz
i5-3570 (Ivy Bridge, 4C/4T) @ 3.4 GHz

----------------------------------------
[Edit 1 times, last edit by hchc at Mar 9, 2024 4:16:20 PM]

[Mar 9, 2024 4:15:19 PM]

thunder7
Senior Cruncher
Netherlands
Joined: Mar 6, 2013
Post Count: 241
Status: Offline
Project Badges:

180 day badge for The Clean Energy Project - Phase 2

180 day badge for Drug Search for Leishmaniasis

90 day badge for GO Fight Against Malaria

200 year badge for Mapping Cancer Markers

1 year badge for Uncovering Genome Mysteries

20 year badge for Outsmart Ebola Together

50 year badge for FightAIDS@Home - Phase 2

10 year badge for Smash Childhood Cancer

50 year badge for Microbiome Immunity Project

50 year badge for OpenPandemics - COVID-19


Re: Is it time to revisit the 35 WU limit?

The limit is in the client source code. This code is open source. Fork it if you don't like it.

[Mar 9, 2024 4:41:27 PM]

gj82854
Advanced Cruncher
Joined: Sep 26, 2022
Post Count: 122
Status: Offline
Project Badges:

10 year badge for Mapping Cancer Markers

10 year badge for Africa Rainfall Project


Re: Is it time to revisit the 35 WU limit?

Those of us who have done this in the past have found that this has become a non-trivial task for a lot of different reasons. Plus projects (PrimeGrid) are starting to blacklist "hacked" clients. Nonsensical for such a trivial change.

[Mar 10, 2024 4:07:53 PM]

bluestang
Senior Cruncher
USA
Joined: Oct 1, 2010
Post Count: 274
Status: Offline
Project Badges:

2 year badge for Human Proteome Folding - Phase 2

2 year badge for Help Fight Childhood Cancer

2 year badge for Help Cure Muscular Dystrophy - Phase 2

2 year badge for The Clean Energy Project - Phase 2

2 year badge for Computing for Clean Water

2 year badge for Drug Search for Leishmaniasis

2 year badge for GO Fight Against Malaria

2 year badge for Computing for Sustainable Water

50 year badge for Mapping Cancer Markers

5 year badge for Uncovering Genome Mysteries

50 year badge for Outsmart Ebola Together

5 year badge for FightAIDS@Home - Phase 2

50 year badge for Smash Childhood Cancer

14 day badge for Africa Rainfall Project


Re: Is it time to revisit the 35 WU limit?

David Anderson is too busy pimping Science United on the BOINC site to be bothered with improving BOINC itself.

He's proven time and time again he is a stubborn, pompous arse.

And for those that are capable of compiling their own version with a larger limit...make it available for others to easily find and get.

----------------------------------------

https://xs4s.org/index.php
https://discord.gg/ePTkyue2

[Mar 10, 2024 9:46:31 PM]

alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 1337
Status: Offline
Project Badges:

1 year badge for Human Proteome Folding - Phase 2

14 day badge for Discovering Dengue Drugs - Together

14 day badge for Nutritious Rice for the World

180 day badge for Help Fight Childhood Cancer

90 day badge for Help Cure Muscular Dystrophy - Phase 2

1 year badge for The Clean Energy Project - Phase 2

180 day badge for Computing for Clean Water

1 year badge for Drug Search for Leishmaniasis

180 day badge for GO Fight Against Malaria

14 day badge for Computing for Sustainable Water

2 year badge for Uncovering Genome Mysteries

5 year badge for Outsmart Ebola Together

10 year badge for FightAIDS@Home - Phase 2

10 year badge for Microbiome Immunity Project


Re: Is it time to revisit the 35 WU limit?

Three simple questions for all those who want larger limits; whether the second and third questions are relevant depends on the answer to the first...

1. Does the server impose a maximum limit of jobs per host or application?
2. If there's no server-side limit, is it acceptable for a single user who sets a very large client limit on a system with 120+ threads, each of which can process [say] 15 tasks a day, to build up a cache of well over 5,000 tasks because they want to have enough work to survive a three-day outage (60+ tasks per thread...)?
3. If big rig users can collect many thousands of tasks, how many big rigs would it take to swallow all the available work, leaving nothing for "normal" volunteers? (And please don't say "They could generate more work...")
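
The second and third questions can be put into numbers with a short sketch. The thread count, daily rate, and outage length come from the question itself; the size of the available work pool is not stated anywhere in this thread, so the 100,000 figure below is purely a placeholder, and both function names are made up for illustration.

```python
import math


def cache_size(threads: int, tasks_per_thread_per_day: int, outage_days: int) -> int:
    """Tasks a host would hoard to ride out an outage of `outage_days`."""
    return threads * tasks_per_thread_per_day * outage_days


def rigs_to_drain(available_tasks: int, cache_per_rig: int) -> int:
    """Number of hosts, each filling `cache_per_rig`, needed to empty the pool."""
    return math.ceil(available_tasks / cache_per_rig)


per_rig = cache_size(threads=120, tasks_per_thread_per_day=15, outage_days=3)
print("cache per big rig:", per_rig)

HYPOTHETICAL_POOL = 100_000  # placeholder only; the real pool size is unknown here
print("rigs needed to drain the pool:", rigs_to_drain(HYPOTHETICAL_POOL, per_rig))
```

Swapping in a real pool size, if the project ever publishes one, would turn the third question into a concrete number.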

Just curious as to what big-rig users think on the matter...

Cheers - Al.

P.S. I agree with the comments about D.A. :-) -- I remember the problems Richard Haselgrove and co. had trying to get acknowledgment and a fix for the problem of runaway client work fetches if max_concurrent was used.

[Mar 11, 2024 4:32:04 AM]
