Thread Status: Active
Total posts in this thread: 781
This topic has been viewed 946409 times and has 780 replies
Ian-n-Steve C.
Senior Cruncher
United States
Joined: May 15, 2020
Post Count: 180
Status: Offline
Re: OpenPandemics - GPU Stress Test

to me it shows fairly poor optimization of the application. the ideal situation is that the application uses as much of the GPU as possible to get the most work done. needing 16 CPU cores to feed a single GPU is absurdly high. most other projects use 1 or less to feed a GPU and keep high utilization the entire time. You shouldn't have to give up 16 cores (along with the power consumption that comes with that) that could otherwise be doing something more useful like crunching CPU tasks for a project without a GPU app. Look at GPUGRID or Einstein. those are how you want your app to operate. able to feed the GPU to 95+% for the entire run with only a single CPU core to keep the GPU busy. usually this means preloading more data into the GPU memory and making the GPU handle more functions.

I know it's a "first cut" for this app, but it still has a long way to go for efficiency in my opinion. we should all push for better utilization of resources for the sake of efficiency and not accept so much waste.

Uplinger addressed this a while ago. He wanted to keep the GPU work units the same as the CPU work units initially to ensure consistent results, no doubt necessary for the science.
He said he would tweak it up later.


this isn't correct. the GPU WUs are not the same as the CPU WUs. the GPU tasks have many, many more tasks prepackaged and are actually much larger than the CPU tasks. and GPU tasks cannot cross-validate with CPU tasks because of their differences.

the GPU app optimization has nothing to do with this.
----------------------------------------

EPYC 7V12 / [5] RTX A4000
EPYC 7B12 / [5] RTX 3080Ti + [2] RTX 2080Ti
EPYC 7B12 / [6] RTX 3070Ti + [2] RTX 3060
[2] EPYC 7642 / [2] RTX 2080Ti
[Apr 29, 2021 8:30:57 PM]
Ian-n-Steve C.
Senior Cruncher
United States
Joined: May 15, 2020
Post Count: 180
Status: Offline
Re: OpenPandemics - GPU Stress Test

I'm experiencing some strange behaviour after modifying the app_config file.

I forced BOINC to run up to 8 GPU workunits in parallel:

<gpu_usage>0.125</gpu_usage>
<cpu_usage>0.25</cpu_usage>


This works absolutely fine. I run both GPU and CPU workunits and my GPU and CPU are able to process that many in parallel. This obviously has a dramatic effect on throughput.
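
For anyone following along, here is a minimal sketch of what the complete app_config.xml around those two lines might look like. The app name `opng` is an assumption (check the names your own client reports, for example in client_state.xml), and the two usage values are simply the ones quoted above.

```xml
<!-- app_config.xml, placed in the World Community Grid project folder -->
<!-- Sketch only: "opng" is an assumed app name, verify it on your host -->
<app_config>
  <app>
    <name>opng</name>
    <gpu_versions>
      <!-- each task claims 1/8 of a GPU, so up to 8 tasks run per GPU -->
      <gpu_usage>0.125</gpu_usage>
      <!-- each task budgets 0.25 CPU, so 8 tasks reserve 2 cores -->
      <cpu_usage>0.25</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

These values only tell the BOINC scheduler how much to budget per task; they don't cap what the app actually uses. The client should pick the file up after a restart or after "Read config files" in the Manager.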

However the BOINC client is not able to fetch GPU workunits anymore. It tries to fetch both CPU and GPU workunits but only receives CPU workunits. Anybody who experienced the same?


Yes... I'm finding the same thing.
I've woken this morning to once again find my pc run dry of GPU work :/ It's just not fetching more GPU work.


I asked m0320174 these questions, but he never replied. so I'll ask you the same since you're having the same issue.

what are your cache settings? and what does the Event Log say during work fetch? it will usually list a reason for not requesting work. or a reason why they aren't sending you any.
----------------------------------------
[Edit 1 times, last edit by Ian-n-Steve C. at Apr 29, 2021 8:34:55 PM]
[Apr 29, 2021 8:33:22 PM]
Chooka
Cruncher
Australia
Joined: Jan 25, 2017
Post Count: 49
Status: Offline
Re: OpenPandemics - GPU Stress Test

I hope you all can greatly increase GPU units after the stress test and keep this going. I am highly tempted to go buy an overpriced card.

From the numbers I have seen, the higher-end cards don't get you much more performance. Maybe someone here with an RTX, for example, could show what they are getting.


I have an RTX 2080. I can only get to 100% GPU load if I run 16 tasks concurrently and use 16 vCPUs to support them. That takes both the GPU and the CPU to 100% virtually non-stop. I tried 12, 8, 4, 2 and 1. 16 seems to be the sweet spot, but since both are almost always at 100% I don't think I can get more out of it.


what a waste of resources.

I wouldn't put it that way. Each CPU core is then supporting another GPU work unit, which is then doing the work of around 100 CPU work units. It is as good a trade-off as you will get anywhere in the BOINC world.


to me it shows fairly poor optimization of the application. the ideal situation is that the application uses as much of the GPU as possible to get the most work done. needing 16 CPU cores to feed a single GPU is absurdly high. most other projects use 1 or less to feed a GPU and keep high utilization the entire time. You shouldn't have to give up 16 cores (along with the power consumption that comes with that) that could otherwise be doing something more useful like crunching CPU tasks for a project without a GPU app. Look at GPUGRID or Einstein. those are how you want your app to operate. able to feed the GPU to 95+% for the entire run with only a single CPU core to keep the GPU busy. usually this means preloading more data into the GPU memory and making the GPU handle more functions.

I know it's a "first cut" for this app, but it still has a long way to go for efficiency in my opinion. we should all push for better utilization of resources for the sake of efficiency and not accept so much waste.


I second this.
Einstein & Milkyway are great when it comes to GPU utilisation.
----------------------------------------


[Apr 29, 2021 8:34:01 PM]
kittyman
Advanced Cruncher
Joined: May 14, 2020
Post Count: 140
Status: Offline
Re: OpenPandemics - GPU Stress Test

I know it's a "first cut" for this app, but it still has a long way to go for efficiency in my opinion. we should all push for better utilization of resources for the sake of efficiency and not accept so much waste.

Ahem..... You don't have to 'accept' anything. And unless you are willing and able to step up to the plate and help rewrite some code, I would tone down the rhetoric just a tad. You tend to go beyond 'constructive' criticism to demeaning the manner in which this project is being run.

If you don't find the coding efforts of the staff on this project acceptable to your standards, please do feel free to sail your ship to some other shores to comment about how they are running their projects.

MEOW.
----------------------------------------

[Apr 29, 2021 8:35:09 PM]
Richard Haselgrove
Senior Cruncher
United Kingdom
Joined: Feb 19, 2021
Post Count: 360
Status: Offline
Re: OpenPandemics - GPU Stress Test

I once posted at another site:
The very best programmers design for every eventuality, error trap everything, write full and clear error messages, and provide full documentation. Their programs are self-evident and never fail, so the error messages and documentation are completely redundant. These programmers are usually unemployed, because they are too slow and too expensive.

Good programmers consider their users - all their users - and design accordingly.

Poor programmers can't see the wider picture, and only design for "people like us".
To which I can add:
Busy programmers, writing their very first GPU app in the teeth of a pandemic, concentrate on getting the damn thing to work at all. The pretty bits can be added later.

[Apr 29, 2021 8:42:44 PM]
Chooka
Cruncher
Australia
Joined: Jan 25, 2017
Post Count: 49
Status: Offline
Re: OpenPandemics - GPU Stress Test

Hi Ian n Steve. Thanks for the reply.

My cache setting in WCG is set to 2 days of work. It was at 7 days, but either way I had the same issue.
The only message I see that could be of concern is - "Not requesting tasks - Too many runnable tasks"

I've seen this on another pc also.
----------------------------------------


[Apr 29, 2021 8:43:42 PM]
Ian-n-Steve C.
Senior Cruncher
United States
Joined: May 15, 2020
Post Count: 180
Status: Offline
Re: OpenPandemics - GPU Stress Test

I know it's a "first cut" for this app, but it still has a long way to go for efficiency in my opinion. we should all push for better utilization of resources for the sake of efficiency and not accept so much waste.

Ahem..... You don't have to 'accept' anything. And unless you are willing and able to step up to the plate and help rewrite some code, I would tone down the rhetoric just a tad. You tend to go beyond 'constructive' criticism to demeaning the manner in which this project is being run.

If you don't find the coding efforts of the staff on this project acceptable to your standards, please do feel free to sail your ship to some other shores to comment about how they are running their projects.

MEOW.

I'll MEOW my way. you MEOW yours.

kthnx
----------------------------------------
[Apr 29, 2021 8:43:59 PM]
JohnDK
Advanced Cruncher
Denmark
Joined: Feb 17, 2010
Post Count: 78
Status: Offline
Re: OpenPandemics - GPU Stress Test

If it's possible to optimize the app, then of course it should be on the to do list, otherwise it seems a waste of resources.

Are there any beta tests in the pipeline?
----------------------------------------
Intel i7-6850K / 16GB / RTX 3090 / 2x RTX 3080 Ti / RTX 3070 Ti
AMD Ryzen 9 5950X / 32GB / RTX 2080 Ti
[Apr 29, 2021 8:47:10 PM]
Ian-n-Steve C.
Senior Cruncher
United States
Joined: May 15, 2020
Post Count: 180
Status: Offline
Re: OpenPandemics - GPU Stress Test

Hi Ian n Steve. Thanks for the reply.

My cache setting in WCG is set to 2 days of work. It was at 7 days, but either way I had the same issue.
The only message I see that could be of concern is - "Not requesting tasks - Too many runnable tasks"

I've seen this on another pc also.


then that's your answer. you have too many tasks from the project to be sent more. I'm guessing you're loaded up on CPU tasks.

I'm only running GPU tasks (CPU processing disabled), and I observe a 200 task limit. is your CPU+GPU equating to 200 tasks? I think this is probably from the resource share issue, unfortunately.

you could for sure work around this by running multiple clients (which is a bit of a can of worms in itself), one with only CPU work and one with only GPU work, or maybe by playing around with the resource share value between projects. there are options in BOINC to control how many tasks of each type are running at a time, but other than a cache setting (which I'm sure is shared between OPN1 and OPNG since it's the same project) there's no way to tell the project "only send me X amount of CPU tasks".
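
As a rough sketch of the "how many tasks of each type are running" option mentioned above, an app_config.xml can set max_concurrent per app. The app names `opn1` and `opng` and the limits of 8 are assumptions for illustration only. Note that this caps running tasks, not how many the project sends.

```xml
<!-- Sketch only: app names and limits are illustrative assumptions -->
<app_config>
  <app>
    <name>opn1</name>
    <!-- run at most 8 CPU tasks of this app at once -->
    <max_concurrent>8</max_concurrent>
  </app>
  <app>
    <name>opng</name>
    <!-- run at most 8 GPU tasks of this app at once -->
    <max_concurrent>8</max_concurrent>
  </app>
</app_config>
```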
----------------------------------------
[Edit 1 times, last edit by Ian-n-Steve C. at Apr 29, 2021 8:52:16 PM]
[Apr 29, 2021 8:48:42 PM]
kittyman
Advanced Cruncher
Joined: May 14, 2020
Post Count: 140
Status: Offline
Re: OpenPandemics - GPU Stress Test

I know it's a "first cut" for this app, but it still has a long way to go for efficiency in my opinion. we should all push for better utilization of resources for the sake of efficiency and not accept so much waste.

Ahem..... You don't have to 'accept' anything. And unless you are willing and able to step up to the plate and help rewrite some code, I would tone down the rhetoric just a tad. You tend to go beyond 'constructive' criticism to demeaning the manner in which this project is being run.

If you don't find the coding efforts of the staff on this project acceptable to your standards, please do feel free to sail your ship to some other shores to comment about how they are running their projects.

MEOW.

I'll MEOW my way. you MEOW yours.

kthnx

Nobody minds a little meowing......
But I think a few folks are getting tired of the caterwauling.

Meow.
----------------------------------------

[Apr 29, 2021 8:48:49 PM]