Total posts in this thread: 9
This topic has been viewed 1160 times and has 8 replies.
andgra
Senior Cruncher
Sweden
Joined: Mar 15, 2014
Post Count: 184
Cpu time vs Elapsed time - GPU work

Could some knowledgeable people elaborate a bit on this topic, please?
On some rigs these times are equal, and on some rigs they differ a lot.
What does this say? And are there things to tweak to optimize, depending on whether the times are close or far apart?
I have been playing around a bit with multiple WUs on a GPU, different CPU allocations, etc., but haven't really got my head around it.
If someone could share their insight on this topic, it would be very interesting.
----------------------------------------
/andgra



[Apr 3, 2024 12:07:20 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7716
Re: Cpu time vs Elapsed time - GPU work

BOINC runs at the lowest priority, so it gets out of the way if you are doing something else on your machine. The CPU time is the time BOINC actually spends running its job, while the elapsed time is the wall-clock time from when the job started to when it finished. If your machine is busy with other tasks much of the time, the elapsed time will be significantly greater than the CPU time.
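The two clocks described above can be seen directly with Python's standard timers. This is a toy sketch of the general CPU-time/elapsed-time distinction, not BOINC code:

```python
import time

wall_start = time.perf_counter()   # wall-clock ("elapsed") timer
cpu_start = time.process_time()    # CPU-time timer for this process

# Waiting on something external (like a thread idling while a device works):
# wall time advances, CPU time barely moves.
time.sleep(1.0)

# Actual computation: both timers advance together.
total = sum(i * i for i in range(10**6))

elapsed = time.perf_counter() - wall_start
cpu = time.process_time() - cpu_start
print(f"elapsed time: {elapsed:.2f} s")
print(f"CPU time:     {cpu:.2f} s")
```

During the sleep only the wall clock advances, so elapsed time ends up well above CPU time, just as it does for a process that spends most of its life waiting on something else.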
Hope this helps.

Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Apr 3, 2024 1:31:08 PM]
andgra
Senior Cruncher
Sweden
Joined: Mar 15, 2014
Post Count: 184
Re: Cpu time vs Elapsed time - GPU work

For CPU work I totally agree with you.
But for GPU work I'm interested in the correlation between these times. I know the CPU is "feeding" the GPU and therefore doesn't do the calculation itself. But I'd like to understand why the two times can differ so much, and whether that understanding could be used to tweak settings.
Most of the rigs are dedicated to crunching.
----------------------------------------
/andgra



[Apr 3, 2024 1:59:36 PM]
Boca Raton Community HS
Senior Cruncher
Joined: Aug 27, 2021
Post Count: 153
Re: Cpu time vs Elapsed time - GPU work

Are you referring specifically to GPU project(s) here (i.e. OPNG), or more in general? You are right, both are used, but how much of each is used, and when, depends on the project. From what I can see of the OPNG work units, which are OpenCL and not CUDA, there are a lot of gaps in the GPU processing. These could potentially come from small calculations happening on the CPU during the gaps in GPU utilization, but there is very little bus usage on the GPU during this time, so there is no massive data flow between the GPU and CPU for these tasks. Every GPU project will have a bottleneck, and determining what it is can greatly improve throughput.

Observations from OPNG work:
- The memory bandwidth of the GPU does not have a large impact on speed, and memory utilization is extremely small (the impact of memory bandwidth was tested on NVIDIA P100 GPUs).
- The CPU does not impact this work much because CPU utilization is minimal. We run OPNG work with 0.5 CPU utilization per work unit, and I still think this is entirely overkill.
- GPU utilization ("graphics") fluctuates from 100% to almost 0% very rapidly during an OPNG work unit. Because of this, there is a lot of down time on the GPU. To maximize throughput, multiple OPNG work units can be run simultaneously to try to prevent downtime for the GPU.
- How many can be run will depend on the client, but I would suggest more than 5 at the same time, based on what we have seen here. This will use the GPU to a fuller extent. We run 7x and still see fluctuations in GPU utilization (though minimized compared to running 1x). This is NOT the "magic number" for how many everyone else should run, but it works for our GPUs. When OPNG work arrives, almost all of it starts simultaneously on our systems, since the work units seem to come in "bundles" of a few at a time.

There are not a whole lot of tweaks that can be done for OPNG work besides changing the concurrent work count using app_config.xml. Also, because it is OpenCL, there is not much that can be changed on an NVIDIA GPU or in the drivers, as far as I have ever seen.
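For reference, the concurrency change is made with an app_config.xml file placed in the project's directory under the BOINC data folder. Below is a minimal sketch assuming seven tasks per GPU and 0.5 CPUs per task; the application name "opng" is a placeholder, so check the actual app name in your client's event log before using it.

```xml
<app_config>
    <app>
        <!-- "opng" is a placeholder; use the real app name from your event log -->
        <name>opng</name>
        <gpu_versions>
            <gpu_usage>0.14</gpu_usage> <!-- ~1/7 of a GPU per task, so 7 run concurrently -->
            <cpu_usage>0.5</cpu_usage>
        </gpu_versions>
    </app>
</app_config>
```

After saving, Options -> Read config files in the BOINC Manager applies it without restarting the client.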

I am definitely not an expert in any way; these are just our observations. Not sure whether this answers your question, though.
[Apr 3, 2024 2:40:26 PM]
andgra
Senior Cruncher
Sweden
Joined: Mar 15, 2014
Post Count: 184
Re: Cpu time vs Elapsed time - GPU work

Thanks for your reply, and also to Sgt.Joe earlier!
Yes, I am referring to current OPNG jobs.
It gave me some more insights and also confirmed my own findings.
Some statistical findings:
- Ryzen 3700X + GTX 980: Same CPU/elapsed time. Needs multiple simultaneous jobs to keep the GPU utilized. 0.5 CPU/job setting. Around 0.1 h/job elapsed. Windows.
- 4790 + GTX 760: Same CPU/elapsed time. Only one job at a time. 100% utilized. Slow... 0.5-1 h/job. Linux.
- 3770 + GTX 1650: Same CPU/elapsed time. Only one job at a time. Not fully utilized. Around 0.1 h/job. Linux.
- 3770 + GT 1030: Around 25-30% longer elapsed time than CPU time. Only one job at a time. Not fully utilized. Around 0.2-0.4 h/job. Windows.
- 4770 + some really old Radeon: 10-15x elapsed vs CPU time. Only one job at a time. Not fully utilized. Around 0.6-1.5 h/job. Windows.
- 3217U with integrated GPU: 15x elapsed vs CPU time. Only one job at a time. Not fully utilized. Around 1.5-3 h/job. Windows.
This will have to do for statistics. :)
I guess the only way is to play around with multiple jobs and CPU share to find the optimum for each rig, as they are quite different.
----------------------------------------
/andgra



[Apr 3, 2024 3:27:58 PM]
Boca Raton Community HS
Senior Cruncher
Joined: Aug 27, 2021
Post Count: 153
Re: Cpu time vs Elapsed time - GPU work

[quoting andgra's post above]


Agreed, play around with it. Definitely watch the actual "wall clock" time rather than just what is reported.
[Apr 3, 2024 4:21:14 PM]
Paul Schlaffer
Senior Cruncher
USA
Joined: Jun 12, 2005
Post Count: 251
Re: Cpu time vs Elapsed time - GPU work

Since you're just referring to OPNG, those tweaks only become relevant when you have a steady flow of incoming WUs to keep the GPUs busy. My workstation has dual workstation-class GPUs, and I haven't seen that in a very, very, very long time. Therefore, the juice isn't worth the squeeze, as you're not increasing the amount of work successfully completed.
----------------------------------------

“Where an excess of power prevails, property of no sort is duly respected. No man is safe in his opinions, his person, his faculties, or his possessions.” – James Madison (1792)
[Apr 3, 2024 5:43:44 PM]
Boca Raton Community HS
Senior Cruncher
Joined: Aug 27, 2021
Post Count: 153
Re: Cpu time vs Elapsed time - GPU work

[quoting Paul Schlaffer's post above]



No doubt. I keep these settings in the "hopes" that someday, sometime, OPNG work will flow again. Maybe they are unfounded... but I still keep them.

We also only get packets of these work units every once in a while.
[Apr 3, 2024 6:16:59 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 995
Re: Cpu time vs Elapsed time - GPU work

Harking back to the original post, and noting andgra's subsequent responses, it may be worth noting that NVIDIA's [sub-optimal] OpenCL implementation (which keeps a CPU core spinning while it waits on the GPU) will result in CPU time close to elapsed time, unlike AMD's drivers...
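Why a busy-waiting driver makes CPU time track elapsed time can be sketched with two wait strategies (plain Python standing in for the driver; this is an illustration, not actual OpenCL code):

```python
import time

def busy_wait(seconds):
    """Spin-poll until a deadline, as a busy-waiting driver does:
    CPU time accrues for the entire wait."""
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        pass

def blocking_wait(seconds):
    """Yield the CPU until the timer fires, as a blocking driver does:
    almost no CPU time accrues."""
    time.sleep(seconds)

results = {}
for wait in (busy_wait, blocking_wait):
    cpu0, wall0 = time.process_time(), time.perf_counter()
    wait(0.5)  # stand-in for "waiting on the GPU to finish a kernel"
    results[wait.__name__] = (time.perf_counter() - wall0,
                              time.process_time() - cpu0)

for name, (wall, cpu) in results.items():
    print(f"{name}: wall {wall:.2f} s, CPU {cpu:.2f} s")
```

Under the spin-poll, CPU time matches wall time almost exactly; under the blocking wait it stays near zero, which mirrors the NVIDIA-versus-AMD difference described above.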

Also, for typical OPNG tasks GPU idle time falls into two categories: set-up/wrap-up and inter-job. Given that set-up (running AutoGrid) typically takes seconds rather than minutes, and inter-job activity and wrap-up times are both very short, there's not a lot of spare capacity to pick up unless one has a far better GPU than mine (GTX 1050 Ti and GTX 1660 Ti). I found [long ago] that running "two at once" on the 1650 gave a small but not very significant improvement in throughput (at the expense of a constantly howling GPU fan!), so I gave up!

Cheers - Al.

P.S. One of the Einstein projects (BRP7 [Meerkat]) has recently switched from an OpenCL application to a CUDA application for NVIDIA. CPU time is now typically 35 to 40 seconds for either GPU, with elapsed times around 1000 seconds for the 1660 and 2100 seconds for the 1050 -- the CUDA app is only about 5 to 10% faster than its OpenCL predecessor, though!
[Apr 3, 2024 7:05:19 PM]