World Community Grid Forums
Thread Status: Active | Total posts in this thread: 9
----------------------------------------
andgra
Senior Cruncher | Sweden | Joined: Mar 15, 2014 | Post Count: 184
Could some knowledgeable people elaborate a bit on this topic, please?

On some rigs the CPU time and the elapsed time are equal, and on some rigs they differ a lot. What does this say? And are there things to tweak for better throughput, depending on whether the times are close together or far apart? I have been playing around a bit with multiple WUs on a GPU, different CPU allocations, etc., but haven't really got my head around it. If someone could share their insight on this topic, it would be very interesting.
/andgra
----------------------------------------
Sgt.Joe
Ace Cruncher | USA | Joined: Jul 4, 2006 | Post Count: 7716
BOINC runs at the lowest priority, so it gets out of the way if you are doing something else on your machine. The CPU time is the time the processor actually spent executing the job, while the elapsed time is the wall-clock time from when the job started to when it finished. If your machine is busy with other tasks much of the time, the elapsed time will be significantly greater than the CPU time.
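For reference, a quick way to watch both clocks on running tasks is boinccmd. Below is a minimal Python sketch, assuming a local client with boinccmd on the PATH; the field labels ("current CPU time", "elapsed task time") are taken from typical --get_tasks output and may vary between client versions:

```python
# Compare CPU time against elapsed (wall-clock) time for active BOINC tasks
# by parsing `boinccmd --get_tasks` output. Field labels are assumptions
# based on typical client output and may differ between versions.
import subprocess

out = subprocess.run(["boinccmd", "--get_tasks"],
                     capture_output=True, text=True, check=True).stdout

name, cpu = None, None
for raw in out.splitlines():
    line = raw.strip()
    if line.startswith("name:"):
        name, cpu = line.split(":", 1)[1].strip(), None
    elif line.startswith("current CPU time:"):
        cpu = float(line.split(":", 1)[1])
    elif line.startswith("elapsed task time:") and cpu is not None:
        elapsed = float(line.split(":", 1)[1])
        if elapsed > 0:
            # A ratio near 100% means the task kept a CPU core busy
            # for almost its entire run.
            print(f"{name}: CPU/elapsed = {cpu / elapsed:.0%}")
```

A ratio close to 100% on a lightly loaded machine is what you would expect for CPU work (or for busy-waiting GPU work, as discussed later in this thread).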
Hope this helps. Cheers
Sgt. Joe
*Minnesota Crunchers*
----------------------------------------
andgra
Senior Cruncher | Sweden | Joined: Mar 15, 2014 | Post Count: 184
For CPU work I totally agree with you.

But for GPU work I'm interested in the correlation between these times. I know the CPU is "feeding" the GPU, so the CPU isn't doing the calculation itself. What I don't understand is how the two times can differ so much, and whether understanding why would let me tweak settings. Most of the rigs are dedicated to crunching.
/andgra
----------------------------------------
Boca Raton Community HS
Senior Cruncher | Joined: Aug 27, 2021 | Post Count: 153
Are you referring specifically to GPU project(s) on here (aka OPNG), or more in general? You are right, both are used, but how much of each is used, and when, depends on the project. From what I can see of the OPNG work units, which are OpenCL and not CUDA, there are a LOT of gaps in the GPU processing. These could potentially come from a small calculation happening on the CPU during the gaps in GPU utilization, but there is very little bus usage on the GPU during this, so there is no massive data flow between the GPU and CPU for these tasks. Every GPU project will have a bottleneck, and determining what that is can greatly improve throughput.
Observations from OPNG work:

- The memory bandwidth of the GPU does not have a large impact on speed, and memory utilization is extremely small (impact of memory bandwidth tested on NVIDIA P100 GPUs).
- The CPU does not impact this work much because CPU utilization is minimal. We run OPNG work with 0.5 CPU per work unit, and I still think this is entirely overkill.
- GPU utilization ("graphics") fluctuates from 100% to almost 0% very rapidly during an OPNG work unit. Because of this, there is a LOT of down time on the GPU. To maximize throughput, multiple OPNG work units can be run simultaneously to try to prevent downtime for the GPU.
- How many can be run will depend on the client, but I would suggest more than 5 at the same time based on what we have seen here; this will use the GPU to a fuller extent. We run 7x and still see fluctuations in GPU utilization (though minimized compared to running 1x). This is NOT the "magic number" for how many everyone else should run, but it works for our GPUs.

When OPNG work arrives, almost all of it starts simultaneously on our systems, since tasks seem to come in "bundles" of a few work units. There are not a whole lot of tweaks that can be done for OPNG work besides changing the concurrent-work multiplier using an app_config.xml (see the sketch below). Also, because it is OpenCL, there is not much that can be changed on an NVIDIA GPU or in the drivers that I have ever seen. I am definitely not an expert in any way; these are just our observations. Not sure if this answers your question, though.
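For anyone wanting to try the concurrency change described above, here is a minimal app_config.xml sketch. The short app name (opng here) is an assumption; verify it in client_state.xml for your client, and place the file in the World Community Grid folder under projects/ before telling the client to re-read its config files:

```xml
<!-- app_config.xml: run up to 7 OPNG tasks per GPU, 0.5 CPUs each.
     The app name "opng" is an assumption; confirm the short name in
     client_state.xml before using this. -->
<app_config>
    <app>
        <name>opng</name>
        <gpu_versions>
            <!-- ~1/7 of a GPU per task, so the scheduler starts 7 at once -->
            <gpu_usage>0.14</gpu_usage>
            <!-- reserve half a CPU core per task, matching the 0.5 above -->
            <cpu_usage>0.5</cpu_usage>
        </gpu_versions>
    </app>
</app_config>
```

With gpu_usage at 0.14, 1/0.14 rounds down to 7 concurrent tasks per GPU; use the manager's "Read config files" option (or restart the client) to apply the change.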
----------------------------------------
andgra
Senior Cruncher | Sweden | Joined: Mar 15, 2014 | Post Count: 184
Thanx for your reply, and also to Sgt.Joe earlier!

Yes, I am referring to current OPNG jobs. That gave me some more insight and also confirmed some of my own findings. Some statistical findings:

- Ryzen 3700X + GTX 980: same CPU/elapsed time. Needs multiple simultaneous jobs to keep the GPU utilized. 0.5 CPU/job setting. Around 0.1 h/job elapsed. Windows.
- 4790 + GTX 760: same CPU/elapsed time. Only one job at a time. 100% utilized. Slow... 0.5-1 h/job. Linux.
- 3770 + GTX 1650: same CPU/elapsed time. Only one job at a time. Not fully utilized. Around 0.1 h/job. Linux.
- 3770 + GT 1030: around 25-30% longer elapsed time than CPU time. Only one job at a time. Not fully utilized. Around 0.2-0.4 h/job. Windows.
- 4770 + some really old Radeon: elapsed time 10-15 times the CPU time. Only one job at a time. Not fully utilized. Around 0.6-1.5 h/job. Windows.
- 3217U with integrated GPU: elapsed time 15 times the CPU time. Only one job at a time. Not fully utilized. Around 1.5-3 h/job. Windows.

This will have to do for statistics. I guess the only way is to play around with multiple jobs and CPU share to find the optimum for each rig, as they are quite different; see the throughput sketch below.
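Since per-job elapsed time rises when jobs share a GPU, throughput (jobs per hour) is the fairer yardstick when playing around like this. A small illustrative Python sketch; the numbers are hypothetical, loosely modeled on the 3700X + GTX 980 case above:

```python
# Throughput comparison: jobs per hour = concurrent jobs / hours per job.
# All figures below are hypothetical examples, not measurements.
def jobs_per_hour(concurrent: int, hours_per_job: float) -> float:
    return concurrent / hours_per_job

single = jobs_per_hour(1, 0.10)  # one job at a time, 0.1 h each   -> 10.0 jobs/h
multi = jobs_per_hour(7, 0.55)   # 7 at once, each slowed to 0.55 h -> ~12.7 jobs/h
print(f"1x: {single:.1f} jobs/h, 7x: {multi:.1f} jobs/h")
```

If the multi-job figure comes out lower than the single-job one, the extra concurrency is hurting rather than helping on that rig.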
/andgra
----------------------------------------
Boca Raton Community HS
Senior Cruncher | Joined: Aug 27, 2021 | Post Count: 153
andgra wrote: I guess the only way is to play around with multiple jobs and CPU share to find the optimum for each rig, as they are quite different.

Agreed: play around with it. Definitely watch the "wall clock time" versus anything that is reported.
----------------------------------------
Paul Schlaffer
Senior Cruncher | USA | Joined: Jun 12, 2005 | Post Count: 251
Since you're just referring to OPNG, those tweaks only become relevant when you have a steady flow of incoming WUs to keep the GPUs busy. My workstation has dual workstation-class GPU cards, and I haven't seen that in a very, very, very long time. Therefore, the juice isn't worth the squeeze, as you're not increasing the work successfully completed.
“Where an excess of power prevails, property of no sort is duly respected. No man is safe in his opinions, his person, his faculties, or his possessions.” – James Madison (1792)
----------------------------------------
Boca Raton Community HS
Senior Cruncher | Joined: Aug 27, 2021 | Post Count: 153
Paul Schlaffer wrote: Those tweaks only become relevant when you have a steady flow of incoming WUs to keep the GPUs busy... the juice isn't worth the squeeze.

No doubt. I keep these settings in the "hopes" that someday, sometime, OPNG work will flow again. Maybe they are unfounded... but I still keep them. We also only get packets of these work units every once in a while.
----------------------------------------
alanb1951
Veteran Cruncher | Joined: Jan 20, 2006 | Post Count: 995
Harking back to the original post, and noting andgra's subsequent responses, it may be worth noting that NVIDIA's [sub-optimal] OpenCL implementation busy-waits on the CPU while the GPU computes, which is why CPU time comes out close to elapsed time, unlike with AMD's drivers...
Also, for typical OPNG tasks, GPU idle time falls into two categories: set-up/wrap-up and inter-job. Given that set-up (running AutoGrid) typically takes seconds rather than minutes, and inter-job activity and wrap-up times are both very short, there's not a lot of spare capacity to pick up unless one has a far better GPU than mine (GTX 1050 Ti and GTX 1660 Ti). I found [long ago] that running "two at once" on the 1660 gave a small but not very significant improvement in throughput (at the expense of a constantly howling GPU fan!), so I gave up!

Cheers - Al.

P.S. One of the Einstein@Home projects (BRP7 [Meerkat]) has recently switched from an OpenCL application to a CUDA application for NVIDIA. CPU time is now typically 35 to 40 seconds on either GPU, with elapsed times around 1000 seconds for the 1660 and 2100 seconds for the 1050; the CUDA app is only about 5 to 10% faster than its OpenCL predecessor, though!
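Taking the P.S. figures at face value (CPU time 35 to 40 s, call it 37.5 s), a quick ratio check shows how different the CUDA app's CPU footprint is from the near-100% CPU/elapsed ratio of NVIDIA OpenCL tasks:

```python
# CPU-time-to-elapsed-time ratios from the BRP7 CUDA figures quoted above.
# A low ratio indicates the CPU is not busy-waiting while the GPU works.
for gpu, cpu_s, elapsed_s in [("GTX 1660 Ti", 37.5, 1000.0),
                              ("GTX 1050 Ti", 37.5, 2100.0)]:
    print(f"{gpu}: CPU/elapsed = {cpu_s / elapsed_s:.1%}")
# -> roughly 3.8% and 1.8%, versus ~100% under the OpenCL app
```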