DrMason
Senior Cruncher
Joined: Mar 16, 2007
Post Count: 153
Relative Efficiency of Various GPUs on OpenPandemics - GPU

I thought I would start compiling a list of how quickly various GPUs finish the new OpenPandemics GPU workunits (OPNG), along with some baseline comparisons between various kinds of cards. I would also caution against using online databases to compare cards for the purposes of OPNG. While I only have about 160 data points (so these are very much preliminary figures), I thought we could start cataloging to see whether any cards are a particularly good value.

Baseline assumptions: All of my cards have been computing with stock parameters: no overclocking, 1 CPU core assigned per OPNG workunit, and 1 workunit assigned per GPU. That is to say, I have not yet made any changes to <gpu_versions> in my app_config.xml or cc_config.xml files, and I am assuming my wingmen have not either. I am also assuming the time reported in the workunit is an accurate measure of how long the GPU spent crunching it. Finally, I assume that computation time increases linearly with the number of "jobs" in each OPNG unit: if a GPU averages 10 seconds per "job", then a workunit with 50 "jobs" should take 500 seconds.
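For reference, those stock settings are equivalent to having no app_config.xml at all, or to a minimal sketch like the following. I am assuming the GPU app's short name is opng; confirm it in your client_state.xml before copying this anywhere.

<app_config>
    <app>
        <name>opng</name> <!-- assumed short name for OpenPandemics - GPU -->
        <gpu_versions>
            <gpu_usage>1.0</gpu_usage> <!-- 1 workunit per GPU (the default) -->
            <cpu_usage>1.0</cpu_usage> <!-- 1 full CPU core per GPU workunit -->
        </gpu_versions>
    </app>
</app_config>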

That said:
I have a Quadro M2000, a GT 1030, a GTX 1650, and an RTX 3080. The average time per "job" of each of these cards is:
- M2000: 13.04 seconds / job (50 data points)
- GT 1030: 15.84 seconds / job (14 data points)
- GTX 1650: 7.99 seconds / job (19 data points)
- RTX 3080: 2.45 seconds / job (15 data points)

I found that a number of the OPNG units crunched on one of my cards also had wingmen who all ran a single card model on that same unit. While I can't control for RAM, CPU, storage, etc., these pairings still let us see how cards stack up against each other. Provided the factors I can't control for don't majorly affect the numbers, I can conclude from these comparisons:
- For every job that a Quadro M2000 crunches, a GTX 1070 can crunch 1.48 jobs (10 comparisons).
- For every job that a Quadro M2000 crunches, a GTX 1080 Ti can crunch 1.9 jobs (5 comparisons).
- For every job that a GT 1030 crunches, a 2070 Super can crunch 5.79 jobs (9 comparisons).
- For every job that a GTX 1650 crunches, a 2070 Super can crunch 2.4 jobs (4 comparisons).
- For every job that a GTX 1050 Ti crunches, an RTX 3080 can crunch 4.66 jobs (2 comparisons).
- For every job that a GTX 970 crunches, an RTX 3080 can crunch 3.01 jobs (2 comparisons).
- For every job that a GTX 1650 crunches, an RTX 3080 can crunch 3.17 jobs (4 comparisons).
- For every job that a Quadro K620 crunches, an RTX 3080 can crunch 15.31 jobs (4 comparisons).

Finally, when comparing cards using a site like TechPowerUp, I would caution against taking their rankings as gospel. Sure, general trends will hold: older cards will be slower and newer cards will be faster. But the size of the gap may not match what those rankings imply. For example, TechPowerUp lists the GT 1030 as being around 7% as powerful as an RTX 3080. With a 3080 taking an average of 2.45 seconds/job, TechPowerUp would predict that a GT 1030 would take between 30 and 35 seconds/job (2.45 / 0.07 is roughly 35). But across my 14 data points, a GT 1030 only averages 15.84 seconds/job. And I have direct comparisons between a GT 1030 and a 2070 Super where the 2070 Super takes about 2.8 seconds/job (right at the overall average I've observed) and the GT 1030 takes 14.89 seconds/job.

I am also surprised at how well 1660s, 1650 Supers, 1660 Supers, and 1660 Tis fare compared to powerful older cards. They all seem to outperform GTX 1080s and 1080 Tis: the 1080 Ti averages around 6.2 seconds/job, while the newer cards seem to average around 4.3 to 5 seconds/job. And speaking of newer cards, they all seem to be closely grouped, despite TechPowerUp showing a significant drop-off in power between them. For example:
- 3080: 2.45 seconds/job (15 data points)
- 3070: 2.65 seconds/job (1 data point, ~9% slower instead of 23%)
- 2080 Super: 2.69 seconds/job (2 data points, ~9% slower instead of 31%)
- 3060 Ti: 2.79 seconds/job (1 data point, ~10% slower instead of 30%)
- 2070 Super: 2.88 seconds/job (15 data points, ~15% slower instead of 39%)

What about you guys? Have you noticed any patterns? Have you observed different numbers? What do we think makes the most difference in lowering the seconds/job? Memory bandwidth? CUDA cores / stream processors? Clock frequency? Next, I intend to start testing how modifying <gpu_versions> affects these numbers.
[Edit 1 times, last edit by DrMason at Apr 7, 2021 7:41:05 AM]
[Apr 7, 2021 7:35:08 AM]
BladeD
Ace Cruncher
USA
Joined: Nov 17, 2004
Post Count: 28976
Re: Relative Efficiency of Various GPUs on OpenPandemics - GPU

I think most users only see how long it takes a WU to run.
----------------------------------------
[Apr 7, 2021 8:05:58 AM]
gibbcorp
Advanced Cruncher
Joined: Nov 29, 2005
Post Count: 77
Re: Relative Efficiency of Various GPUs on OpenPandemics - GPU

My ATI/AMD 6870 takes about 50 minutes to 1 hour 10 minutes per work unit :)
It doesn't seem to be using much of the GPU, though. Does anyone know if I can run more than 1 work unit on this card? Thanks!
[Apr 7, 2021 1:38:01 PM]
Keith Myers
Senior Cruncher
USA
Joined: Apr 6, 2021
Post Count: 193
Re: Relative Efficiency of Various GPUs on OpenPandemics - GPU

The application is poorly designed, with a lot of idle dwell time, so GPU utilization is very low. The card spends 50% of its time idle, then runs at 100% usage to crunch a job, then sits idle again while it loads the next job. And repeat.

You should be able to run 2x or 3x tasks per card to decrease the idle dwell time and increase overall GPU utilization.
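Something along these lines in an app_config.xml (in the projects/www.worldcommunitygrid.org directory) should run 2x per card. This is just a sketch; the opng app name is an assumption you should check against client_state.xml:

<app_config>
    <app>
        <name>opng</name> <!-- assumed app name; verify in client_state.xml -->
        <gpu_versions>
            <gpu_usage>0.5</gpu_usage> <!-- each task claims half a GPU, so 2 run at once -->
            <cpu_usage>1.0</cpu_usage> <!-- keep a full CPU core feeding each GPU task -->
        </gpu_versions>
    </app>
</app_config>

Use 0.33 for 3x, then have the client re-read config files (Options > Read config files in the BOINC Manager).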
----------------------------------------

A proud member of the OFA (Old Farts Association)
[Apr 8, 2021 2:53:52 PM]
goben_2003
Advanced Cruncher
Joined: Jun 16, 2006
Post Count: 146
Re: Relative Efficiency of Various GPUs on OpenPandemics - GPU

I would suggest monitoring first, then adding 1 at a time until average GPU utilization is at least 90%. I am guessing it would be 2 units at a time on the AMD 6870.

Example from my RTX 2060M:
I used HWiNFO. Average utilization with 1 unit at a time is 26%. I increased the number of units 1 at a time, and the optimal number of simultaneous workunits seems to be 4.
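In app_config.xml terms, 4 at a time corresponds to a gpu_usage of 0.25, i.e. 1 divided by the number of simultaneous tasks. A sketch, with the same caveat that the opng app name is assumed:

<app_config>
    <app>
        <name>opng</name> <!-- assumed app name; verify in client_state.xml -->
        <gpu_versions>
            <gpu_usage>0.25</gpu_usage> <!-- a quarter of a GPU per task = 4 tasks at once -->
            <cpu_usage>1.0</cpu_usage>
        </gpu_versions>
    </app>
</app_config>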

However, since there is currently a shortage of workunits, I have it set to 1 at a time and leave Folding@home running, so they take 3-5 minutes each while sharing the GPU with Folding@home.

Re: design
IIRC, each GPU workunit is just a bunch (20x?) of CPU jobs packaged together. That has the added downside of thrashing the drive: it was writing more than I am comfortable with for my consumer-grade SSD when I had 4 units running at a time. So I set up RAM drives and run BOINC, Folding@home, and another temp-file-intensive app off of them.
[Apr 8, 2021 5:02:12 PM]
Ian-n-Steve C.
Senior Cruncher
United States
Joined: May 15, 2020
Post Count: 180
Re: Relative Efficiency of Various GPUs on OpenPandemics - GPU

It was writing more than I am comfortable with for my consumer-grade SSD when I had 4 units running at a time. So I set up RAM drives and run BOINC, Folding@home, and another temp-file-intensive app off of them.


This seems rather dangerous. Since BOINC doesn't really treat these as temp files, what happens if you have a sudden power loss? Wouldn't you then lose all the WUs on that system?
----------------------------------------

EPYC 7V12 / [5] RTX A4000
EPYC 7B12 / [5] RTX 3080Ti + [2] RTX 2080Ti
EPYC 7B12 / [6] RTX 3070Ti + [2] RTX 3060
[2] EPYC 7642 / [2] RTX 2080Ti
[Apr 8, 2021 6:31:20 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7745
Re: Relative Efficiency of Various GPUs on OpenPandemics - GPU

It was writing more than I am comfortable with for my consumer-grade SSD when I had 4 units running at a time. So I set up RAM drives and run BOINC, Folding@home, and another temp-file-intensive app off of them.


This seems rather dangerous. Since BOINC doesn't really treat these as temp files, what happens if you have a sudden power loss? Wouldn't you then lose all the WUs on that system?

Yes, you probably would, unless you have your system on a UPS. Do not fret: they will be re-issued to someone else and crunched.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Apr 9, 2021 3:19:11 AM]
goben_2003
Advanced Cruncher
Joined: Jun 16, 2006
Post Count: 146
Re: Relative Efficiency of Various GPUs on OpenPandemics - GPU

It was writing more than I am comfortable with for my consumer-grade SSD when I had 4 units running at a time. So I set up RAM drives and run BOINC, Folding@home, and another temp-file-intensive app off of them.


This seems rather dangerous. Since BOINC doesn't really treat these as temp files, what happens if you have a sudden power loss? Wouldn't you then lose all the WUs on that system?

Yes, you probably would, unless you have your system on a UPS. Do not fret: they will be re-issued to someone else and crunched.
Cheers

Or, as in this case, if it is a laptop. The RTX 2060M is the mobile part, which has the same number of cores as, and close to the performance of, the non-mobile card. MSI's cooling system is fantastic, though: I can and do run it 24x7 without overheating or thermal throttling, and the GPU fan never even goes to max speed. :D
[Apr 9, 2021 9:47:09 AM]
Ian-n-Steve C.
Senior Cruncher
United States
Joined: May 15, 2020
Post Count: 180
Re: Relative Efficiency of Various GPUs on OpenPandemics - GPU

Not even just a power loss: even a controlled shutdown without first moving all the files back to disk would erase them. Seems like more of a hassle than it's worth.
----------------------------------------

EPYC 7V12 / [5] RTX A4000
EPYC 7B12 / [5] RTX 3080Ti + [2] RTX 2080Ti
EPYC 7B12 / [6] RTX 3070Ti + [2] RTX 3060
[2] EPYC 7642 / [2] RTX 2080Ti
[Apr 9, 2021 2:14:03 PM]
goben_2003
Advanced Cruncher
Joined: Jun 16, 2006
Post Count: 146
Re: Relative Efficiency of Various GPUs on OpenPandemics - GPU

Not even just a power loss: even a controlled shutdown without first moving all the files back to disk would erase them. Seems like more of a hassle than it's worth.

Copying to and from disk is automated on startup, shutdown, and restart, so there is no hassle beyond the initial setup. :)
I've also noticed that checkpoints go a bit faster with BOINC (WCG) and FAH.
[Apr 9, 2021 5:59:32 PM]