World Community Grid Forums
Category: Completed Research | Forum: Help Conquer Cancer | Thread: GPU Optimisations
Thread Status: Active | Total posts in this thread: 198
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
Quote: My first HD 7770 (on the E8400 CPU) is doing a work unit in 4 minutes 54 seconds. The second HD 7770 (on a Core i7-3770) is doing a work unit in 4 minutes 40 seconds. I don't think you are gaining much.

Approximately 30% more returned results per day.

andzgrid, Post #743
Crystal Pellet
Veteran Cruncher | Joined: May 21, 2008 | Post Count: 1316 | Status: Offline
Quote: All 24 went w/o problem in approx. 1h10m each

It's about how much CPU you want to spend. Spending 3 full cores of my non-overclocked i7 2600 and running 3 tasks concurrently on my non-overclocked HD 7770, I make 31 tasks in those 70 minutes, with 5 cores left for other CPU tasks.
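For reference, a setup like the one described here (3 concurrent GPU tasks, one full CPU core each) would be expressed in BOINC's app_config.xml roughly as follows. This is a hedged sketch: the app name hcc1 is taken from a later post in this thread, and the exact fractions are assumptions.

```xml
<!-- Sketch: 3 concurrent HCC GPU tasks on one card.
     BOINC fits tasks per GPU while the gpu_usage fractions sum to <= 1,
     so 0.33 allows 3 tasks per GPU; cpu_usage 1 reserves one full
     CPU core for each running GPU task (3 cores total here). -->
<app_config>
  <app>
    <name>hcc1</name>
    <gpu_versions>
      <gpu_usage>0.33</gpu_usage>
      <cpu_usage>1</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```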
coolstream
Senior Cruncher | SCOTLAND | Joined: Nov 8, 2005 | Post Count: 475 | Status: Offline
Quote: 23 of 32 errored at the end of crunching (exceeded maximum elapsed time, which is probably set to 1.5 hours). App_info worked fine even if elapsed time for some tasks was approx. 1.6 hours. Now testing 24. ... All 24 went w/o problem in approx. 1h10m (≈1.17 hours) each (ATM 2 Valid, 22 PVal). Good setting for shrubbing during the night; tomorrow I will try 28 concurrent. Cheers and NI!

Unless things have changed or I have misinterpreted the information, the maximum elapsed time is more like 10 hours: https://secure.worldcommunitygrid.org/forums/wcg/viewpostinthread?post=401797

Crunching in memory of my Mum PEGGY, cousin ROPPA and Aunt AUDREY.
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
Quote: 23 of 32 errored at the end of crunching (exceeded maximum elapsed time, which is probably set to 1.5 hours). App_info worked fine even if elapsed time for some tasks was approx. 1.6 hours. Now testing 24. ... All 24 went w/o problem in approx. 1h10m (≈1.17 hours) each (ATM 2 Valid, 22 PVal). Good setting for shrubbing during the night; tomorrow I will try 28 concurrent. Cheers and NI!

Quote: Unless things have changed or I have misinterpreted the information, the maximum elapsed time is more like 10 hours: https://secure.worldcommunitygrid.org/forums/wcg/viewpostinthread?post=401797

It's probably a painful consequence of the single-pool sourcing of work for GPU+CPU. The CPU versions have a time-out of [guess] 5 times the original runtime estimate. If a task with the same header info is then assigned to a GPU...
branjo
Master Cruncher | Slovakia | Joined: Jun 29, 2012 | Post Count: 1892 | Status: Offline
Quote: My first HD 7770 (on the E8400 CPU) is doing a work unit in 4 minutes 54 seconds. The second HD 7770 (on a Core i7-3770) is doing a work unit in 4 minutes 40 seconds. I don't think you are gaining much. Approximately 30% more returned results per day.

If I were doing this for money, I would definitely make a thorough analysis of my expenses as well as my production. Since I am not, I am satisfied with a rough analysis based on small-sample observations and some logic.

1. Noise is a really subjective category, and what I perceive as "no increase" could be significant for somebody else. So I am going to agree with you that this part of my claim might be misleading. Anyway, I repeat that, in my view, there is no significant fan noise increase.

2. When I am running 32 concurrent GPU tasks, CPU utilization is well south of 100% (thus less power consumption). If I run only one GPU task + 7 CPU tasks, the CPU utilization on those 7 cores is 99-100% (thus higher power consumption). That is why I assume (and claim) that system power consumption does not increase significantly (if at all) when I run 32x GPU WUs compared to 1x GPU + 7x CPU.

Cheers and NI!

Crunching@Home since January 13 2000. Shrubbing@Home since January 5 2006
branjo
Master Cruncher | Slovakia | Joined: Jun 29, 2012 | Post Count: 1892 | Status: Offline
Quote: All 24 went w/o problem in approx. 1h10m each ... It's about how much CPU you want to spend. Spending 3 full cores of my non-overclocked i7 2600 and running 3 tasks concurrently on my non-overclocked HD 7770, I make 31 tasks in those 70 minutes, with 5 cores left for other CPU tasks.

This is exactly what I expect from the 7770: to be significantly more productive than my 7750.

Cheers

Crunching@Home since January 13 2000. Shrubbing@Home since January 5 2006
branjo
Master Cruncher | Slovakia | Joined: Jun 29, 2012 | Post Count: 1892 | Status: Offline
Quote: 23 of 32 errored at the end of crunching (exceeded maximum elapsed time, which is probably set to 1.5 hours). App_info worked fine even if elapsed time for some tasks was approx. 1.6 hours. Now testing 24. ... All 24 went w/o problem in approx. 1h10m (≈1.17 hours) each (ATM 2 Valid, 22 PVal). Good setting for shrubbing during the night; tomorrow I will try 28 concurrent. Cheers and NI!

Quote: Unless things have changed or I have misinterpreted the information, the maximum elapsed time is more like 10 hours: https://secure.worldcommunitygrid.org/forums/wcg/viewpostinthread?post=401797

Maybe it is because of app_config, I don't know. But all 24 concurrently crunched tasks finished in less than the evil 1.5 hours, thus without errors.

Crunching@Home since January 13 2000. Shrubbing@Home since January 5 2006
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
I know, I know - Nvidia cards are not as well suited as the 7xxx series.

BUT I've got 2 x GTX 690's on the crunch, and want to optimize their performance. They are in an SR-2 board with dual 5650 CPUs. Disabled SLI, as that was erroring everything out. GPU idle time was and still is too high: everything stops on GPU WUs far too often at the halfway point and when finishing off each WU. 8 GPU WUs running simultaneously = stopping frequently.

The point of my post is to ask if anyone knows how much CPU work is involved with each GPU WU. Is it maybe that the GPU part of the WU is finishing and having to wait for the CPU bit to catch up? I've got 16 CPU cores running at the same time, so I could cut them back and allocate 1.25 or maybe 1.5 cores to each GPU WU. Something like this:

<app_config>
  <app>
    <name>hcc1</name>
    <max_concurrent>24</max_concurrent>
    <gpu_versions>
      <gpu_usage>.5</gpu_usage>
      <cpu_usage>1.25</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

Currently got 24 CPU threads: 2 WUs per GPU with 1 CPU core each, leaving 16 CPU cores crunching non-GPU. Any thoughts on this idea of more than 1 core per GPU WU? Been tried before? Or is the CPU portion small and I've just got to put up with constant idle GPU time?
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
Hi,

Not sure you fully understand the use of these two:

<gpu_usage>.5</gpu_usage>
<cpu_usage>1.25</cpu_usage>

You've set the GPU to run 2 at a time and the CPU allocation *per* GPU job to 1.25. The highest any WCG project uses, being single-threaded, is 1 CPU processor thread, so reduce that value to 1.00 at least. But, given the 5650 CPUs, you can try <cpu_usage>0.5</cpu_usage> and see if throughput per hour increases, meaning if one task is in its GPU phase, the other task can use the CPU, in alternation. Then later try <gpu_usage>.25</gpu_usage>, meaning 4 GPU tasks, using a combined 4 * .5 = 2 CPU cores together. Increment that in small steps to find the biggest hourly production, without the tasks going error/invalid.

edit: To add, with 2 such cards and CPUs [16 processor threads] and a <max_concurrent> of 24, you could theoretically pump the <gpu_usage> way down to .083333. Any spare instance that is not being run on the GPU will run as a CPU task, provided you've selected to also run HCC CPU-only tasks. The idea though is to experiment with the control values to find the maximum throughput while maintaining valid results [Pending Validation is usually a good indicator that the task is OK].

edit2: I've never read [can't remember seeing it] whether, with 2 cards of the same or different capability, .083333 is a control value *per* GPU card or whether it works for the combination, i.e. 12 for the 2 cards together. With that value, a total max of 12 GPU tasks of hcc1 would then mean 6 concurrent on each. Anyone in the know from hands-on experience?

[Edit 3 times, last edit by Former Member at Feb 9, 2013 4:18:26 PM]
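The arithmetic behind these two knobs, as a sketch: BOINC runs up to floor(1 / gpu_usage) tasks per GPU, and reserves cpu_usage CPU threads for each running GPU task. The values below are the ones suggested in the reply above; the hcc1 app name is taken from the earlier post in this thread.

```xml
<!-- Sketch: the suggested next step of 4 concurrent GPU tasks per card.
     tasks per GPU  = floor(1 / gpu_usage) = floor(1 / 0.25) = 4
     CPU reserved   = 4 * 0.5 = 2 threads per card
     max_concurrent caps the total number of hcc1 tasks in progress. -->
<app_config>
  <app>
    <name>hcc1</name>
    <max_concurrent>24</max_concurrent>
    <gpu_versions>
      <gpu_usage>0.25</gpu_usage>
      <cpu_usage>0.5</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```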
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
Quote: Hi, Not sure you fully understand the use of these two: <gpu_usage>.5</gpu_usage> <cpu_usage>1.25</cpu_usage> ... The idea though is to experiment with the control values to find the maximum throughput while maintaining valid results [Pending Validation is usually a good indicator that the task is OK] ...

G'Day, appreciate all the input I can get (and info for others to work with also). Took me a few days on and off to get it working. I am currently running this:

<max_concurrent>24</max_concurrent>
<gpu_usage>.5</gpu_usage>
<cpu_usage>1</cpu_usage>

The 1.25 was asking the question (not using it), i.e. whether throwing extra CPU at the GPU tasks would decrease the long inactive periods at the 49.707% and 99.707% points. Most often I wait about a minute if EITHER WU from the same card (2 out of 4 WUs) is at either of those percentage spots - hope that makes sense. Are others having such long waits at the halfway point and end of GPU WUs? I have not seen any posts about my concern (or is it normal with multiple concurrent tasks? I don't get long waits on single GPU WUs).

The 2 CPUs have 24 threads in total, so I currently have 8 dedicated to GPU and 16 to CPU WUs. I will experiment a little, see how I go, and report back with my findings. At least I found out that SLI on the 690's (quad SLI) needs to be disabled before too many error out.

Input always welcome, thanks.

SR-2 Ghetto Style!! Poor quality pics - sorry (RAM fans @ 3500 rpm & front rad fans @ 2100)
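Putting this poster's fragments together, the configuration they report running would look like this as a complete app_config.xml. A sketch only: the <app_config>/<app> wrapper and the hcc1 app name are taken from the earlier post in this thread, not restated by this poster.

```xml
<!-- The reported working setup: 2 tasks per GPU (gpu_usage 0.5),
     one CPU thread reserved per GPU task, capped at 24 tasks total.
     On 2 x GTX 690 (each a dual-GPU card, so 4 GPUs) that is 8 GPU
     tasks, reserving 8 of the 24 CPU threads and leaving 16 for
     CPU-only work units. -->
<app_config>
  <app>
    <name>hcc1</name>
    <max_concurrent>24</max_concurrent>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>1</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```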