Thread Status: Active
Total posts in this thread: 198
This topic has been viewed 30282 times and has 197 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: GPU Optimisations

I made a mistake in my previous post: the .83333 value should of course read .083333, which is equivalent to 12 concurrent tasks.
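As a rough check of that arithmetic, here is a minimal Python sketch. It assumes the client keeps starting GPU tasks while their summed gpu_usage stays at or below one GPU, which is how the setting is being used in this thread; the function name tasks_per_gpu is made up for illustration.

```python
def tasks_per_gpu(gpu_usage: float) -> int:
    """Estimate how many tasks share one GPU for a given <gpu_usage> value.

    Assumption: tasks are started while their summed gpu_usage is <= 1.0,
    so the count is roughly floor(1 / gpu_usage).
    """
    if gpu_usage <= 0:
        raise ValueError("gpu_usage must be positive")
    # Small tolerance so values that are a hair above 1/N still count as N.
    return int(1.0 / gpu_usage + 1e-6)


for value in (0.83333, 0.083333, 0.5, 0.333):
    print(value, "->", tasks_per_gpu(value), "task(s) per GPU")
```

Under that assumption the mistyped .83333 would leave room for only one task per GPU, while .083333 leaves room for 12.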
[Feb 9, 2013 4:20:21 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: GPU Optimisations

Well, I tried the .33 setting per GPU, but without any success.
I see no point in trying anything higher.
Am I missing anything, or is this just an Nvidia thing?
The max I seem to be able to get without errors on each 690 is 4 WUs.
Pathetic compared to what the 79xx cards are pumping out.
It was kind of expected though, as I've read in many places that 2 (maybe 3) concurrent is the limit on them.
I was hoping to get at least 3 running per GPU (6 per card), but alas not.
11 of the 12 WUs were running fine, but 1 kept erroring out, one every 30 seconds or so.
So I had to drop back to the .5 setting again.
I might keep one of the 690s for the gaming rig, try to sell the other, and pick up a couple of 7970s for the SR-2.
[Feb 9, 2013 5:42:08 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: GPU Optimisations

NVidia does not excel at multithreaded OpenCL, which HCC1-GPU is coded in, but I think I have seen some running 5-6 concurrently. I can't remember, though, which [consumer-aimed] NV card that was.
----------------------------------------
[Edit 1 times, last edit by Former Member at Feb 9, 2013 5:52:00 PM]
[Feb 9, 2013 5:51:12 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: GPU Optimisations

Tried this today on a 4-core/4-thread i5 3570 with a GTX 570:
<max_concurrent>5</max_concurrent>
<gpu_usage>.5</gpu_usage>
<cpu_usage>.75</cpu_usage>
It doesn't add up, I know (4.5 CPU threads), but it works perfectly.
I have not gone over 2 WUs on the GPU, as I'm fairly certain that errors would result.
The above results in 3 CPU tasks and 2 GPU tasks running, even though the CPU total comes to 4.5 on a 4-thread CPU.
No errors on CPU or GPU tasks, even when both GPU tasks are at either the halfway or the finishing stage (which is, I think, when the CPU part gets done).
Any thoughts as to why this "exploit on the system" is working?
The reason I tried it was to get the CPU tasks in cache to run without losing the 4 GPU tasks' functionality - worked!!
Interesting, I reckon.
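The 4.5 figure can be reproduced with a small Python sketch. It only adds up the nominal budget, assuming each CPU task counts as one full thread and each GPU task counts as its cpu_usage value; it does not model how the client actually schedules work, and the function name cpu_budget is made up for illustration.

```python
def cpu_budget(cpu_threads, cpu_tasks, gpu_tasks, cpu_usage):
    """Return (nominal CPU demand, over-allocation) for a mix of tasks.

    Assumption: each CPU task is budgeted one full thread and each GPU
    task is budgeted <cpu_usage> threads.
    """
    demand = cpu_tasks * 1.0 + gpu_tasks * cpu_usage
    return demand, demand - cpu_threads


# 3 CPU tasks plus 2 GPU tasks at <cpu_usage>.75</cpu_usage> on 4 threads.
demand, over = cpu_budget(cpu_threads=4, cpu_tasks=3, gpu_tasks=2, cpu_usage=0.75)
print(f"nominal demand: {demand} threads, over-allocated by {over}")
```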
[Feb 10, 2013 3:56:20 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: GPU Optimisations

Because HCC-GPU has discrete CPU and GPU computing phases, there is a good chance that while one GPU task is in its CPU phase, the other is in its GPU phase, which is why you can get away with this virtual over-allocation of .5 CPU cores. Whether <cpu_usage>.75</cpu_usage> or <cpu_usage>.5</cpu_usage> makes a difference you'd have to test. At any rate, when you fire up Task Manager you can monitor the total and per-core processor load. Expect close to 100% load, and when both GPU tasks are in their CPU phase at the same time, some competition for the cycles, which probably won't hurt; you would just have a period where the GPU is underutilized. Experiment, experiment is the word. The question is whether you are better off with 100% valid with 2 running on the GPU, or, say, 75% valid with 3 running, i.e. a setting such as

<max_concurrent>5</max_concurrent>
<gpu_usage>.333</gpu_usage>
<cpu_usage>.666</cpu_usage>

This lets 3 GPU tasks run, sharing 2 cores, and lets 2 more HCC-CPU tasks run on the remaining threads, but how many valids does that generate at the end of the day? Or,

<max_concurrent>4</max_concurrent>
<gpu_usage>.333</gpu_usage>
<cpu_usage>1.00</cpu_usage>

Production could go up; the question is whether the card can handle 3 GPU tasks when each is serviced by a full CPU core [but I think you already found that not to be reliable].
[Feb 10, 2013 4:31:09 PM]
Bearcat
Master Cruncher
USA
Joined: Jan 6, 2007
Post Count: 2803
Status: Offline
Re: GPU Optimisations

Has anyone had errors crunching more GPU tasks than CPU threads? I would think that even at .5 CPU the CPU WU would take control of the whole thread until finished, or will it time-slice between WUs? I am crunching 8 GPU tasks now on a 3770K (only HCC1), and with the hyperthreading threads not being full cores, I would think the system would choke. Not sure though.
----------------------------------------
Crunching for humanity since 2007!

[Feb 10, 2013 6:24:54 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: GPU Optimisations

Yes, I agree that what you say in your first sentence is correct, but no matter how hard you try to spool the work so the tasks offset each other, eventually they hit a CPU phase at the same time (unavoidable, I think).
(That was and still is my problem with the 690s: wasted GPU time during the CPU stage of the task. Luckily 16 CPU tasks are still running at the same time to take up the slack ;)
Which is why I was surprised that no task errors appeared.
Maybe .333 will work on a 570.
I tried it (.333) on the 6xx series, but no good.
The different architecture of the 570 may allow 3 concurrent GPU tasks.
Has anyone had success with that?
And yes, experimentation is the key to optimization.
I'm going to pick up a 7770 later today; what do people suggest as a starting point with GPU tasks?
[Feb 10, 2013 6:45:46 PM]
branjo
Master Cruncher
Slovakia
Joined: Jun 29, 2012
Post Count: 1892
Status: Offline
Re: GPU Optimisations

I am crunching 12 concurrent GPU WUs (one HD 7750, max OC'ed) on 3 CPU cores (i7-3770, non-OC'able). After less than 1 day, the tasks settle into an order where only 2 - 3 of them are in the CPU phase at once (rarely 4), so they nicely utilize the 3 CPU threads I have dedicated to them. Each task runs approx. 37 mins.

The maximum I have (successfully) tried with app_config was 24 concurrent GPU tasks on all 8 CPU threads (each task finished in circa 1.40 - 1.45 h).

If you are trying to optimize GPU tasks only, do not bother with <max_concurrent>N</max_concurrent>. This is only for setting max CPU tasks and you can delete it from app_config. You are controlling the number of GPU tasks via
<gpu_usage>X</gpu_usage>
<cpu_usage>Y</cpu_usage>

With a 7770 I would recommend going with 3 GPU tasks per CPU thread, max 24 concurrent GPU tasks per GPU. If you have an 8-thread CPU, app_config should include:

<app>
<name>hcc1</name>
<gpu_versions>
<gpu_usage>.041666</gpu_usage>
<cpu_usage>.333333</cpu_usage>
</gpu_versions>
</app>

Since the 7770 is much more powerful than my 7750 (1.28 TeraFLOPS SP compute power vs. 819 GigaFLOPS SP for my 7750), I would definitely try 32 concurrent GPU tasks with:

<app>
<name>hcc1</name>
<gpu_versions>
<gpu_usage>.03125</gpu_usage>
<cpu_usage>.25</cpu_usage>
</gpu_versions>
</app>
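
The two fragments above follow the same pattern: gpu_usage is 1 divided by the number of concurrent GPU tasks, and cpu_usage is the number of CPU threads divided by that same number. A minimal Python sketch that generates such a fragment is shown here. The function name app_fragment is made up for illustration. The values are truncated rather than rounded to six decimals, on the assumption that the fractions must not sum to more than 1 GPU (1/24 rounded up to .041667 would total slightly over 1 across 24 tasks).

```python
def app_fragment(app_name: str, gpu_tasks: int, cpu_threads: int) -> str:
    """Build an <app> fragment for app_config with truncated usage values.

    Assumption: gpu_tasks tasks share one GPU and cpu_threads CPU threads.
    """
    if gpu_tasks < 1 or cpu_threads < 1:
        raise ValueError("gpu_tasks and cpu_threads must be at least 1")
    # Integer floor division truncates to six decimals without rounding up.
    gpu_usage = (1_000_000 // gpu_tasks) / 1_000_000
    cpu_usage = ((cpu_threads * 1_000_000) // gpu_tasks) / 1_000_000
    return "\n".join([
        "<app>",
        f"<name>{app_name}</name>",
        "<gpu_versions>",
        f"<gpu_usage>{gpu_usage:.6f}</gpu_usage>",
        f"<cpu_usage>{cpu_usage:.6f}</cpu_usage>",
        "</gpu_versions>",
        "</app>",
    ])


print(app_fragment("hcc1", gpu_tasks=24, cpu_threads=8))
print(app_fragment("hcc1", gpu_tasks=32, cpu_threads=8))
```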

Cheers
----------------------------------------

Crunching@Home since January 13 2000. Shrubbing@Home since January 5 2006

----------------------------------------
[Edit 2 times, last edit by branjo at Feb 10, 2013 7:48:05 PM]
[Feb 10, 2013 7:14:02 PM]
Bearcat
Master Cruncher
USA
Joined: Jan 6, 2007
Post Count: 2803
Status: Offline
Re: GPU Optimisations

branjo, I'm not sure if I am reading your post correctly. Is it taking 1 hour 45 minutes per WU when you were crunching 24 concurrently? With 8 GPU WUs each having their own CPU thread, mine are averaging about 8 minutes each on a 7870 GHz Edition with a 3770K processor.
----------------------------------------
Crunching for humanity since 2007!

[Feb 11, 2013 3:48:07 AM]
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1313
Status: Offline
Re: GPU Optimisations

branjo wrote:
Since 7770 is much more powerful than my 7750 (1.28 TeraFLOPS SP compute power vs. 819 GigaFLOPS SP for my 7750), I would definitely try 32 concurrent GPU tasks with:

<app>
<name>hcc1</name>
<gpu_versions>
<gpu_usage>.03125</gpu_usage>
<cpu_usage>.25</cpu_usage>
</gpu_versions>
</app>

I gave 32 concurrent tasks a try on my 7770, on my 8-thread i7 2600 with 16GB RAM.
If everything had gone fine, they would have run for about 80 minutes, with of course all 8 threads dedicated.
It was a pity, though, that 13 of those 32 almost-finished tasks restarted at the 50% checkpoint (the start of the second image).

So I'm sticking to 3 tasks on 2 threads, leaving 6 threads free for other CPU tasks.
The elapsed time for 3 concurrent HCC GPU tasks in that config is about 8.5 minutes.
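
To compare these setups on equal terms, the figures reported in this thread can be converted to tasks per hour. The Python sketch below does only that arithmetic. It assumes every task in a batch finishes in the quoted elapsed time and ignores restarts and invalid results, so the 32-task figure is a best case; the function name tasks_per_hour is made up for illustration.

```python
def tasks_per_hour(concurrent_tasks: int, minutes_per_task: float) -> float:
    """Nominal throughput when concurrent_tasks tasks each take minutes_per_task."""
    return concurrent_tasks * 60.0 / minutes_per_task


# (concurrent tasks, elapsed minutes per task) as reported in this thread.
reports = {
    "32 concurrent, ~80 min (7770, best case, 13 restarted)": (32, 80.0),
    "3 concurrent, ~8.5 min (7770)": (3, 8.5),
    "8 concurrent, ~8 min (7870 GHz Edition)": (8, 8.0),
    "12 concurrent, ~37 min (HD 7750)": (12, 37.0),
}

for label, (count, minutes) in reports.items():
    print(f"{label}: {tasks_per_hour(count, minutes):.1f} tasks/hour")
```

On these numbers the 32-task run would only be marginally ahead of 3 concurrent tasks even if nothing had restarted, while tying up all 8 threads instead of 2.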
----------------------------------------

[Feb 11, 2013 12:10:33 PM]