Total posts in this thread: 15
This topic has been viewed 3503 times and has 14 replies
QuantumEthos
Senior Cruncher
Joined: Jul 2, 2011
Post Count: 336
Status: Offline
AMD Platform Optimization - please read for all developers and gamers

https://community.amd.com/thread/213045

Processor: 8 AuthenticAMD AMD FX-8320E Eight-Core Processor [Family 21 Model 2 Stepping 0]

Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni ssse3 fma cx16 sse4_1 sse4_2 popcnt aes f16c syscall nx lm avx sse4a osvw xop wdt fma4 topx page1gb rdtscp bmi1

Memory: 15.95 GB physical, 18.82 GB virtual

http://esa-space.blogspot.com/2017/ - BOINC optimization thoughts

http://32ipi028l5q82yhj72224m8j.wpengine.netd...imizing-For-AMD-Ryzen.pdf

****
"
AMD Software Optimization Guide for Ryzen
chromatix, Apr 22, 2017 4:46 PM (in response to tagoo)
A slide deck on the subject got leaked a while ago. The executive summary, as far as I can remember it:

Don't use non-temporal accesses (unless you REALLY know what you're doing, and you probably don't).
Don't use manual prefetching. The automatic prefetchers work better, and don't consume decode bandwidth or op-cache space.
Organise your data in memory so that the automatic prefetchers are maximally effective. This may involve using structs-of-arrays instead of arrays-of-structs, or vice versa, depending on access patterns.
Minimise data movement between CCXes, as the bandwidth available between them is significantly less than within them. This may involve careful choice of worker-thread count and affinity.
SMT is new to AMD, but works similarly to Intel's HT and has similar tradeoffs. Ensure any thread affinity settings account for this.
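(Not part of the quoted summary.) A minimal sketch of the affinity advice in the two points above. It assumes 4 cores per CCX, 2 SMT threads per core, and that SMT siblings have adjacent logical CPU numbers; numbering differs between operating systems, so check the real topology first. The function names are hypothetical.

```python
import os


def ccx_groups(logical_cpus, cores_per_ccx=4, threads_per_core=2):
    """Split logical CPU ids into one list per CCX.

    ASSUMPTION: ids are laid out core by core, CCX by CCX, with SMT
    siblings adjacent (0 and 1 share a core, 2 and 3 share the next).
    """
    cpus = sorted(logical_cpus)
    per_ccx = cores_per_ccx * threads_per_core
    return [cpus[i:i + per_ccx] for i in range(0, len(cpus), per_ccx)]


def one_thread_per_core(group, threads_per_core=2):
    """Keep only the first SMT sibling of every core in a CCX group."""
    return group[::threads_per_core]


def pin_to(cpus):
    """Pin the calling process to the given CPUs where the OS allows it."""
    if hasattr(os, "sched_setaffinity"):  # available on Linux only
        os.sched_setaffinity(0, set(cpus))
        return True
    return False


if __name__ == "__main__":
    groups = ccx_groups(range(16))  # hypothetical 8-core, 16-thread part
    print(groups[0])                       # [0, 1, 2, 3, 4, 5, 6, 7]
    print(one_thread_per_core(groups[0]))  # [0, 2, 4, 6]
```

Workers that share data would then be pinned with pin_to() to a single group, so that their traffic stays inside one CCX.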

Aside from the above, it is implied that Ryzen mostly responds well to code optimised for Intel CPUs. If the older AMD-specific ISA extensions are avoided, code optimised for older AMD CPUs should also run well, as long as the above guidelines are also accounted for.
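(Not part of the quoted summary.) As a rough illustration of the ISA-extension point, the sketch below scans a BOINC-style feature string, such as the one in the first post, for AMD-specific extensions. Treating XOP, FMA4, TBM and LWP as the "older AMD-specific" set is an assumption made here, not something the quoted summary states, and the function name is hypothetical.

```python
# ASSUMPTION: these four are the older AMD-specific extensions to avoid.
AMD_LEGACY_EXTENSIONS = {"xop", "fma4", "tbm", "lwp"}

# Feature list copied from the FX-8320E host in the first post.
FX_8320E_FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat "
    "pse36 clflush mmx fxsr sse sse2 htt pni ssse3 fma cx16 sse4_1 sse4_2 "
    "popcnt aes f16c syscall nx lm avx sse4a osvw xop wdt fma4 topx "
    "page1gb rdtscp bmi1"
)


def legacy_amd_flags(feature_string):
    """Return the AMD-specific extensions present in a feature string."""
    flags = set(feature_string.lower().split())
    return sorted(flags & AMD_LEGACY_EXTENSIONS)


if __name__ == "__main__":
    print(legacy_amd_flags(FX_8320E_FLAGS))  # ['fma4', 'xop']
```

A build that targets both FX and Ryzen hosts would want to avoid code paths that depend on any flag this reports.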

Interestingly, adjusting existing code for the above guidelines seems to have a small net positive effect on Intel CPUs as well. This may obviate the need to have separate Intel and AMD code paths.

Agner Fog says he's nearly finished adding his analysis of Ryzen to his own famous optimisation manuals. This will no doubt be very illuminating."
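
To make the data-layout guideline in the quoted summary concrete, here is a small sketch of turning an array-of-structs into a struct-of-arrays. The particle fields (x, y, mass) are hypothetical, and Python is used only to show the layout; the same idea applies to C or C++ arrays, where it matters most.

```python
from array import array


def aos_to_soa(particles):
    """Convert [(x, y, mass), ...] into one contiguous array per field."""
    soa = {"x": array("d"), "y": array("d"), "mass": array("d")}
    for x, y, mass in particles:
        soa["x"].append(x)
        soa["y"].append(y)
        soa["mass"].append(mass)
    return soa


def total_mass(soa):
    """Walk a single field; only the 'mass' array is read."""
    return sum(soa["mass"])


if __name__ == "__main__":
    particles = [(0.0, 1.0, 2.0), (3.0, 4.0, 5.0)]  # array-of-structs
    soa = aos_to_soa(particles)                     # struct-of-arrays
    print(list(soa["x"]))   # [0.0, 3.0]
    print(total_mass(soa))  # 7.0
```

A loop that reads one field then walks memory sequentially, which is the access pattern the automatic prefetchers handle best. If every field of a record is used together, the original array-of-structs layout may be the better choice.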
[Apr 27, 2017 6:38:24 PM]
QuantumEthos
Re: AMD Platform Optimization - please read for all developers and gamers

[Apr 30, 2017 12:17:52 PM]
QuantumEthos
Re: AMD Platform Optimization - please read for all developers and gamers

For further links and thoughts, visit: http://bit.ly/HPC-Dev

PC/Mac/Windows/Linux/Android

https://www.khronos.org/news/events/2016-isc-high-performance

https://www.khronos.org/assets/uploads/develo...IGGRAPH%20BOF%20Aug08.pdf - HPC report

https://www.microsoft.com/en-us/download/details.aspx?id=54507 - Microsoft HPC Pack 2016, including Linux

https://technet.microsoft.com/en-us/library/cc514029(v=ws.11).aspx - information and downloads for all HPC Packs, from 2016 and 2012 back to 2008

https://msdn.microsoft.com/en-us/library/ff976568.aspx - Microsoft High Performance Computing for Developers: information and downloads

**
OpenVX for high-performance computing: a multi-platform specification

https://www.khronos.org/news/tags/tag/OpenVX

https://www.khronos.org/news/press/openvx-1.2...on-power-efficient-vision
[May 6, 2017 3:11:18 PM]
QuantumEthos
Re: AMD Platform Optimization - please read for all developers and gamers

For a comparison of the GFLOPS/MIPS throughput of various BOINC tasks:

Here we show the relevance of the code or function used. AVX, for example, is SIMD: a single instruction processes several values in parallel, and the FPUs of the AMD FX and Ryzen processors have multiple pipelines to execute such instructions.

http://bit.ly/HPCImpact (original, unedited photos)

and set 2 (newer): http://bit.ly/2HPCImpact

Compare the work throughput (GFLOPS) with the code efficiency of each task.
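
For anyone who wants to reproduce a throughput figure of this kind, here is a minimal sketch of deriving GFLOPS from an operation count and an elapsed time. The dot-product workload and the function names are hypothetical stand-ins, not how BOINC itself measures tasks, and an interpreted loop will report far lower numbers than optimised native code.

```python
import time


def gflops(flop_count, seconds):
    """Floating-point operations per second, in units of 1e9."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return flop_count / seconds / 1e9


def timed_dot(n):
    """Time a dot product of two length-n vectors.

    Returns (result, flop_count, elapsed_seconds); each element costs
    one multiply and one add, so flop_count is 2 * n.
    """
    a = [1.0] * n
    b = [2.0] * n
    start = time.perf_counter()
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    elapsed = time.perf_counter() - start
    return total, 2 * n, elapsed


if __name__ == "__main__":
    result, flops, elapsed = timed_dot(1_000_000)
    print(f"{gflops(flops, elapsed):.4f} GFLOPS")
```

Dividing the measured figure by the theoretical peak of the processor gives a rough efficiency for the code in question.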

Sometimes a source of entropy is needed to fulfil the task, one would imagine (for example on Android): http://bit.ly/tRNG-Dev

The improvement of the BOINC and World Community Grid projects has been observed and noted and, one feels, can be improved upon.

Further improvements should be implemented as soon as possible, to improve work-versus-output efficiency.

Thank you kindly, programmers, workers and scientists, for your perseverance and effort.

RS
[May 13, 2017 10:47:54 AM]
QuantumEthos
Re: AMD Platform Optimization - please read for all developers and gamers

High Performance Computing best practice http://bit.ly/HPCBestPrac
[May 24, 2017 10:27:04 PM]
QuantumEthos
Re: AMD Platform Optimization - please read for all developers and gamers

https://www.youtube.com/watch?v=mLQGXlxemlg - Optimizing HPC Service Delivery by a life time super computing tec
[May 31, 2017 3:16:12 PM]
QuantumEthos
Re: AMD Platform Optimization - please read for all developers and gamers

CPU Optimisation - utility and function.

http://gpuopen.com/compute-product/codexl/ - CodeXL is a code-efficiency analyser, optimisation aid and debugger for GPU, CPU and the whole system.
https://github.com/GPUOpen-Tools/CodeXL/releases/latest
[Jun 3, 2017 12:11:18 AM]
QuantumEthos
Re: AMD Platform Optimization - please read for all developers and gamers

http://bit.ly/CoXLPhoto - CodeXL in action photos

http://support.amd.com/TechDocs/24593.pdf - AMD64 Architecture Programmer’s Manual Volume 2: System Programming
[Jun 3, 2017 6:46:49 PM]
QuantumEthos
Re: AMD Platform Optimization - please read for all developers and gamers

http://www.noamross.net/blog/2013/4/25/faster-talk.html - a guide to speeding up code: profiling and benchmarking.

http://www.pgroup.com/doc/pgi17ug-x64.pdf - PGI Compiler guide

http://www.agner.org/optimize/ - code optimisation for programmers on x86, x86-64 and some other platforms. This is a terrific resource!
[Jun 3, 2017 9:08:40 PM]
QuantumEthos
Re: AMD Platform Optimization - please read for all developers and gamers

http://hgpu.org - information; an interesting source for learning

http://dspace.princeton.edu/jspui/bitstream/8...princeton_0181D_11168.pdf Optimization for parallel computing information.

https://arxiv.org/pdf/1705.05249 - CLBlast: A Tuned OpenCL BLAS Library demonstration.
[Jun 17, 2017 10:54:56 PM]