Thread Status: Active
Total posts in this thread: 3
This topic has been viewed 1066 times and has 2 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Instructional optimization

Hi guys, I imagine this has already been suggested, but is WCG set up to use SSE2, SSE3, SSE4, 3DNow! instructions and the like? New versions of SSE come out with most new chips, and with everyone asking about optimizing for AMD or Intel, I figured it best to get right into the guts of it.

Also, on the whole GPU usage issue: I know CUDA and Stream are frameworks you can code in, but having a framework doesn't always make it easy to code for. Have you considered working with ATI/Nvidia to create a similar instruction set, perhaps something shared between them so there is just a single set for all GPU usage? If you had one set to work with, it might simplify the complexities involved with GPU usage/offloading.

To be quite frank, I have not looked at the code or what's involved with WCG, or whether it would even benefit from SSE optimization, but the forum is for suggestions, so I'm putting my hand up. What do you experts think?

http://en.wikipedia.org/wiki/SSE2

I too am looking forward to my GPU being used as well as my CPU in the coming versions of WCG. Hopefully with a bit of coding we can all get a lot more work done.


Cheers,
Dougal
----------------------------------------
[Edit 1 times, last edit by Former Member at Aug 10, 2008 10:20:46 AM]
[Aug 10, 2008 10:20:02 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Instructional optimization

Hi Dougal,

Many (experts) have spoken before you on the optimization topic, including SSEx. The more optimized the builds, the less cross-comparable the results from different computers become; yes, really, when it gets down to folding and best-energy calculations. Eventually a point will be reached where the lowest common denominator is no longer supported and e.g. SSE2/3 becomes the new minimum. Essentially, though, for each optimization WCG has to add and debug 5/6/7 more science compiles per operating system. With 3 commonly supported OSes, that adds 15/18/21 applications to maintain and to verify that the results are useful for the scientists.
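One way to see why differently optimized builds can disagree: floating-point addition is not associative, so a vectorized (SSE-style) sum that accumulates in several lanes can round differently from a plain left-to-right sum. The following is only a minimal Python sketch with made-up data and hypothetical function names; it is not WCG or project code.

```python
def sequential_sum(values):
    """Add the values strictly left to right, as a scalar loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


def lane_sum(values, lanes=2):
    """Accumulate into `lanes` partial sums, then combine them.

    This mimics (very roughly) how a vectorized build might reorder additions.
    """
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    total = 0.0
    for p in partial:
        total += p
    return total


# Made-up data chosen so that rounding is visible: at 1e16 the spacing
# between adjacent doubles is 2, so adding 1.0 to 1e16 is lost.
data = [1e16, 1.0, -1e16, 1.0]

print(sequential_sum(data))  # 1.0
print(lane_sum(data))        # 2.0
```

Both answers are "correct" floating-point results for the same input, which is the kind of mismatch a validation quorum then has to cope with.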

The same applies to the discussion on optimization for 64-bit computers.

So picture 32-bit + 64-bit versions for SSE and non-SSE-capable computers, and you already have 4 versions for 1 science on 1 operating system. The reason 64-bit also counts double, even though all 64-bit machines have SSEx, is that a result from e.g. a 32-bit machine without SSE gets pooled with results from 64-bit machines in e.g. HPF2 quorums. Every optimized compile requires a separate distribution and validation cycle, a monster support and management task at the scale WCG is running at.
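The counting above can be sketched as a simple cross product. This is an illustration only: the function name is hypothetical, and the three operating system names are an assumption, since the post only says "3 commonly supported" without naming them.

```python
from itertools import product


def build_variants(sciences, operating_systems, word_sizes, instruction_sets):
    """List every (science, OS, word size, instruction set) combination."""
    return list(product(sciences, operating_systems, word_sizes, instruction_sets))


word_sizes = ["32-bit", "64-bit"]
instruction_sets = ["no SSE", "SSE"]

# 1 science on 1 operating system, as in the post.
one_os = build_variants(["HPF2"], ["Windows"], word_sizes, instruction_sets)
print(len(one_os))  # 4

# Same science across 3 operating systems (OS names are assumed here).
three_os = build_variants(
    ["HPF2"], ["Windows", "Linux", "Mac OS X"], word_sizes, instruction_sets
)
print(len(three_os))  # 12
```

Every extra instruction-set level or science project multiplies the list again, and each entry needs its own distribution and validation cycle.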

They'll come, but when is a big question mark.

cheers
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Aug 10, 2008 11:31:47 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Instructional optimization

Hi Dougal,
Some of my posts on this subject:
http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=21304#174935
http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=21201#173240
http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=21201#173191
http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=20445#168010
http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=18794#164663
http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=21229#176277
http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=21524#179830

The application programs are supplied by the project scientists. We just port them to run on BOINC; we do not rewrite them, so any specialization to maximize the utility of SSE instructions would have to be done on their side. As my posts make clear, I am a fan of the BLAS (Basic Linear Algebra Subprograms) library. Programs organized around BLAS explicitly show the parallel areas of code, making it easy for the compiler to use SSE near-optimally.
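As a rough illustration of what "organized around BLAS" means: the numerical work is expressed as a few whole-vector kernels whose elements are independent of each other, which is exactly the shape SIMD instructions handle well. The sketch below is plain Python and shows structure only; `dot` and `axpy` are hand-written stand-ins named after the BLAS Level 1 routines, not calls into a real BLAS library, and the data is made up.

```python
def dot(x, y):
    """Inner product of two vectors; each product x[i]*y[i] is independent."""
    return sum(a * b for a, b in zip(x, y))


def axpy(alpha, x, y):
    """Return alpha*x + y elementwise; no element depends on another."""
    return [alpha * a + b for a, b in zip(x, y)]


x = [1.0, 2.0, 3.0, 4.0]
y = [0.5, 0.5, 0.5, 0.5]

z = axpy(2.0, x, y)
print(z)          # [2.5, 4.5, 6.5, 8.5]
print(dot(x, y))  # 5.0
```

In a real application the same calls would go to an optimized BLAS, so the science code stays unchanged while the library (or compiler) decides how to use SSE.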

Lawrence
[Aug 10, 2008 4:14:24 PM]