Total posts in this thread: 135
This topic has been viewed 30230 times and has 134 replies
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: HCC1 GPU Testing - Round 4

Edit= or are you counting successful tasks as ones that ran to completion without erroring out?


Yes - that report was purely focused on whether they ran to completion or not. We still have work to do on valid vs. invalid.
[Apr 17, 2012 1:57:53 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: HCC1 GPU Testing - Round 4

Thanks. Again, my 680 will report 13 invalid tasks. The only difference I (emphasis on I) could see was a different driver. Mine was 301.10 and his was 301.25, so I just switched to this new driver.

Results don't look too bad overall though (as far as running to completion is concerned). Keep up the great work!!!
----------------------------------------
[Edit 1 times, last edit by Former Member at Apr 17, 2012 2:03:11 PM]
[Apr 17, 2012 2:02:34 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: HCC1 GPU Testing - Round 4

More information for results that were valid:

num_res = number of results
num_hts = number of hosts
avg_cpu = average cpu time
avg_elp = average elapsed time
video_card = type of video card

num_res  num_hts  avg_cpu  avg_elp  video_card
1168     18       164.0    228.0    CUDA|GeForceGTX580
311      2        166.0    168.0    CUDA|GeForceGTX590
145      1        202.0    238.0    CUDA|GeForceGTX680
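
One way to read the table above is to compare CPU time against elapsed time for each card. This is only a rough sketch: the rows are copied from the post, while the derived figures (the cpu/elapsed ratio and results per host) and the helper name `derive` are my own additions, not part of the report. The post does not state the units, so none are assumed here.

```python
# Rows copied from the table in the post:
# (num_res, num_hts, avg_cpu, avg_elp, video_card)
ROWS = [
    (1168, 18, 164.0, 228.0, "CUDA|GeForceGTX580"),
    (311, 2, 166.0, 168.0, "CUDA|GeForceGTX590"),
    (145, 1, 202.0, 238.0, "CUDA|GeForceGTX680"),
]


def derive(rows):
    """Return derived per-card figures (hypothetical helper, not a WCG tool)."""
    out = {}
    for num_res, num_hts, avg_cpu, avg_elp, card in rows:
        name = card.split("|", 1)[1]  # drop the "CUDA|" prefix
        out[name] = {
            "cpu_over_elapsed": avg_cpu / avg_elp,
            "results_per_host": num_res / num_hts,
        }
    return out


if __name__ == "__main__":
    for name, d in derive(ROWS).items():
        print(
            f"{name}: cpu/elapsed = {d['cpu_over_elapsed']:.2f}, "
            f"results/host = {d['results_per_host']:.1f}"
        )
```

With only 2 hosts behind the 590 row and 1 behind the 680 row, those averages probably say more about the individual machines than about the cards themselves.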


Odd that the 590 is faster than the 580. It should be slower per task, on average. I can't believe that the average 590 out there is overclocked faster than the average 580.

Also, looks like the 680 is slower than the 580. Does this app use double precision?

http://www.tomshardware.com/reviews/geforce-g...ew-benchmark,3161-14.html
Nvidia limits 64-bit double-precision math to 1/24 of single-precision, protecting its more compute-oriented cards from being displaced by purpose-built gamer boards. The result is that GeForce GTX 680 underperforms GeForce GTX 590, 580 and to a much direr degree, the three competing boards from AMD.

----------------------------------------
[Edit 1 times, last edit by zombie67 [MM] at Apr 17, 2012 2:11:22 PM]
[Apr 17, 2012 2:10:11 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: HCC1 GPU Testing - Round 4

Regarding DP, that's the one thing I could think of, albeit very little apparently. FP32 CUDA projects are showing a 25% increase over the 580 and 44% over the 570.

On my i5 @ 4.2, they ran at the speeds posted, on average.

EDIT: the statement that it LIMITS its FP64 precision is also SOMEWHAT incorrect. It doesn't limit it like it does on other series. It's that the Kepler series only has 8 FP64 cores per SM. They're actually not even included in the core count. They're completely separate.

"The other change coming from GF114 is the mysterious block #15, the CUDA FP64 block. In order to conserve die space while still offering FP64 capabilities on GF114, NVIDIA only made one of the three CUDA core blocks FP64 capable. In turn that block of CUDA cores could execute FP64 instructions at a rate of ¼ FP32 performance, which gave the SM a total FP64 throughput rate of 1/12th FP32. In GK104 none of the regular CUDA core blocks are FP64 capable; in its place we have what we’re calling the CUDA FP64 block.

"The CUDA FP64 block contains 8 special CUDA cores that are not part of the general CUDA core count and are not in any of NVIDIA’s diagrams. These CUDA cores can only do and are only used for FP64 math. What's more, the CUDA FP64 block has a very special execution rate: 1/1 FP32. With only 8 CUDA cores in this block it takes NVIDIA 4 cycles to execute a whole warp, but each quarter of the warp is done at full speed as opposed to ½, ¼, or any other fractional speed that previous architectures have operated at. Altogether GK104’s FP64 performance is very low at only 1/24 FP32 (1/6 * ¼), but the mere existence of the CUDA FP64 block is quite interesting because it’s the very first time we’ve seen 1/1 FP32 execution speed. Big Kepler may not end up resembling GK104, but if it does then it may be an extremely potent FP64 processor if it’s built out of CUDA FP64 blocks."
----------------------------------------
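
The ratios in the quoted passage above can be checked with exact fractions. This is just the quote's arithmetic restated, not a model of the hardware; the variable names are mine, and the 32-thread warp size is the standard NVIDIA figure rather than something stated in the quote.

```python
from fractions import Fraction

# GF114, per the quote: one of the three CUDA core blocks is FP64 capable,
# and that block executes FP64 at 1/4 of its FP32 rate.
gf114_fp64 = Fraction(1, 3) * Fraction(1, 4)

# GK104, per the quote: the overall rate is given as 1/6 * 1/4.
gk104_fp64 = Fraction(1, 6) * Fraction(1, 4)

# A warp is 32 threads; with 8 FP64 cores it needs 32 / 8 cycles.
WARP_SIZE = 32
FP64_CORES = 8
cycles_per_warp = WARP_SIZE // FP64_CORES

print(gf114_fp64, gk104_fp64, cycles_per_warp)
```

So the 1/12 and 1/24 figures and the "4 cycles to execute a whole warp" remark are all consistent with each other.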
[Edit 3 times, last edit by Former Member at Apr 17, 2012 2:29:13 PM]
[Apr 17, 2012 2:14:51 PM]
BSD
Senior Cruncher
Joined: Apr 27, 2011
Post Count: 224
Status: Offline
Re: HCC1 GPU Testing - Round 4

Mine are all invalid except one inconclusive dangler.

Curious, and it may just be a coincidence: I noticed that the wingman pairs with valid results mostly run the newer BOINC 7.0.x clients, while I have the older 6.12.34. I'm guessing those bleeding-edge folks have newer/better graphics cards.
[Apr 17, 2012 2:57:48 PM]
sk..
Master Cruncher
http://s17.rimg.info/ccb5d62bd3e856cc0d1df9b0ee2f7f6a.gif
Joined: Mar 22, 2007
Post Count: 2324
Status: Offline
Re: HCC1 GPU Testing - Round 4

!
----------------------------------------
[Edit 1 times, last edit by skgiven at Jul 18, 2012 9:15:51 PM]
[Apr 17, 2012 3:01:19 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: HCC1 GPU Testing - Round 4

@skgiven. I also find it strange that GPU time isn't factored in. I kept looking at my results and seeing:

Total kernel time = 165.36
CPU time = 214.875
Total time = 216

EDIT: or how about this one from a 560Ti (not mine)

Total kernel time: 2564954521600.000000 (1026 kernel executions)
Total memory transfer time: 968573714432.000000
Average kernel time: 2499955674.074074
Min kernel time: 0.000000 (dx=9 dy=3 sample_dist=8 )
Max kernel time: 18446743552.000000 dx=4 dy=2 sample_dist=3
Total time for ../../projects/www.worldcommunitygrid.org/BETA_X0000130861135201112120929_X0000130861135201112120929.jp2: 474 seconds
Finished Image #0, pctComplete = 1.000000

Valid btw
CPU time used = 484.156250
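
The kernel timing lines in the log above can at least be checked for internal consistency. The sketch below uses the numbers exactly as posted; the regexes and helper names are my own, and since the log does not state the unit of the kernel times, none is assumed. The last check is an observation only: the posted max kernel time happens to equal 2^64 / 10^9 rounded to single precision, which might point to a wrapped unsigned 64-bit nanosecond counter, but nothing in the thread confirms that.

```python
import re
import struct

# Lines copied from the posted 560Ti result log.
LOG = """\
Total kernel time: 2564954521600.000000 (1026 kernel executions)
Average kernel time: 2499955674.074074
Max kernel time: 18446743552.000000 dx=4 dy=2 sample_dist=3
"""


def parse(log):
    """Extract (total, executions, average, max) from the log text."""
    total, count = re.search(
        r"Total kernel time: ([\d.]+) \((\d+) kernel executions\)", log
    ).groups()
    avg = re.search(r"Average kernel time: ([\d.]+)", log).group(1)
    mx = re.search(r"Max kernel time: ([\d.]+)", log).group(1)
    return float(total), int(count), float(avg), float(mx)


def to_float32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


total, count, avg, mx = parse(LOG)

# The reported average is simply total / executions.
average_matches = abs(total / count - avg) < 1e-3

# Observation only: compare the max against 2**64 / 1e9 in single precision.
max_matches_wrap = to_float32(2**64 / 1e9) == mx

print(average_matches, max_matches_wrap)
```

If both checks hold, the average is self-consistent with the total, and the absurd totals may come from a few wrapped samples rather than from the whole run being mistimed. That would also fit the result still validating.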
----------------------------------------
[Edit 2 times, last edit by Former Member at Apr 17, 2012 3:17:39 PM]
[Apr 17, 2012 3:15:12 PM]
nanoprobe
Master Cruncher
Classified
Joined: Aug 29, 2008
Post Count: 2998
Status: Offline
Re: HCC1 GPU Testing - Round 4

@knreed - FWIW, 15 of the 31 errors that were reported for the GTX460 hosts came from one of my machines. I took that machine offline long enough to troubleshoot a few things. I put it back online after reinstalling the drivers and rebooting, and it completed 20 tasks w/o any errors. Maybe this can help others who had error problems.
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.


[Apr 17, 2012 3:26:51 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: HCC1 GPU Testing - Round 4

This round caught me a little off guard...saw a teammate's message on XS and quickly switched my main system's profile to try & catch a few WUs.

Got 74 and all completed/validated.

System is Win7 x64, BOINC 6.10.60, i7 920 HT on, 2xGTX570s/285.62 driver, 24GB RAM.

First few WUs completed/validated successfully, but GPU utilization was fluctuating wildly. I realized I had left my system in SLI @ 5040x1050 with 100% of CPUs crunching. I disabled SLI and freed a CPU thread, then saw no fluctuation in GPU utilization. Probably sped things up, but looking through my results it's hard to say.

Two hiccups occurred:
1) nVidia driver crashed/recovered when I suspended BOINC processing to turn off SLI (all running tasks resumed OK).
2) System crashed & rebooted when I suspended BOINC processing to free a CPU thread (all running tasks resumed OK).

The only app running during the crashes was Firefox with hardware acceleration off.

I also experienced the lag in UI that others are reporting.

Awesome job...keep up the good work!
[Apr 17, 2012 4:05:17 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: HCC1 GPU Testing - Round 4

knreed,
These betas are working great for me! Thanks for the effort.
I am quite ready to start running them full-time whenever y'all are.

Note:
I noticed there is only one version of the 560 Ti listed.
You may want to note the 448 version is a very different card.
They really should have called it the 565 Ti.

While the GTX 560 Ti is based on the GF114 graphics core, the 448-core 560 Ti is based on the GF110 architecture shared by the GTX 580 and GTX 570.
-----------------------------------------------------
The 560 Ti is identified in the result log as
CL_DEVICE_GLOBAL_MEM_SIZE: 993 MByte
CL_DEVICE_COMPUTE_CAPABILITY_NV: 2.1
-----------------------------------------------------
The 560 Ti 448 is identified in the result log as
CL_DEVICE_GLOBAL_MEM_SIZE: 1280 MByte
CL_DEVICE_COMPUTE_CAPABILITY_NV: 2.0
-----------------------------------------------------
and at least in the case of my EVGA Classified version, power is handled very differently...
"6-phase digital PWM (compared to four from the reference design)"
----------------------------------------
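
The two log signatures quoted above are enough to tell the two variants apart programmatically. A minimal sketch, assuming the result log contains the two `CL_DEVICE_*` lines exactly as quoted; the function name and the returned labels are hypothetical. It is only meaningful for logs already known to come from some 560 Ti, because other cards also report compute capability 2.0 or 2.1.

```python
import re


def classify_560ti(log_text):
    """Guess which 560 Ti variant produced a result log (hypothetical helper).

    Returns (variant_label, global_mem_mbyte); the memory size is None
    when the log has no CL_DEVICE_GLOBAL_MEM_SIZE line.
    """
    cc = re.search(r"CL_DEVICE_COMPUTE_CAPABILITY_NV:\s*(\d+\.\d+)", log_text)
    mem = re.search(r"CL_DEVICE_GLOBAL_MEM_SIZE:\s*(\d+)\s*MByte", log_text)

    # Compute capability is the discriminator quoted in the post:
    # 2.1 for the GF114 card, 2.0 for the GF110-based 448-core card.
    labels = {"2.1": "560 Ti (GF114)", "2.0": "560 Ti 448 (GF110)"}
    variant = labels.get(cc.group(1), "unknown") if cc else "unknown"
    mem_mb = int(mem.group(1)) if mem else None
    return variant, mem_mb


if __name__ == "__main__":
    sample = (
        "CL_DEVICE_GLOBAL_MEM_SIZE: 1280 MByte\n"
        "CL_DEVICE_COMPUTE_CAPABILITY_NV: 2.0\n"
    )
    print(classify_560ti(sample))
```

The memory size is returned alongside the label rather than used for the decision, since board partners may ship different memory configurations.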
[Edit 1 times, last edit by Former Member at Apr 18, 2012 12:35:19 AM]
[Apr 17, 2012 4:47:18 PM]