World Community Grid Forums
Category: Beta Testing Forum: Beta Test Support Forum Thread: HCC1 Beta Test for Linux GPU (Issues Thread) |
Thread Status: Active Total posts in this thread: 49
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
This WCG first Linux GPU-WU-beta caught me unprepared... so my Ubu12.10 is not set up. I can't even get the machine off the ground: BOINC_v7.0.27 doesn't even have an entry for GPU under the Activity menu. Need help there: How do I install AMD's latest Linux driver and next get the machine/BOINC to recognize the AMD card?
andzgridPost#715
[Edit 1 times, last edit by Former Member at Dec 5, 2012 8:36:32 AM]
||
|
sgoll
Advanced Cruncher Joined: Oct 24, 2006 Post Count: 87 Status: Offline Project Badges: |
I got:

05-Dec-2012 12:33:10 [World Community Grid] The Help Conquer Cancer NVIDIA GPU application cannot run on your graphics card

The Ion chipset is like the GeForce 9400M graphics card, and that one is listed in the https://secure.worldcommunitygrid.org/help/viewTopic.do?shortName=GPU#610 thread. But I think that slower cards may also be useful for testing and development, especially when testing a new platform. Or is there another reason for not getting work?

11-Oct-2012 17:10:39 [---] NVIDIA GPU 0: ION (driver version unknown, CUDA version 4020, compute capability 1.1, 509MB, 35 GFLOPS peak)
27-Jun-2012 17:04:30 [---] NVIDIA GPU 0: GeForce GT 320 (driver version unknown, CUDA version 4020, compute capability 1.2, 1023MB, 187 GFLOPS peak)
16-Nov-2012 08:54:46 [---] NVIDIA GPU 0: GeForce GT 430 (driver version unknown, CUDA version 5000, compute capability 2.1, 2047MB, 90 GFLOPS peak)

Maybe it's the compute capability? My GT320 and the GT430 are getting beta workunits.

Stephan

PS: On my GT430 I got:

Estimated kernel execution time = 2.48244 [sec]
ERROR: Kernel execution time estimate too high, exiting.
12:55:34 (17000): called boinc_finish

PPS: Now I'm getting:

<rsc_fpops_est>25411773615583.000000</rsc_fpops_est>
<rsc_fpops_bound>508235472311660.000000</rsc_fpops_bound>
<rsc_memory_bound>78643200.000000</rsc_memory_bound>
<rsc_disk_bound>50000000.000000</rsc_disk_bound>
<computation_deadline>1354843616.000000</computation_deadline>
</app_init_data>
INFO: gpu_type not found in init_data.xml.
INFO: gpu_device_num set in init_data.xml to 0
Boinc requested NVIDIA gpu device number 0
Unzipping input images ../../projects/www.worldcommunitygrid.org/4fcb8b86dc8f6139b8f3c2a5744cf4f6.zip
Processing jobdescription
Number of Images defined in image list is 2
ERROR: VerifyGPU.cpp:65 Unknown
14:27:04 (22145): called boinc_finish
</stderr_txt>
]]>

I will switch back to CUDA 4.2 and see what happens ...

On the GT320 it looks like this:

<rsc_fpops_est>25411773615583.000000</rsc_fpops_est>
<rsc_fpops_bound>508235472311660.000000</rsc_fpops_bound>
<rsc_memory_bound>78643200.000000</rsc_memory_bound>
<rsc_disk_bound>50000000.000000</rsc_disk_bound>
<computation_deadline>1354842277.000000</computation_deadline>
</app_init_data>
INFO: gpu_type not found in init_data.xml.
INFO: gpu_device_num set in init_data.xml to 0
Boinc requested NVIDIA gpu device number 0
Unzipping input images ../../projects/www.worldcommunitygrid.org/c2472cbd6f30acaad805e9dca3e426df.zip
Processing jobdescription
Number of Images defined in image list is 2
Found compute platform NVIDIA Corporation
Selecting this platform
CL_DEVICE_NAME: GeForce GT 320
CL_DEVICE_VENDOR: NVIDIA Corporation
CL_DEVICE_VERSION: 295.71
CL_DEVICE_MAX_COMPUTE_UNITS:
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
CL_DEVICE_MAX_WORK_ITEM_SIZES: 512 / 512 / 64
CL_DEVICE_MAX_WORK_GROUP_SIZE: 512
CL_DEVICE_MAX_CLOCK_FREQUENCY: 1302 MHz
CL_DEVICE_ADDRESS_BITS: 32
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 255 MByte
CL_DEVICE_GLOBAL_MEM_SIZE: 1023 MByte
CL_DEVICE_ERROR_CORRECTION_SUPPORT: no
CL_DEVICE_LOCAL_MEM_TYPE: local
CL_DEVICE_LOCAL_MEM_SIZE: 16 KByte
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64 KByte
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_PROFILING_ENABLE
CL_DEVICE_EXTENSIONS: cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics
CL_DEVICE_COMPUTE_CAPABILITY_NV: 1.2
CL_DEVICE_REGISTERS_PER_BLOCK_NV: 16384
CL_DEVICE_WARP_SIZE_NV: 32
CL_DEVICE_GPU_OVERLAP_NV: CL_TRUE
CL_DEVICE_KERNEL_EXEC_TIMEOUT_NV: CL_FALSE
CL_DEVICE_INTEGRATED_MEMORY_NV: CL_FALSE
Estimated kernel execution time = 1.39454 [sec]
Starting analysis of X0900075151391200610031534.jp2...
Extracting GLCM features...
Total kernel time: 905.932495 (1026 kernel executions)
Total memory transfer time: 2.928820
Average kernel time: 0.882975
Min kernel time: 0.751814 (dx=25 dy=0 sample_dist=24 )
Max kernel time: 1.073066 dx=2 dy=1 sample_dist=1
INFO: GPU calculations complete.
Total time for X0900075151391200610031534.jp2: 959 seconds
Finished Image #0, pctComplete = 0.500000
Starting analysis of X0900075151392200610031534.jp2...
Extracting GLCM features...
Total kernel time: 952.409607 (1026 kernel executions)
Total memory transfer time: 5.806847
Average kernel time: 0.928274
Min kernel time: 0.808257 (dx=25 dy=0 sample_dist=24 )
Max kernel time: 1.093689 dx=2 dy=1 sample_dist=1
INFO: GPU calculations complete.
Total time for X0900075151392200610031534.jp2: 1005 seconds
Finished Image #1, pctComplete = 1.000000
CPU time used = 1958.970000
14:37:27 (23457): called boinc_finish
</stderr_txt>
]]>

[Edit 5 times, last edit by sgoll at Dec 5, 2012 1:42:35 PM]
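The per-image kernel figures in the GT320 output above can be cross-checked with a little arithmetic: the reported average should equal the total kernel time divided by the number of kernel executions. The following is only a sketch of that check in Python; the function name `kernel_summaries` and the regular expressions are illustrative assumptions, not part of the WCG application, and the sample text is copied from the stderr output in the post.

```python
import re

# Lines copied from the GT320 stderr output quoted in the post.
STDERR_SNIPPET = """\
Total kernel time: 905.932495 (1026 kernel executions)
Average kernel time: 0.882975
Total kernel time: 952.409607 (1026 kernel executions)
Average kernel time: 0.928274
"""

# Assumed patterns, written only to match the log lines shown above.
TOTAL_RE = re.compile(r"Total kernel time:\s*([\d.]+)\s*\((\d+) kernel executions\)")
AVG_RE = re.compile(r"Average kernel time:\s*([\d.]+)")


def kernel_summaries(text):
    """Pair each 'Total kernel time' line with the 'Average kernel time' that follows it."""
    totals = [(float(t), int(n)) for t, n in TOTAL_RE.findall(text)]
    averages = [float(a) for a in AVG_RE.findall(text)]
    return [
        {
            "total": total,
            "executions": count,
            "reported_avg": reported,
            "computed_avg": total / count,  # total time divided by number of runs
        }
        for (total, count), reported in zip(totals, averages)
    ]


for image_number, summary in enumerate(kernel_summaries(STDERR_SNIPPET)):
    print(
        f"image #{image_number}: reported {summary['reported_avg']:.6f}s, "
        f"computed {summary['computed_avg']:.6f}s"
    )
```

For both images the computed value agrees with the reported average to within rounding, so the log is at least internally consistent.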
||
|
kateiacy
Veteran Cruncher USA Joined: Jan 23, 2010 Post Count: 1027 Status: Offline Project Badges: |
My AMD 7750 has returned 120 of these overnight, with most valid, some PV, and 1 error.

On the other hand, all the work units returned by my AMD E-350 machines have turned out Invalid. This shouldn't be the case, as these chips run POEM ATI units just fine. POEM is also single-precision OpenCL. Here's the log from one such invalid WU:

Result Log
Result Name: BETA_X0930075140235200609061427--201212042006530_1
<core_client_version>7.0.27</core_client_version>
<![CDATA[
<stderr_txt>
Commandline: ../../projects/www.worldcommunitygrid.org/wcg_beta3_img_7.08_x86_64-pc-linux-gnu__ati_hcc1 --zipfile X0930075140235200609061427.zip --imagelist images.txt --device 0
<app_init_data>
<major_version>7</major_version>
<minor_version>0</minor_version>
<release>27</release>
<app_version>708</app_version>
<app_name>beta3</app_name>
<project_preferences>
<color_scheme>Tahiti Sunset</color_scheme>
<max_frames_sec>7</max_frames_sec>
<max_gfx_cpu_pct>5.0</max_gfx_cpu_pct>
</project_preferences>
<project_dir>/var/lib/boinc-client/projects/www.worldcommunitygrid.org</project_dir>
<boinc_dir>/var/lib/boinc-client</boinc_dir>
<wu_name>BETA_X0930075140235200609061427--201212042006530</wu_name>
<result_name>BETA_X0930075140235200609061427--201212042006530_1</result_name>
<shm_key>-1</shm_key>
<slot>0</slot>
<wu_cpu_time>0.000000</wu_cpu_time>
<starting_elapsed_time>0.000000</starting_elapsed_time>
<using_sandbox>0</using_sandbox>
<user_total_credit>2983012.950643</user_total_credit>
<user_expavg_credit>8399.793913</user_expavg_credit>
<host_total_credit>43734.118943</host_total_credit>
<host_expavg_credit>140.173468</host_expavg_credit>
<resource_share_fraction>1.000000</resource_share_fraction>
<checkpoint_period>60.000000</checkpoint_period>
<fraction_done_start>0.000000</fraction_done_start>
<fraction_done_end>1.000000</fraction_done_end>
<gpu_type>ATI</gpu_type>
<gpu_device_num>0</gpu_device_num>
<gpu_opencl_dev_index>0</gpu_opencl_dev_index>
<ncpus>1.000000</ncpus>
<rsc_fpops_est>25411773615583.000000</rsc_fpops_est>
<rsc_fpops_bound>508235472311660.000000</rsc_fpops_bound>
<rsc_memory_bound>78643200.000000</rsc_memory_bound>
<rsc_disk_bound>50000000.000000</rsc_disk_bound>
<computation_deadline>1354804033.000000</computation_deadline>
</app_init_data>
INFO: gpu_type set in init_data.xml to ATI
INFO: gpu_device_num set in init_data.xml to 0
Boinc requested ATI gpu device number0
Unzipping input images ../../projects/www.worldcommunitygrid.org/3b7797b2ec459b54084bb8ab2a5d186a.zip
Processing jobdescription
Number of Images defined in image list is 2
Found compute platform Advanced Micro Devices, Inc.
Selecting this platform
CL_DEVICE_NAME: Loveland
CL_DEVICE_VENDOR: Advanced Micro Devices, Inc.
CL_DEVICE_VERSION: CAL 1.4.1607
CL_DEVICE_MAX_COMPUTE_UNITS:
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
CL_DEVICE_MAX_WORK_ITEM_SIZES: 256 / 256 / 256
CL_DEVICE_MAX_WORK_GROUP_SIZE: 256
CL_DEVICE_MAX_CLOCK_FREQUENCY: 492 MHz
CL_DEVICE_ADDRESS_BITS: 32
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 128 MByte
CL_DEVICE_GLOBAL_MEM_SIZE: 192 MByte
CL_DEVICE_ERROR_CORRECTION_SUPPORT: no
CL_DEVICE_LOCAL_MEM_TYPE: local
CL_DEVICE_LOCAL_MEM_SIZE: 32 KByte
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64 KByte
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_PROFILING_ENABLE
CL_DEVICE_EXTENSIONS: cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt
Estimated kernel execution time = 0.42987 [sec]
Starting analysis of X0930075140235200609061427.jp2...
Extracting GLCM features...
Total kernel time: 987.345581 (1026 kernel executions)
Total memory transfer time: 1.589869
Average kernel time: 0.962325
Min kernel time: 0.892054 (dx=11 dy=23 sample_dist=24 )
Max kernel time: 1.016697 dx=1 dy=1 sample_dist=0
INFO: GPU calculations complete.
Total time for X0930075140235200609061427.jp2: 2198 seconds
Finished Image #0, pctComplete = 0.500000
Starting analysis of X0930075140236200609061427.jp2...
Extracting GLCM features...
Total kernel time: 978.517395 (1026 kernel executions)
Total memory transfer time: 3.177762
Average kernel time: 0.953721
Min kernel time: 0.894309 (dx=11 dy=23 sample_dist=24 )
Max kernel time: 1.008617 dx=1 dy=1 sample_dist=0
INFO: GPU calculations complete.
Total time for X0930075140236200609061427.jp2: 2185 seconds
Finished Image #1, pctComplete = 1.000000
CPU time used = 293.142319
01:03:47 (3384): called boinc_finish
</stderr_txt>
]]>
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Each GPU task starts with a mini-test to see whether the system gives reasonable performance, which of course depends on the combination of CPU+GPU+OS+MomentaryLoad. When the mini-test does not produce a reasonable kernel duration estimate, the task is aborted. My GT220 had a KT of 2.6 seconds and got rejected, so I've hooked it to a profile which has GPU work deselected for WCG.

Personally, I think that to get an optimal test [for medium/lower end cards], the system should be idle before a task starts. Setting, e.g., a delay of a few minutes after the end of user input would allow the system to quiet down and then do the best test possible **. What I'm saying is that the system could run the test when idle, so that it won't fail.

** Similarly, the CPU benchmark is performed while BOINC computing is temporarily suspended [30 seconds].
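The mini-test described above can be pictured as a simple gate on the estimated kernel execution time. The sketch below is only an illustration in Python: the thread never states the real cutoff, so `ASSUMED_MAX_KERNEL_SECONDS = 2.0` is a hypothetical value chosen merely to be consistent with the estimates reported in this thread (1.39454 s accepted; 2.48244 s and 2.6 s rejected), and `mini_test_passes` is not a function of the actual application.

```python
# Hypothetical cutoff: the thread does not say what limit the application uses.
ASSUMED_MAX_KERNEL_SECONDS = 2.0


def mini_test_passes(estimated_kernel_seconds, limit=ASSUMED_MAX_KERNEL_SECONDS):
    """Return True if the estimated kernel time is at or below the assumed limit."""
    return estimated_kernel_seconds <= limit


# Estimates reported by posters in this thread.
reported_estimates = [
    ("GeForce GT 320", 1.39454),
    ("GeForce GT 430", 2.48244),
    ("GT220", 2.6),
    ("E-350 (Loveland)", 0.42987),
]

for card, estimate in reported_estimates:
    verdict = "run task" if mini_test_passes(estimate) else "abort task"
    print(f"{card}: estimate {estimate} s -> {verdict}")
```

Because the estimate depends on momentary load, the same card could land on either side of such a limit from one task to the next, which is the motivation for running the test on an idle system.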
||
|
kateiacy
Veteran Cruncher USA Joined: Jan 23, 2010 Post Count: 1027 Status: Offline Project Badges: |
"This WCG first Linux GPU-WU-beta caught me unprepared... so my Ubu12.10 is not set up. I can't even get the machine off the ground: BOINC_v7.0.27 doesn't even have an entry for GPU under the Activity menu. Need help there: How do I install AMD's latest Linux driver and next get the machine/BOINC to recognize the AMD card?" (andzgridPost#715)

To install the AMD driver, you need to:
1. Go to Software Sources
2. Choose Additional Drivers
3. Install fglrx
4. Reboot the machine

After rebooting, manually restart BOINC. Do this by getting a terminal window and entering

sudo /etc/init.d/boinc-client restart

You will be prompted to enter your password. BOINC should now recognize your GPU and that it is OpenCL capable.

I have heard of people having trouble if they installed "fglrx-updates" instead of "fglrx." Haven't tried it myself; I have had good luck with Xubuntu 12.10 and fglrx.

You will have to manually restart BOINC every time you reboot the machine.

If you still have trouble with BOINC not recognizing your GPU, check this link: http://boinc.berkeley.edu/dev/forum_thread.php?id=6307&sort=6
||
|
pvh513
Senior Cruncher Joined: Feb 26, 2011 Post Count: 260 Status: Offline Project Badges: |
I received 228 WUs so far on 3 cards (nVidia GTX 460, GTX 550 Ti and AMD Radeon HD 6950). 204 are valid, 11 pending and 12 in progress. 1 unit was invalid (BETA_X0900075151083200610031539--201212042007340). Apart from the one invalid unit everything seems to be working very smoothly.
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
If you were the reference tester: 1 invalid out of 228 and no errors is way below the tech intervention level of 5%. At not even 0.5%, your test says ready for production ;>)
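The percentage above follows directly from the counts pvh513 posted (228 received, 12 still in progress, 1 invalid). The lines below are just a sketch of that arithmetic in Python; the helper `invalid_rate` is an illustrative name, and whether the in-progress units should be counted in the denominator is an assumption either way, so both variants are shown.

```python
def invalid_rate(invalid, counted):
    """Fraction of counted work units that were marked invalid."""
    if counted <= 0:
        raise ValueError("counted must be positive")
    return invalid / counted


TOTAL_RECEIVED = 228       # work units received, from the post
IN_PROGRESS = 12           # not yet returned, from the post
INVALID = 1                # marked invalid, from the post
INTERVENTION_LEVEL = 0.05  # the 5% level quoted in the reply

rate_all = invalid_rate(INVALID, TOTAL_RECEIVED)
rate_returned = invalid_rate(INVALID, TOTAL_RECEIVED - IN_PROGRESS)

print(f"of all received:  {rate_all:.2%}")
print(f"of returned only: {rate_returned:.2%}")
print("below intervention level:", rate_returned < INTERVENTION_LEVEL)
```

Either way the rate comes out at roughly 0.44% to 0.46%, under the 0.5% mentioned and far below 5%.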
|
||
|
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline Project Badges: |
Ummm ... With the Dashboard showing only 53 days of crunching left in the HCC project, why start a GPU version under Linux now?
Or is there some information about the duration of HCC that hasn't been released to us mortals yet? In my case, such info would determine whether I'll buy some suitable graphics hardware and join the fray. For others who are already running the project, such info would enable informed decisions to be made about graphics hardware upgrades. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Regardless of how long HCC takes to complete, WCG attaches great value to bringing Mac and Linux to production while we have a science to do this with, as a "gain knowledge and experience" exercise. Already, the WCG experience has led to changes/implementations that Berkeley developers are working on.
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello kateiacy,

Thanks for your [Dec 5, 2012 1:57:51 PM] post. I'll follow the steps there now, see how that goes, and post some feedback.

andzgridPost#716

edit1_2012.12.05We.1538utc: Currently downloading 'fglrx'. Funny, it says 'Applying changes' when it is more like 'Downloading files'... OK, the download and the apply seem to be done, as I now see the radio button on 'Using Video driver for the AMD graphics accelerators from fglrx (proprietary)' where it used to be on 'Using Video driver for the AMD graphics accelerators from fglrx-updates (proprietary)'... Rebooting now...

edit2_2012.12.05We.1626utc: I had a flawless AMD/ATI driver install. I now see GPU-use-mode controls under the Activity menu where there were none before. The event log now has a description of the AMD/ATI GPU card as follows:
-- ATI GPU 0: Juniper (CAL version 1.4.1741, 1024MB, 804MB available, 2720 GFLOPS peak)
-- OpenCL: ATI GPU 0: Juniper (driver version 1016.4, device version OpenCL 1.2 AMD-APP (1016.4), 1024MB, 804MB available).
I'm not sure, however, whether the Catalyst UI is the latest for Ubuntu Linux. But that'll be for another day. For now, thanks for the guidance, kateiacy!

[Edit 2 times, last edit by Former Member at Dec 5, 2012 4:26:00 PM]
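For anyone wanting to check from a saved event log whether BOINC detected a card, the GPU description lines quoted in this thread have a regular enough shape to pick apart. The following is a rough Python sketch under that assumption; the pattern and the `parse_gpu_line` helper are illustrative only, are based solely on the lines posted here, and do not handle the separate "OpenCL:" line format.

```python
import re

# Assumed shape, based only on lines quoted in this thread, e.g.
#   ATI GPU 0: Juniper (CAL version 1.4.1741, 1024MB, 804MB available, 2720 GFLOPS peak)
GPU_LINE_RE = re.compile(
    r"(?P<vendor>ATI|NVIDIA) GPU (?P<index>\d+): (?P<model>[^(]+?) \((?P<details>.*)\)"
)


def parse_gpu_line(line):
    """Return a dict describing the GPU, or None if the line does not match."""
    match = GPU_LINE_RE.search(line)
    if match is None:
        return None
    details = match.group("details")
    memory = re.search(r"(\d+)MB", details)             # first MB figure = total memory
    gflops = re.search(r"(\d+) GFLOPS peak", details)
    return {
        "vendor": match.group("vendor"),
        "index": int(match.group("index")),
        "model": match.group("model"),
        "memory_mb": int(memory.group(1)) if memory else None,
        "gflops_peak": int(gflops.group(1)) if gflops else None,
    }


sample_lines = [
    "ATI GPU 0: Juniper (CAL version 1.4.1741, 1024MB, 804MB available, 2720 GFLOPS peak)",
    "11-Oct-2012 17:10:39 [---] NVIDIA GPU 0: ION (driver version unknown, "
    "CUDA version 4020, compute capability 1.1, 509MB, 35 GFLOPS peak)",
]

for line in sample_lines:
    print(parse_gpu_line(line))
```

A result of `None` for every line of the log would suggest the driver was not picked up and that BOINC needs a restart, as described earlier in the thread.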
||
|
|