World Community Grid Forums
Category: Beta Testing | Forum: Beta Test Support Forum | Thread: HCC GPU Beta July 12 Tasks Restart
Thread Status: Active Total posts in this thread: 13
cw64
Advanced Cruncher Joined: Oct 6, 2007 Post Count: 120 Status: Offline Project Badges: |
I noticed a task getting to 99.415% and then restarting from 0%, with the elapsed timer reset as well. This happened several times until I changed to "Use GPU Always". It seems the tasks don't like being interrupted.
Also, the GPU core usage didn't pass 75% (HD 7970).
||
|
armstrdj
Former World Community Grid Tech Joined: Oct 21, 2004 Post Count: 695 Status: Offline Project Badges: |
There is no checkpointing done for the GPU version. Most GPUs should run a workunit in less than 10 minutes. Therefore, if a task is suspended or stopped for any reason, it will restart from the beginning. The default is set to use the GPU only when idle, because the application will likely cause sluggishness if it is run on the primary display card while the machine is in use. One thing to look at in the stderr log of a successful run is the average kernel time, which is output towards the bottom. The lower this time, the less likely you are to have an issue with running the GPU while the machine is in use. My best connected card is a bit of a clunker: it has an average kernel time of about 0.4 seconds, and I notice performance problems most of the time.
Thanks, armstrdj |
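For anyone who wants to pull that number out of a saved log, here is a minimal sketch. It assumes the log contains text of the form `Average kernel time: <seconds>`, as in the stderr logs posted in this thread; the function name `average_kernel_times` is made up for illustration and is not part of the WCG application or BOINC.

```python
import re

# Assumption: the stderr log reports the value as "Average kernel time: 0.147746",
# as in the logs posted in this thread. The helper name is hypothetical.
_AVG_KERNEL_RE = re.compile(r"Average kernel time:\s*([0-9]*\.?[0-9]+)")


def average_kernel_times(stderr_text):
    """Return every 'Average kernel time' value (in seconds) found in the log text."""
    return [float(value) for value in _AVG_KERNEL_RE.findall(stderr_text)]


if __name__ == "__main__":
    sample = (
        "Total kernel time: 151.344650 (1026 kernel executions)\n"
        "Total memory transfer time: 1.882167\n"
        "Average kernel time: 0.147509\n"
    )
    for seconds in average_kernel_times(sample):
        print(f"average kernel time: {seconds:.6f} s")
```

The search is not line-based, so it should also work on a log that was pasted as a single long line. A log from a task that was restarted may contain more than one value.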
||
|
cw64
Advanced Cruncher Joined: Oct 6, 2007 Post Count: 120 Status: Offline Project Badges: |
OK, but if the task is interrupted shouldn't it restart immediately? It didn't restart after being interrupted, it resumed, got to 99.415% then restarted.
[Edited 1 time, last edit by cw64 at Jul 13, 2012 1:13:05 PM]
||
|
armstrdj
Former World Community Grid Tech Joined: Oct 21, 2004 Post Count: 695 Status: Offline Project Badges: |
Can you post the stderr log from this run?
Thanks, armstrdj |
||
|
cw64
Advanced Cruncher Joined: Oct 6, 2007 Post Count: 120 Status: Offline Project Badges: |
Not sure which one it would be. Two of the logs are much, much longer than the others, so I guess it's those.
http://pastebin.com/XW63rFqY
http://pastebin.com/5FhKcXzu
||
|
XSmeagolX
Senior Cruncher Joined: Nov 12, 2009 Post Count: 444 Status: Offline Project Badges: |
I had the same problem; I posted in the other thread...
My stderr log looks like this (hopefully it is the stderr log.. :D )

<core_client_version>6.12.34</core_client_version>
<![CDATA[
<stderr_txt>
Commandline: projects/www.worldcommunitygrid.org/wcg_beta3_img_6.47_windows_intelx86__ati14_hcc1 X0000038271227200409250008.jp2 --device 0
INFO: gpu_type not found in init_data.xml.
INFO: GPU device not specified in init_data.xml. Checking Commandline.
Boinc requested ATI gpu device number0
Found compute platform Advanced Micro Devices, Inc. Selecting this platform
CL_DEVICE_NAME: Cypress
CL_DEVICE_VENDOR: Advanced Micro Devices, Inc.
CL_DEVICE_VERSION: CAL 1.4.1720 (VM)
CL_DEVICE_MAX_COMPUTE_UNITS:
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
CL_DEVICE_MAX_WORK_ITEM_SIZES: 256 / 256 / 256
CL_DEVICE_MAX_WORK_GROUP_SIZE: 256
CL_DEVICE_MAX_CLOCK_FREQUENCY: 725 MHz
CL_DEVICE_ADDRESS_BITS: 32
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 512 MByte
CL_DEVICE_GLOBAL_MEM_SIZE: 1024 MByte
CL_DEVICE_ERROR_CORRECTION_SUPPORT: no
CL_DEVICE_LOCAL_MEM_TYPE: local
CL_DEVICE_LOCAL_MEM_SIZE: 32 KByte
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64 KByte
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_PROFILING_ENABLE
CL_DEVICE_EXTENSIONS: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_khr_d3d10_sharing
Estimated kernel execution time = 0.41629 [sec]
Starting analysis of ../../projects/www.worldcommunitygrid.org/BETA_X0000038271227200409250008_X0000038271227200409250008.jp2...
Extracting GLCM features...
Total kernel time: 151.587631 (1026 kernel executions)
Total memory transfer time: 1.956145
Average kernel time: 0.147746
Min kernel time: 0.138377 (dx=3 dy=25 sample_dist=24 )
Max kernel time: 0.158475 dx=2 dy=1 sample_dist=1
Commandline: projects/www.worldcommunitygrid.org/wcg_beta3_img_6.47_windows_intelx86__ati14_hcc1 X0000038271227200409250008.jp2 --device 0
INFO: gpu_type not found in init_data.xml.
INFO: GPU device not specified in init_data.xml. Checking Commandline.
Boinc requested ATI gpu device number0
Found compute platform Advanced Micro Devices, Inc. Selecting this platform
CL_DEVICE_NAME: Cypress
CL_DEVICE_VENDOR: Advanced Micro Devices, Inc.
CL_DEVICE_VERSION: CAL 1.4.1720 (VM)
CL_DEVICE_MAX_COMPUTE_UNITS:
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
CL_DEVICE_MAX_WORK_ITEM_SIZES: 256 / 256 / 256
CL_DEVICE_MAX_WORK_GROUP_SIZE: 256
CL_DEVICE_MAX_CLOCK_FREQUENCY: 725 MHz
CL_DEVICE_ADDRESS_BITS: 32
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 512 MByte
CL_DEVICE_GLOBAL_MEM_SIZE: 1024 MByte
CL_DEVICE_ERROR_CORRECTION_SUPPORT: no
CL_DEVICE_LOCAL_MEM_TYPE: local
CL_DEVICE_LOCAL_MEM_SIZE: 32 KByte
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64 KByte
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_PROFILING_ENABLE
CL_DEVICE_EXTENSIONS: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_khr_d3d10_sharing
Estimated kernel execution time = 0.41670 [sec]
Starting analysis of ../../projects/www.worldcommunitygrid.org/BETA_X0000038271227200409250008_X0000038271227200409250008.jp2...
Extracting GLCM features...
Total kernel time: 151.344650 (1026 kernel executions)
Total memory transfer time: 1.882167
Average kernel time: 0.147509
Min kernel time: 0.138370 (dx=5 dy=25 sample_dist=24 )
Max kernel time: 0.158085 dx=1 dy=1 sample_dist=0
Total time for ../../projects/www.worldcommunitygrid.org/BETA_X0000038271227200409250008_X0000038271227200409250008.jp2: 773 seconds
Finished Image #0, pctComplete = 1.000000
CPU time used = 104.817072
09:13:14 (5952): called boinc_finish
</stderr_txt>
]]>

WCG-Team Captain of Team SETI.Germany
(official Partner of World Community Grid) |
||
|
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline Project Badges: |
Quoting armstrdj: "Can you post the stderr log from this run?"
Is 99.415% the point where the GPU part of the computation ends and the CPU finishes?
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.
|
||
|
cw64
Advanced Cruncher Joined: Oct 6, 2007 Post Count: 120 Status: Offline Project Badges: |
Quoting nanoprobe: "Is @ 99.415% when the GPU part of the computation ends and the CPU finishes?"
Don't know.
||
|
armstrdj
Former World Community Grid Tech Joined: Oct 21, 2004 Post Count: 695 Status: Offline Project Badges: |
Yes, 99.4% is the point where the task finishes with the GPU and begins the post-processing on the CPU. I did not see anything strange in the stderr logs, other than that the task was interrupted and restarted many times. There might be a bug in the % complete displayed after a task is restarted, which makes it look like the task is continuing where it left off and then resetting back to zero. I will look into this.
Thanks, armstrdj |
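A rough way to count such restarts in a saved stderr log is sketched below. It rests on an assumption drawn only from the log posted earlier in this thread: each (re)start of the application appears to write a fresh `Commandline:` line, and a completed run ends with `called boinc_finish`. The function names are hypothetical and not part of BOINC or the WCG application.

```python
def count_starts_and_finishes(stderr_text):
    """Count application starts and successful finishes in a stderr log.

    Assumption: every (re)start writes one "Commandline:" line and a
    completed run writes "called boinc_finish", as in the log posted
    in this thread.
    """
    starts = stderr_text.count("Commandline:")
    finishes = stderr_text.count("called boinc_finish")
    return starts, finishes


def restart_count(stderr_text):
    """Return how many times the task started over after its first start."""
    starts, _ = count_starts_and_finishes(stderr_text)
    return max(starts - 1, 0)


if __name__ == "__main__":
    sample = (
        "Commandline: app image.jp2 --device 0\n"
        "INFO: GPU device not specified in init_data.xml. Checking Commandline.\n"
        "Commandline: app image.jp2 --device 0\n"
        "09:13:14 (5952): called boinc_finish\n"
    )
    print("restarts:", restart_count(sample))
```

The match includes the trailing colon on purpose, so the `Checking Commandline.` message is not counted as a start.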
||
|
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline Project Badges: |
Now that the server issue seems to be solved, can we expect any more GPU betas soon?