World Community Grid Forums
Category: Completed Research | Forum: Help Conquer Cancer | Thread: Two Frozen Applications: hcc1 7.05 (ati_hcc1)
Thread Status: Active Total posts in this thread: 3
yoro42
Ace Cruncher United States Joined: Feb 19, 2011 Post Count: 8976 Status: Offline
Two WUs each tied up a GPU for about 6.5 hours. I aborted both tasks after collecting the data that follows.
The applications are flowing again and finishing in about 8 minutes.

=BOINC MGR Event Log
Computer: Coltrane
3/16/2013 3:11:55 PM | | Starting BOINC client version 7.0.28 for windows_x86_64
3/16/2013 3:11:55 PM | | log flags: file_xfer, sched_ops, task
3/16/2013 3:11:55 PM | | Libraries: libcurl/7.25.0 OpenSSL/1.0.1 zlib/1.2.6
3/16/2013 3:11:55 PM | | Data directory: C:\ProgramData\BOINC
3/16/2013 3:11:55 PM | | Running under account Jazzman
3/16/2013 3:11:55 PM | | Processor: 12 GenuineIntel Intel(R) Core(TM) i7 CPU X 980 @ 3.33GHz [Family 6 Model 44 Stepping 2]
3/16/2013 3:11:55 PM | | Processor: 256.00 KB cache
3/16/2013 3:11:55 PM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 syscall nx lm vmx tm2 popcnt aes pbe
3/16/2013 3:11:55 PM | | OS: Microsoft Windows 7: Ultimate x64 Edition, Service Pack 1, (06.01.7601.00)
3/16/2013 3:11:55 PM | | Memory: 15.99 GB physical, 21.99 GB virtual
3/16/2013 3:11:55 PM | | Disk: 922.76 GB total, 715.68 GB free
3/16/2013 3:11:55 PM | | Local time is UTC -7 hours
3/16/2013 3:11:55 PM | | VirtualBox version: 4.1.18
3/16/2013 3:11:55 PM | | ATI GPU 0: Cypress (CAL version 1.4.1741, 1024MB, 991MB available, 4640 GFLOPS peak)
3/16/2013 3:11:55 PM | | ATI GPU 1: Cypress (CAL version 1.4.1741, 1024MB, 991MB available, 4640 GFLOPS peak)
3/16/2013 3:11:55 PM | | OpenCL: ATI GPU 0: Cypress (driver version 1124.2 (VM), device version OpenCL 1.2 AMD-APP (1124.2), 1024MB, 991MB available)
3/16/2013 3:11:55 PM | | OpenCL: ATI GPU 1: Cypress (driver version 1124.2 (VM), device version OpenCL 1.2 AMD-APP (1124.2), 1024MB, 991MB available)
3/16/2013 3:11:55 PM | | Config: report completed tasks immediately
3/16/2013 3:11:55 PM | | Config: GUI RPC allowed from:
3/16/2013 3:11:55 PM | | Config: 192.168.0.2
3/16/2013 3:11:55 PM | | Config: 192.168.0.3
3/16/2013 3:11:55 PM | | Config: 192.168.0.4
3/16/2013 3:11:55 PM | | Config: 192.168.0.5
3/16/2013 3:11:55 PM | | Config: 192.168.0.6
3/16/2013 3:11:55 PM | Test4Theory@Home | URL http://lhcathome2.cern.ch/test4theory/; Computer ID 22428; resource share 0
3/16/2013 3:11:55 PM | LHC@home 1.0 | URL http://lhcathomeclassic.cern.ch/sixtrack/; Computer ID 9990150; resource share 0
3/16/2013 3:11:55 PM | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 6644160; resource share 0
3/16/2013 3:11:55 PM | World Community Grid | URL http://www.worldcommunitygrid.org/; Computer ID 1926675; resource share 100
3/16/2013 3:11:55 PM | World Community Grid | General prefs: from World Community Grid (last modified 14-Mar-2013 16:47:48)
3/16/2013 3:11:55 PM | World Community Grid | Computer location: work
3/16/2013 3:11:55 PM | | General prefs: using separate prefs for work
3/16/2013 3:11:55 PM | | Preferences:
3/16/2013 3:11:55 PM | | max memory usage when active: 12281.17MB
3/16/2013 3:11:55 PM | | max memory usage when idle: 14737.40MB
3/16/2013 3:11:55 PM | | max disk usage: 50.00GB
3/16/2013 3:11:55 PM | | (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
3/16/2013 3:11:55 PM | | Not using a proxy
3/16/2013 3:13:17 PM | | Suspending computation - initial delay

X0930124900141201101191138_ 2-- Coltrane Error 3/17/13 04:33:09 3/17/13 23:31:07 0.01 / 6.53 62.1 / 0.0
X0930124900157201101191137_ 2-- Coltrane Error 3/17/13 04:33:09 3/17/13 23:31:07 0.01 / 6.53 62.1 / 0.0

=================================================
BOINC MGR Event Log
3/17/2013 9:56:29 AM | World Community Grid | Starting task X0930124900157201101191137_2 using hcc1 version 705 (ati_hcc1) in slot 0
3/17/2013 4:28:03 PM | World Community Grid | task X0930124900157201101191137_2 aborted by user
=================================================
Project World Community Grid
Name X0930124900157201101191137_2
Application hcc1 7.05 (ati_hcc1)
Workunit name X0930124900157201101191137
State Running
Received 3/16/2013 9:33:12 PM
Report deadline 3/19/2013 4:45:09 PM
Estimated app speed 45.92 GFLOPs/sec
Estimated task size 24,216 GFLOPs
Resources 1 CPUs + 1 ATI GPU (device 0)
CPU time at last checkpoint 00:00:00
CPU time 00:00:23
Elapsed time 06:29:50
Estimated time remaining --
Fraction done 0.000%
Virtual memory size 123.27 MB
Working set size 80.12 MB
Directory slots/0
Process ID 2700
=================================================
=BOINC MGR Event Log
3/17/2013 9:56:31 AM | World Community Grid | Starting task X0930124900141201101191138_2 using hcc1 version 705 (ati_hcc1) in slot 7
3/17/2013 4:28:15 PM | World Community Grid | task X0930124900141201101191138_2 aborted by user
=================================================
Project World Community Grid
Name X0930124900141201101191138_2
Application hcc1 7.05 (ati_hcc1)
Workunit name X0930124900141201101191138
State Running
Received 3/16/2013 9:33:12 PM
Report deadline 3/19/2013 4:45:09 PM
Estimated app speed 45.92 GFLOPs/sec
Estimated task size 24,216 GFLOPs
Resources 1 CPUs + 1 ATI GPU (device 1)
CPU time at last checkpoint 00:00:00
CPU time 00:00:22
Elapsed time 06:29:49
Estimated time remaining --
Fraction done 0.000%
Virtual memory size 121.13 MB
Working set size 78.53 MB
Directory slots/7
Process ID 1128

Result Log
Result Name: X0930124900141201101191138_ 2--
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
aborted by user
</message>
<stderr_txt>
Commandline: projects/www.worldcommunitygrid.org/wcg_hcc1_img_7.05_windows_intelx86__ati_hcc1 --zipfile X0930124900141201101191138.zip --imagelist images.txt --device 1
<app_init_data>
<major_version>7</major_version>
<minor_version>0</minor_version>
<release>28</release>
<app_version>705</app_version>
<app_name>hcc1</app_name>
<project_preferences>
<color_scheme>Tahiti Sunset</color_scheme>
<max_frames_sec>2000</max_frames_sec>
<max_gfx_cpu_pct>100.0</max_gfx_cpu_pct>
</project_preferences>
<project_dir>C:\ProgramData\BOINC/projects/www.worldcommunitygrid.org</project_dir>
<boinc_dir>C:\ProgramData\BOINC</boinc_dir>
<wu_name>X0930124900141201101191138</wu_name>
<result_name>X0930124900141201101191138_2</result_name>
<comm_obj_name>boinc_5</comm_obj_name>
<slot>7</slot>
<wu_cpu_time>0.000000</wu_cpu_time>
<starting_elapsed_time>0.000000</starting_elapsed_time>
<using_sandbox>0</using_sandbox>
<user_total_credit>9086172.368075</user_total_credit>
<user_expavg_credit>17752.439470</user_expavg_credit>
<host_total_credit>4097885.026987</host_total_credit>
<host_expavg_credit>10258.968641</host_expavg_credit>
<resource_share_fraction>1.000000</resource_share_fraction>
<checkpoint_period>60.000000</checkpoint_period>
<fraction_done_start>0.000000</fraction_done_start>
<fraction_done_end>1.000000</fraction_done_end>
<gpu_type>ATI</gpu_type>
<gpu_device_num>1</gpu_device_num>
<gpu_opencl_dev_index>1</gpu_opencl_dev_index>
<ncpus>1.000000</ncpus>
<rsc_fpops_est>24215727646969.000000</rsc_fpops_est>
<rsc_fpops_bound>1210786382348450.000000</rsc_fpops_bound>
<rsc_memory_bound>78643200.000000</rsc_memory_bound>
<rsc_disk_bound>50000000.000000</rsc_disk_bound>
<computation_deadline>1363736529.000000</computation_deadline>
<vbox_window>0</vbox_window>
</app_init_data>
INFO: gpu_type set in init_data.xml to ATI
INFO: gpu_device_num set in init_data.xml to 1
Boinc requested ATI gpu device number1
Unzipping input images ../../projects/www.worldcommunitygrid.org/X0930124900141201101191138_X0930124900141201101191138.zip
Processing jobdescription
Number of Images defined in image list is 2
Found compute platform Advanced Micro Devices, Inc.
Selecting this platform
CL_DEVICE_NAME: Cypress
CL_DEVICE_VENDOR: Advanced Micro Devices, Inc.
CL_DEVICE_VERSION: 1124.2 (VM)
CL_DEVICE_MAX_COMPUTE_UNITS:
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
CL_DEVICE_MAX_WORK_ITEM_SIZES: 256 / 256 / 256
CL_DEVICE_MAX_WORK_GROUP_SIZE: 256
CL_DEVICE_MAX_CLOCK_FREQUENCY: 725 MHz
CL_DEVICE_ADDRESS_BITS: 32
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 512 MByte
CL_DEVICE_GLOBAL_MEM_SIZE: 1024 MByte
CL_DEVICE_ERROR_CORRECTION_SUPPORT: no
CL_DEVICE_LOCAL_MEM_TYPE: local
CL_DEVICE_LOCAL_MEM_SIZE: 32 KByte
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64 KByte
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_PROFILING_ENABLE
CL_DEVICE_EXTENSIONS: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing
Estimated kernel execution time = 0.46595 [sec]
Starting analysis of X0930124900141201101191138.jp2...
Extracting GLCM features...
</stderr_txt>
]]>

Result Log
Result Name: X0930124900157201101191137_ 2--
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
aborted by user
</message>
<stderr_txt>
Commandline: projects/www.worldcommunitygrid.org/wcg_hcc1_img_7.05_windows_intelx86__ati_hcc1 --zipfile X0930124900157201101191137.zip --imagelist images.txt --device 0
<app_init_data>
<major_version>7</major_version>
<minor_version>0</minor_version>
<release>28</release>
<app_version>705</app_version>
<app_name>hcc1</app_name>
<project_preferences>
<color_scheme>Tahiti Sunset</color_scheme>
<max_frames_sec>2000</max_frames_sec>
<max_gfx_cpu_pct>100.0</max_gfx_cpu_pct>
</project_preferences>
<project_dir>C:\ProgramData\BOINC/projects/www.worldcommunitygrid.org</project_dir>
<boinc_dir>C:\ProgramData\BOINC</boinc_dir>
<wu_name>X0930124900157201101191137</wu_name>
<result_name>X0930124900157201101191137_2</result_name>
<comm_obj_name>boinc_1</comm_obj_name>
<slot>0</slot>
<wu_cpu_time>0.000000</wu_cpu_time>
<starting_elapsed_time>0.000000</starting_elapsed_time>
<using_sandbox>0</using_sandbox>
<user_total_credit>9086172.368075</user_total_credit>
<user_expavg_credit>17752.439470</user_expavg_credit>
<host_total_credit>4097885.026987</host_total_credit>
<host_expavg_credit>10258.968641</host_expavg_credit>
<resource_share_fraction>1.000000</resource_share_fraction>
<checkpoint_period>60.000000</checkpoint_period>
<fraction_done_start>0.000000</fraction_done_start>
<fraction_done_end>1.000000</fraction_done_end>
<gpu_type>ATI</gpu_type>
<gpu_device_num>0</gpu_device_num>
<gpu_opencl_dev_index>0</gpu_opencl_dev_index>
<ncpus>1.000000</ncpus>
<rsc_fpops_est>24215727646969.000000</rsc_fpops_est>
<rsc_fpops_bound>1210786382348450.000000</rsc_fpops_bound>
<rsc_memory_bound>78643200.000000</rsc_memory_bound>
<rsc_disk_bound>50000000.000000</rsc_disk_bound>
<computation_deadline>1363736529.000000</computation_deadline>
<vbox_window>0</vbox_window>
</app_init_data>
INFO: gpu_type set in init_data.xml to ATI
INFO: gpu_device_num set in init_data.xml to 0
Boinc requested ATI gpu device number0
Unzipping input images ../../projects/www.worldcommunitygrid.org/X0930124900157201101191137_X0930124900157201101191137.zip
Processing jobdescription
Number of Images defined in image list is 2
Found compute platform Advanced Micro Devices, Inc.
Selecting this platform
CL_DEVICE_NAME: Cypress
CL_DEVICE_VENDOR: Advanced Micro Devices, Inc.
CL_DEVICE_VERSION: 1124.2 (VM)
CL_DEVICE_MAX_COMPUTE_UNITS:
CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
CL_DEVICE_MAX_WORK_ITEM_SIZES: 256 / 256 / 256
CL_DEVICE_MAX_WORK_GROUP_SIZE: 256
CL_DEVICE_MAX_CLOCK_FREQUENCY: 725 MHz
CL_DEVICE_ADDRESS_BITS: 32
CL_DEVICE_MAX_MEM_ALLOC_SIZE: 512 MByte
CL_DEVICE_GLOBAL_MEM_SIZE: 1024 MByte
CL_DEVICE_ERROR_CORRECTION_SUPPORT: no
CL_DEVICE_LOCAL_MEM_TYPE: local
CL_DEVICE_LOCAL_MEM_SIZE: 32 KByte
CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64 KByte
CL_DEVICE_QUEUE_PROPERTIES: CL_QUEUE_PROFILING_ENABLE
CL_DEVICE_EXTENSIONS: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing
Estimated kernel execution time = 0.37351 [sec]
Starting analysis of X0930124900157201101191137.jp2...
Extracting GLCM features...
</stderr_txt>
]]>
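As a rough sanity check on the task properties quoted in this post, the sketch below divides the "Estimated task size" by the "Estimated app speed" to get an expected runtime, then compares it with the reported elapsed time. The `looks_stalled` rule and its `factor=10.0` threshold are assumptions made up for illustration; they are not something BOINC or WCG defines.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Parse an 'HH:MM:SS' string as shown in the BOINC Manager task properties."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def expected_runtime_s(task_gflops: float, speed_gflops_per_s: float) -> float:
    """Expected runtime in seconds = estimated task size / estimated app speed."""
    return task_gflops / speed_gflops_per_s


def looks_stalled(elapsed: timedelta, fraction_done: float,
                  expected_s: float, factor: float = 10.0) -> bool:
    """Hypothetical heuristic: no progress at all, and the task has been
    running for more than `factor` times its expected runtime."""
    return fraction_done == 0.0 and elapsed.total_seconds() > factor * expected_s


# Values copied from the task properties in the post.
expected = expected_runtime_s(24216, 45.92)   # 24,216 GFLOPs at 45.92 GFLOPs/sec
elapsed = parse_hms("06:29:50")

print(round(expected / 60, 1))                # 8.8 (minutes)
print(looks_stalled(elapsed, 0.0, expected))  # True
```

The estimate of roughly 8.8 minutes is in line with the "about 8 minutes" reported for healthy tasks, while 6.5 hours at 0.000% done is more than 40 times that.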
RaymondFO
Veteran Cruncher USA Joined: Nov 30, 2004 Post Count: 561 Status: Offline
This happens to me when the ATI video driver crashes and recovers. After the recovery, however, the GPU tasks never crunch again until you reboot the computer. If this occurs frequently, you may want to uninstall and reinstall the driver. Remember to reboot after uninstalling the driver, and again after reinstalling it, so that the new driver is fully operational.
captainjack
Advanced Cruncher Joined: Apr 14, 2008 Post Count: 144 Status: Offline
This happens to me when BOINC runs CPU benchmarks while an HCC GPU task is running, about once a week. You should be able to look back through the Event Log messages and see whether CPU benchmarks ran while the errant WUs were running.
If I remember correctly, the WCG admins know about this and are looking into it. In the meantime, all we can do is abort the stuck tasks and start new ones.
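The Event Log check described above could be sketched roughly as follows. The script parses `timestamp | project | message` lines in the format quoted earlier in the thread, builds a start/abort window per task, and reports any benchmark message that falls inside a window. The default marker string `"Running CPU benchmarks"` is an assumption about the wording of the client's message and may need adjusting; `check_log` and the other function names are hypothetical helpers.

```python
from datetime import datetime

TS_FORMAT = "%m/%d/%Y %I:%M:%S %p"  # matches e.g. "3/17/2013 9:56:29 AM"


def parse_line(line):
    """Split 'timestamp | project | message' into (datetime, project, message).
    Returns None for lines that do not fit that shape."""
    parts = line.split("|", 2)
    if len(parts) != 3:
        return None
    try:
        when = datetime.strptime(parts[0].strip(), TS_FORMAT)
    except ValueError:
        return None
    return when, parts[1].strip(), parts[2].strip()


def task_windows(entries):
    """Map task name -> [start, end]; end stays None if no abort line is seen."""
    windows = {}
    for when, _project, msg in entries:
        words = msg.split()
        if msg.startswith("Starting task ") and len(words) >= 3:
            windows[words[2]] = [when, None]
        elif msg.startswith("task ") and msg.endswith("aborted by user"):
            if len(words) >= 2 and words[1] in windows:
                windows[words[1]][1] = when
    return windows


def benchmarks_during(entries, windows, marker="Running CPU benchmarks"):
    """Return (task name, time) pairs where a marker line falls inside a task window.
    The default marker text is an assumption, not taken from the thread."""
    hits = []
    for when, _project, msg in entries:
        if marker not in msg:
            continue
        for name, (start, end) in windows.items():
            if start <= when and (end is None or when <= end):
                hits.append((name, when))
    return hits


def check_log(text, marker="Running CPU benchmarks"):
    entries = [e for e in map(parse_line, text.splitlines()) if e is not None]
    return benchmarks_during(entries, task_windows(entries), marker)


# The two Event Log lines quoted in the first post for one of the stuck tasks.
log_excerpt = """\
3/17/2013 9:56:29 AM | World Community Grid | Starting task X0930124900157201101191137_2 using hcc1 version 705 (ati_hcc1) in slot 0
3/17/2013 4:28:03 PM | World Community Grid | task X0930124900157201101191137_2 aborted by user
"""
print(check_log(log_excerpt))  # [] because the excerpt contains no benchmark lines
```

The excerpt posted in this thread only includes the start and abort lines, so the result is empty; running the same check over the full Event Log for 3/17/2013 would show whether a benchmark run overlapped the stuck tasks.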