Thread Status: Active
Total posts in this thread: 68
This topic has been viewed 15806 times and has 67 replies
nanoprobe
Master Cruncher
Classified
Joined: Aug 29, 2008
Post Count: 2998
Re: HCC GPU Beta (July 9, 2012)

Already 1 error on the new run.

Result Log

Result Name: BETA_ X0000038191504200409181608_ 1--



<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
- exit code 293 (0x125)
</message>
<stderr_txt>
Commandline: projects/www.worldcommunitygrid.org/wcg_beta3_img_6.47_windows_intelx86__ati14_hcc1 X0000038191504200409181608.jp2
INFO: gpu_type not found in init_data.xml.
INFO: GPU device not specified in init_data.xml. Checking Commandline.
ERROR: GPU device not specified in init_data.xml or on command line. Exiting.
14:29:40 (1224): called boinc_finish

</stderr_txt>
]]>
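
For anyone comparing several of these failed results, here is a minimal Python sketch that pulls the exit code and the ERROR lines out of a pasted result log. The function name `summarize_result_log` is hypothetical, and the regular expression only assumes the non-negative `exit code N (0x..)` form shown above. It is not part of BOINC or the WCG application.

```python
import re

# Sample log text copied from the post above (trimmed to the relevant lines).
SAMPLE_LOG = """\
<message>
- exit code 293 (0x125)
</message>
<stderr_txt>
INFO: gpu_type not found in init_data.xml.
INFO: GPU device not specified in init_data.xml. Checking Commandline.
ERROR: GPU device not specified in init_data.xml or on command line. Exiting.
</stderr_txt>
"""


def summarize_result_log(text):
    """Return (exit_code, error_lines) from a pasted result log.

    exit_code is None when no 'exit code N (0x..)' line is present.
    Hypothetical helper for illustration only.
    """
    match = re.search(r"exit code (\d+) \(0x([0-9a-fA-F]+)\)", text)
    exit_code = None
    if match:
        exit_code = int(match.group(1))
        # Sanity check: the decimal and hex renderings should agree.
        if exit_code != int(match.group(2), 16):
            raise ValueError("decimal and hex exit codes disagree")
    errors = [
        line.strip()
        for line in text.splitlines()
        if line.strip().startswith("ERROR:")
    ]
    return exit_code, errors


if __name__ == "__main__":
    code, errors = summarize_result_log(SAMPLE_LOG)
    print(code)
    for line in errors:
        print(line)
```

Both failed results reported in this thread show the same exit code and the same ERROR line, which is what a helper like this makes easy to confirm across many results.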

Change to require 1 CPU in addition to the GPU (this causes the client to run the app at IDLE priority class rather than NORMAL)
<-- That is an excellent addition. That way crunchers don't have to worry about having an idle CPU core to run the tasks. Well done, techs.
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.


----------------------------------------
[Edit 1 times, last edit by nanoprobe at Jul 12, 2012 6:42:27 PM]
[Jul 12, 2012 6:33:11 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Re: HCC GPU Beta (July 9, 2012)

@nanoprobe - can you paste the details from the client_state.xml about the app version?
[Jul 12, 2012 6:37:08 PM]
X-Files 27
Senior Cruncher
Canada
Joined: May 21, 2007
Post Count: 391
Re: HCC GPU Beta (July 9, 2012)

I had that error as well:
Result Name: BETA_ X0000038190329200409181628_ 1--
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
- exit code 293 (0x125)
</message>
<stderr_txt>
Commandline: projects/www.worldcommunitygrid.org/wcg_beta3_img_6.47_windows_intelx86__cuda_hcc1 X0000038190329200409181628.jp2
INFO: gpu_type not found in init_data.xml.
INFO: GPU device not specified in init_data.xml. Checking Commandline.
ERROR: GPU device not specified in init_data.xml or on command line. Exiting.
12:31:02 (6508): called boinc_finish

</stderr_txt>
]]>

<app_version>
<app_name>beta3</app_name>
<version_num>647</version_num>
<platform>windows_intelx86</platform>
<avg_ncpus>1.000000</avg_ncpus>
<max_ncpus>0.825499</max_ncpus>
<flops>91758405031.424271</flops>
<plan_class>cuda_hcc1</plan_class>
<api_version>7.1.0</api_version>
<file_ref>
<file_name>wcg_beta3_img_6.47_windows_intelx86__cuda_hcc1</file_name>
<main_program/>
</file_ref>
<file_ref>
<file_name>hcckernel.cl.6.47</file_name>
<open_name>hcckernel.cl</open_name>
</file_ref>
<file_ref>
<file_name>beta3_image01_6.47.tga</file_name>
<open_name>boinc_wcg_skin_w-logo.tga</open_name>
</file_ref>
<file_ref>
<file_name>beta3_image02_6.47.tga</file_name>
<open_name>pbibmablk.tga</open_name>
</file_ref>
<file_ref>
<file_name>beta3_image03_6.47.tga</file_name>
<open_name>UHN_stacked2.tga</open_name>
</file_ref>
<file_ref>
<file_name>beta3_image04_6.47.tga</file_name>
<open_name>HCC_LOGO.tga</open_name>
</file_ref>
<coproc>
<type>CUDA</type>
<count>1.000000</count>
</coproc>
</app_version>
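
A minimal Python sketch, assuming only the standard library, of how the fields the techs asked about could be read out of an `<app_version>` block like the one above. The helper name `describe_app_version` is hypothetical and the XML is a trimmed copy of the block in this post. A full client_state.xml holds many such blocks, so in practice you would probably loop over them rather than parse a single one.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <app_version> block posted above.
APP_VERSION_XML = """\
<app_version>
<app_name>beta3</app_name>
<version_num>647</version_num>
<platform>windows_intelx86</platform>
<avg_ncpus>1.000000</avg_ncpus>
<max_ncpus>0.825499</max_ncpus>
<plan_class>cuda_hcc1</plan_class>
<coproc>
<type>CUDA</type>
<count>1.000000</count>
</coproc>
</app_version>
"""


def describe_app_version(xml_text):
    """Return the CPU/GPU resource fields of one <app_version> block.

    Hypothetical helper for illustration; it expects a single
    <app_version> element as the root of xml_text.
    """
    root = ET.fromstring(xml_text)
    coproc = root.find("coproc")  # None for CPU-only app versions
    return {
        "app_name": root.findtext("app_name"),
        "version_num": int(root.findtext("version_num")),
        "plan_class": root.findtext("plan_class"),
        "avg_ncpus": float(root.findtext("avg_ncpus")),
        "max_ncpus": float(root.findtext("max_ncpus")),
        "coproc_type": coproc.findtext("type") if coproc is not None else None,
        "coproc_count": float(coproc.findtext("count")) if coproc is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_app_version(APP_VERSION_XML).items():
        print(key, value)
```

For this block it reports `avg_ncpus` of 1.0, `max_ncpus` of 0.825499, and one CUDA coprocessor.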

----------------------------------------

[Jul 12, 2012 6:41:53 PM]
nanoprobe
Master Cruncher
Classified
Joined: Aug 29, 2008
Post Count: 2998
Re: HCC GPU Beta (July 9, 2012)

@nanoprobe - can you paste the details from the client_state.xml about the app version?

I can't remote into that machine from work. I'll post it when I get home. Now there is another issue. The completed tasks stopped uploading. The message tab says Maintenance underway: file uploads are temporarily disabled.
Looks like uploads have resumed for me.
----------------------------------------
[Edit 2 times, last edit by nanoprobe at Jul 12, 2012 6:58:00 PM]
[Jul 12, 2012 6:46:10 PM]
wplachy
Senior Cruncher
Joined: Sep 4, 2007
Post Count: 423
Re: HCC GPU Beta (July 9, 2012)

I can't remote into that machine from work. I'll post it when I get home. Now there is another issue. The completed tasks stopped uploading. The message tab says Maintenance underway: file uploads are temporarily disabled.

That's because of this http://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,33448#384681
----------------------------------------
Bill P

[Jul 12, 2012 6:57:13 PM]
pirogue
Veteran Cruncher
USA
Joined: Dec 8, 2008
Post Count: 685
Re: HCC GPU Beta (July 9, 2012)

Sounds like folks maxing out all threads are having issues and those leaving at least one thread free are not. For those who are maxing out your threads, when the GPU finishes a WU and passes it to the CPU, either one of those threads will have to pause to finish the GPU WU, or the GPU work unit getting passed back to the CPU will most likely error out because nothing is available to finish the job. Could be OCing issues too. I don't OC but do leave one thread free and didn't get any errors. Just a theory on my part.


We have been discussing some of the observed items with David Anderson and there is a fact that we didn't understand. If the estimated CPU use for a GPU application version is less than 1 full core, then the BOINC client starts the research application at normal priority. Given the 30-45 seconds at the start and end of each workunit, this can interfere with user interaction on their machine. As a result, the next beta test we run will have it set to use 1 full core + 1 GPU. Those users who have set their machines to have an extra CPU slot available for this GPU app do not need to do that anymore.
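
The rule described in that quote can be written down as a tiny Python sketch. This is only a restatement of the quoted behaviour, not the BOINC client's actual code. The function name `start_priority` and the string labels are made up for illustration.

```python
def start_priority(avg_ncpus):
    """Priority class per the rule quoted above (illustrative only).

    Less than one full core of estimated CPU use -> NORMAL priority;
    one full core or more -> IDLE priority.
    """
    return "NORMAL" if avg_ncpus < 1.0 else "IDLE"


if __name__ == "__main__":
    # A fractional CPU estimate versus the planned 1 full core + 1 GPU.
    print(start_priority(0.5))
    print(start_priority(1.0))
```

Under this reading, a beta that requests a full core would be started at idle priority, which is the change the techs describe.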

I've had to stop running normal tasks because our electric rates are outrageous. I'm seeing the unresponsiveness/jerkiness running only 1 Beta GPU task. The interference starts immediately and ends at around 99.5%.

edit: system is 2600K, 8GB RAM, and a 5450.
----------------------------------------

----------------------------------------
[Edit 1 times, last edit by pirogue at Jul 12, 2012 7:00:28 PM]
[Jul 12, 2012 6:58:19 PM]
BSD
Senior Cruncher
Joined: Apr 27, 2011
Post Count: 224
Re: HCC GPU Beta (July 9, 2012)

@knreed

Are these values too low? Should they be at least 256?

CL_DEVICE_MAX_WORK_ITEM_SIZES: 128 / 128 / 128
CL_DEVICE_MAX_WORK_GROUP_SIZE: 128

My Cedar GPU is getting "Invalid" results when validated/compared with wingmen that have a setting of 256.
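
As a rough illustration of why these limits matter: in OpenCL, a kernel launch's local work size has to fit within each per-dimension limit (CL_DEVICE_MAX_WORK_ITEM_SIZES) and its total has to fit within CL_DEVICE_MAX_WORK_GROUP_SIZE. The Python sketch below only does that arithmetic with the values reported above. The helper name `local_size_fits` is hypothetical, and whether the HCC kernel actually requests a work group of 256 is an assumption, not something confirmed in this thread.

```python
from math import prod


def local_size_fits(local_size, max_work_item_sizes, max_work_group_size):
    """Check a requested local work size against the device limits.

    local_size and max_work_item_sizes are tuples with one entry per
    dimension. Illustrative arithmetic only; no OpenCL calls are made.
    """
    if len(local_size) > len(max_work_item_sizes):
        return False
    per_dim_ok = all(
        n <= limit for n, limit in zip(local_size, max_work_item_sizes)
    )
    return per_dim_ok and prod(local_size) <= max_work_group_size


if __name__ == "__main__":
    cedar_item_sizes = (128, 128, 128)  # values reported in this post
    cedar_group_size = 128
    print(local_size_fits((256,), cedar_item_sizes, cedar_group_size))
    print(local_size_fits((128,), cedar_item_sizes, cedar_group_size))
```

With these limits a work group of 256 would not fit on this card, while 128 would. That alone does not explain an "Invalid" result, but it is the kind of difference worth ruling out.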
[Jul 12, 2012 6:58:28 PM]
Jim1348
Veteran Cruncher
USA
Joined: Jul 13, 2009
Post Count: 1066
Re: HCC GPU Beta (July 9, 2012)

Despite leaving one processor free (which sounds like it should not be a requirement if I understood correctly) I still failed with RC232 ...
Sometimes the memory can go bad in a graphics card. They recommend MemtestCL on the Folding forum:
https://simtk.org/home/memtest

Also, it wouldn't hurt to do a memtest on your main memory. But your temps are fine, and it doesn't appear that your card is factory-overclocked either, from what I can tell. (Sometimes people think that just because the card manufacturer overclocks it, it is safe, but it is the chip specifications that count for safe-crunching purposes.)
[Jul 12, 2012 6:58:45 PM]
nanoprobe
Master Cruncher
Classified
Joined: Aug 29, 2008
Post Count: 2998
Re: HCC GPU Beta (July 9, 2012)

I can't remote into that machine from work. I'll post it when I get home. Now there is another issue. The completed tasks stopped uploading. The message tab says Maintenance underway: file uploads are temporarily disabled.

That's because of this http://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,33448#384681

Thanks, Bill. Missed that.
[Jul 12, 2012 6:59:05 PM]
Bearcat
Master Cruncher
USA
Joined: Jan 6, 2007
Post Count: 2803
Re: HCC GPU Beta (July 9, 2012)

We have been discussing some of the observed items with David Anderson and there is a fact that we didn't understand. If the estimated CPU use for a GPU application version is less than 1 full core, then the BOINC client starts the research application at normal priority. Given the 30-45 seconds at the start and end of each workunit, this can interfere with user interaction on their machine. As a result, the next beta test we run will have it set to use 1 full core + 1 GPU. Those users who have set their machines to have an extra CPU slot available for this GPU app do not need to do that anymore.

I thought about asking if there is a way you can send betas only to systems that have at least one thread free. But seeing folks who do leave one free still getting errors kills my theory. Must be something else then. My ATI 6950 was flashed to become a 6970 and still didn't have issues. Is it specific cards producing errors? Are the cards severely OC'ed, causing errors? Definitely weird.
----------------------------------------
Crunching for humanity since 2007!

[Jul 12, 2012 7:20:14 PM]