Thread Status: Active
Total posts in this thread: 38
This topic has been viewed 4431 times and has 37 replies
wplachy
Senior Cruncher
Joined: Sep 4, 2007
Post Count: 423
Status: Offline
Seeing Code 231 and 232 HCC-GPU 7.08 Errors

I have a single Linux machine running HCC-GPU 7.08 with GFAM-CPU that had 9 HCC-GPU tasks Error. In all cases at least 1 wingman also Errored. The messages are code 231 - ERROR: VerifyGPU.cpp:114 Device not found. and 232 - ERROR: VerifyGPU.cpp:65 Unknown.

The device has a number of Validated and PV HCC-GPU and GFAM tasks as well.

Anyone else running into this?

OS: Linux (Mint 13)
Client: 7.0.36
GPU: AMD/ATI

Result Status
-------------
X0960083380015200702091440_ 3-- 708 Valid 12/8/12 15:35:47 12/8/12 15:53:17 0.04 64.3 / 75.2
X0960083380015200702091440_ 2-- 708 Error 12/8/12 15:29:28 12/8/12 15:35:33 0.00 62.6 / 0.0 (code 231 (0xe7, -25))
X0960083380015200702091440_ 1-- 708 Error 12/8/12 10:49:38 12/8/12 15:23:12 0.00 62.6 / 0.0 (process exited with code 232 (0xe8, -24)) <---Mine
X0960083380015200702091440_ 0-- 708 Valid 12/8/12 10:49:28 12/8/12 11:07:40 0.05 86.1 / 75.2

X0960083380019200702091440_ 3-- - In Progress 12/8/12 15:43:13 12/11/12 10:55:13 0.00 0.0 / 0.0
X0960083380019200702091440_ 2-- 708 Error 12/8/12 15:35:34 12/8/12 15:40:57 0.00 62.6 / 0.0 (code 231 (0xe7, -25))
X0960083380019200702091440_ 0-- 708 Error 12/8/12 10:49:38 12/8/12 15:34:39 0.00 62.6 / 0.0 (process exited with code 232 (0xe8, -24)) <---Mine
X0960083380019200702091440_ 1-- 708 Pending Validation 12/8/12 10:49:28 12/8/12 11:08:14 0.05 79.3 / 0.0

X0900083390709200702221725_ 4-- 708 Pending Validation 12/8/12 05:50:11 12/8/12 10:47:58 0.05 62.6 / 0.0
X0900083390709200702221725_ 3-- - In Progress 12/8/12 05:04:52 12/11/12 00:16:52 0.00 0.0 / 0.0
X0900083390709200702221725_ 2-- 708 Error 12/8/12 03:33:48 12/8/12 04:57:56 0.00 62.4 / 0.0 (process exited with code 232 (0xe8, -24))
X0900083390709200702221725_ 0-- 708 Error 12/8/12 01:31:48 12/8/12 05:46:32 0.00 62.4 / 0.0 (process exited with code 232 (0xe8, -24)) <---Mine
X0900083390709200702221725_ 1-- 708 Error 12/8/12 01:31:38 12/8/12 01:46:21 0.00 62.4 / 0.0 (code 231 (0xe7, -25))

X0900083390707200702221725_ 4-- 708 Pending Validation 12/8/12 05:50:11 12/8/12 10:47:58 0.05 62.6 / 0.0
X0900083390707200702221725_ 3-- - In Progress 12/8/12 05:04:52 12/11/12 00:16:52 0.00 0.0 / 0.0
X0900083390707200702221725_ 2-- 708 Error 12/8/12 03:33:51 12/8/12 04:57:56 0.00 62.4 / 0.0 (process exited with code 232 (0xe8, -24))
X0900083390707200702221725_ 1-- 708 Error 12/8/12 01:31:48 12/8/12 05:46:32 0.00 62.4 / 0.0 (process exited with code 232 (0xe8, -24)) <---Mine
X0900083390707200702221725_ 0-- 708 Error 12/8/12 01:31:38 12/8/12 01:46:21 0.00 62.4 / 0.0 (code 231 (0xe7, -25))

X0900083390710200702221725_ 4-- 708 Pending Validation 12/8/12 06:06:33 12/8/12 06:38:53 0.04 62.5 / 0.0
X0900083390710200702221725_ 3-- - In Progress 12/8/12 05:04:52 12/11/12 00:16:52 0.00 0.0 / 0.0
X0900083390710200702221725_ 2-- 708 Error 12/8/12 03:33:51 12/8/12 04:57:56 0.00 62.4 / 0.0 (process exited with code 232 (0xe8, -24))
X0900083390710200702221725_ 0-- 708 Error 12/8/12 01:31:48 12/8/12 06:05:21 0.00 62.5 / 0.0 (process exited with code 232 (0xe8, -24)) <---Mine
X0900083390710200702221725_ 1-- 708 Error 12/8/12 01:31:38 12/8/12 01:46:21 0.00 62.4 / 0.0 (code 231 (0xe7, -25))

X0900083390711200702221725_ 4-- 708 Pending Validation 12/8/12 05:50:11 12/8/12 10:47:58 0.05 62.6 / 0.0
X0900083390711200702221725_ 3-- - In Progress 12/8/12 05:04:52 12/11/12 00:16:52 0.00 0.0 / 0.0
X0900083390711200702221725_ 2-- 708 Error 12/8/12 03:33:48 12/8/12 04:57:56 0.00 62.4 / 0.0 (process exited with code 232 (0xe8, -24))
X0900083390711200702221725_ 0-- 708 Error 12/8/12 01:31:48 12/8/12 05:46:32 0.00 62.4 / 0.0 (process exited with code 232 (0xe8, -24)) <---Mine
X0900083390711200702221725_ 1-- 708 Error 12/8/12 01:31:38 12/8/12 01:46:21 0.00 62.4 / 0.0 (code 231 (0xe7, -25))

X0900083390719200702221725_ 4-- 708 Pending Validation 12/8/12 06:06:33 12/8/12 06:31:14 0.04 62.4 / 0.0
X0900083390719200702221725_ 3-- - In Progress 12/8/12 05:04:52 12/11/12 00:16:52 0.00 0.0 / 0.0
X0900083390719200702221725_ 2-- 708 Error 12/8/12 03:33:49 12/8/12 04:57:56 0.00 62.4 / 0.0 (process exited with code 232 (0xe8, -24))
X0900083390719200702221725_ 0-- 708 Error 12/8/12 01:31:48 12/8/12 06:05:21 0.00 62.5 / 0.0 (process exited with code 232 (0xe8, -24)) <---Mine
X0900083390719200702221725_ 1-- 708 Error 12/8/12 01:31:38 12/8/12 01:46:21 0.00 62.4 / 0.0 (code 231 (0xe7, -25))

X0900083390713200702221725_ 4-- 708 Pending Validation 12/8/12 05:56:42 12/8/12 07:35:12 0.03 57.9 / 0.0
X0900083390713200702221725_ 3-- - In Progress 12/8/12 05:04:52 12/11/12 00:16:52 0.00 0.0 / 0.0
X0900083390713200702221725_ 2-- 708 Error 12/8/12 03:33:48 12/8/12 04:57:56 0.00 62.4 / 0.0 (process exited with code 232 (0xe8, -24))
X0900083390713200702221725_ 1-- 708 Error 12/8/12 01:31:48 12/8/12 05:55:00 0.00 62.5 / 0.0 (process exited with code 232 (0xe8, -24)) <---Mine
X0900083390713200702221725_ 0-- 708 Error 12/8/12 01:31:38 12/8/12 01:46:21 0.00 62.4 / 0.0 (code 231 (0xe7, -25))

X0900083390720200702221725_ 4-- 708 Pending Validation 12/8/12 05:56:43 12/8/12 07:35:12 0.03 61.3 / 0.0
X0900083390720200702221725_ 3-- - In Progress 12/8/12 05:04:52 12/11/12 00:16:52 0.00 0.0 / 0.0
X0900083390720200702221725_ 2-- 708 Error 12/8/12 03:33:51 12/8/12 04:57:56 0.00 62.4 / 0.0 (process exited with code 232 (0xe8, -24))
X0900083390720200702221725_ 1-- 708 Error 12/8/12 01:31:48 12/8/12 05:54:59 0.00 62.5 / 0.0 (process exited with code 232 (0xe8, -24)) <---Mine
X0900083390720200702221725_ 0-- 708 Error 12/8/12 01:31:38 12/8/12 01:46:21 0.00 62.4 / 0.0 (code 231 (0xe7, -25))
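
To tally a result list like the one above without counting by hand, a short script can group rows by status and exit code. This is a minimal sketch, not a WCG or BOINC tool: the column layout is inferred only from the rows pasted in this post, and `LINE_RE`, `CODE_RE`, and `tally` are hypothetical names.

```python
import re
from collections import Counter

# Assumed layout: <work unit>_ <replica>-- <app version or "-"> <status> <sent date> ...
LINE_RE = re.compile(
    r"^(?P<wu>\S+)_\s+(?P<replica>\d+)--\s+(?P<version>\S+)\s+"
    r"(?P<status>.+?)\s+\d{1,2}/\d{1,2}/\d{2}\s"
)
# Matches both "(code 231 (0xe7, -25))" and "(process exited with code 232 (0xe8, -24))".
CODE_RE = re.compile(r"code (\d+) \(0x[0-9a-f]+, -?\d+\)")


def tally(lines):
    """Count rows per status, splitting Error rows by exit code."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip blank lines and anything that is not a result row
        status = m.group("status")
        code = CODE_RE.search(line)
        counts[f"{status} {code.group(1)}" if code else status] += 1
    return counts


# Rows copied from the list above.
SAMPLE = """\
X0960083380015200702091440_ 3-- 708 Valid 12/8/12 15:35:47 12/8/12 15:53:17 0.04 64.3 / 75.2
X0960083380015200702091440_ 2-- 708 Error 12/8/12 15:29:28 12/8/12 15:35:33 0.00 62.6 / 0.0 (code 231 (0xe7, -25))
X0960083380015200702091440_ 1-- 708 Error 12/8/12 10:49:38 12/8/12 15:23:12 0.00 62.6 / 0.0 (process exited with code 232 (0xe8, -24)) <---Mine
X0960083380015200702091440_ 0-- 708 Valid 12/8/12 10:49:28 12/8/12 11:07:40 0.05 86.1 / 75.2
X0960083380019200702091440_ 3-- - In Progress 12/8/12 15:43:13 12/11/12 10:55:13 0.00 0.0 / 0.0
"""

if __name__ == "__main__":
    for key, n in sorted(tally(SAMPLE.splitlines()).items()):
        print(f"{key}: {n}")
```

Feeding the whole list through `tally` separates the 231 and 232 failures, which makes it easier to see which code each wingman hit.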

stderr_txt snip
---------------

<message>
process exited with code 231 (0xe7, -25)
</message>
<stderr_txt>
...
Processing jobdescription
Number of Images defined in image list is 2
No protocol specified
Found compute platform Advanced Micro Devices, Inc.
Selecting this platform
ERROR: VerifyGPU.cpp:114 Device not found.
03:31:45 (22568): called boinc_finish

</stderr_txt>

and

<message>
process exited with code 232 (0xe8, -24)
</message>
<stderr_txt>
...
Processing jobdescription
Number of Images defined in image list is 2
ERROR: VerifyGPU.cpp:65 Unknown
23:50:53 (19580): called boinc_finish

</stderr_txt>
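
The three numbers in each exit message appear to be one 8-bit value shown three ways: unsigned decimal, hexadecimal, and signed decimal. The sketch below reproduces that arithmetic for the two codes in this thread. `describe_exit_code` is a hypothetical helper, and reading the last number as a signed byte is an assumption inferred from the values themselves.

```python
def describe_exit_code(code: int) -> str:
    """Format an 8-bit exit status as 'decimal (hex, signed byte)'."""
    if not 0 <= code <= 255:
        raise ValueError("expected an 8-bit exit status")
    # Assumption: values above 127 wrap to negative when read as a signed byte.
    signed = code - 256 if code > 127 else code
    return f"{code} (0x{code:x}, {signed})"


if __name__ == "__main__":
    for code in (231, 232):
        print("process exited with code", describe_exit_code(code))
```

So "-25" and "-24" are not separate error numbers; they are the same statuses as 231 and 232.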
----------------------------------------
Bill P

[Dec 8, 2012 4:52:04 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Seeing Code 231 and 232 HCC-GPU 7.08 Errors

The same "ERROR: VerifyGPU.cpp:114 Device not found." was reported the day before yesterday in another thread. In fact, when I searched to see whether this had been seen before, I also found "ERROR: VerifyGPU.cpp:65 Unknown" in a few reports, and given that this is Linux, it is no longer exclusive to Windows. I have not looked at whether there is an ATI-only correlation or whether these errors also show up on NVidia GPUs.
[Dec 8, 2012 5:02:32 PM]
kateiacy
Veteran Cruncher
USA
Joined: Jan 23, 2010
Post Count: 1027
Status: Offline
Re: Seeing Code 231 and 232 HCC-GPU 7.08 Errors

Yes, I too am getting the 232 errors (20 of them this morning) on an ATI 7750 running Xubuntu 11.10 and BOINC 7.0.27.
A typical log is below.

I recall getting one such error during the beta test out of 120+ results returned, but I'm getting 30% or more errors today.



Result Log

Result Name: X0930083420661200702091641_ 1--



<core_client_version>7.0.27</core_client_version>
<![CDATA[
<message>
process exited with code 232 (0xe8, -24)
</message>
<stderr_txt>
Commandline: ../../projects/www.worldcommunitygrid.org/wcg_hcc1_img_7.08_i686-pc-linux-gnu__ati_hcc1 --zipfile X0930083420661200702091641.zip --imagelist images.txt --device 0
<app_init_data>
<major_version>7</major_version>
<minor_version>0</minor_version>
<release>27</release>
<app_version>708</app_version>
<app_name>hcc1</app_name>
<project_preferences>


<color_scheme>Tahiti Sunset</color_scheme>
<max_frames_sec>7</max_frames_sec>
<max_gfx_cpu_pct>5.0</max_gfx_cpu_pct>
</project_preferences>

<project_dir>/var/lib/boinc-client/projects/www.worldcommunitygrid.org</project_dir>
<boinc_dir>/var/lib/boinc-client</boinc_dir>
<wu_name>X0930083420661200702091641</wu_name>
<result_name>X0930083420661200702091641_1</result_name>
<shm_key>-1</shm_key>
<slot>1</slot>
<wu_cpu_time>0.000000</wu_cpu_time>
<starting_elapsed_time>0.000000</starting_elapsed_time>
<using_sandbox>0</using_sandbox>
<user_total_credit>3051757.045270</user_total_credit>
<user_expavg_credit>11402.387918</user_expavg_credit>
<host_total_credit>574672.404980</host_total_credit>
<host_expavg_credit>5571.473373</host_expavg_credit>
<resource_share_fraction>1.000000</resource_share_fraction>
<checkpoint_period>60.000000</checkpoint_period>
<fraction_done_start>0.000000</fraction_done_start>
<fraction_done_end>1.000000</fraction_done_end>
<gpu_type>ATI</gpu_type>
<gpu_device_num>0</gpu_device_num>
<gpu_opencl_dev_index>0</gpu_opencl_dev_index>
<ncpus>1.000000</ncpus>
<rsc_fpops_est>25635091809478.000000</rsc_fpops_est>
<rsc_fpops_bound>512701836189560.000000</rsc_fpops_bound>
<rsc_memory_bound>78643200.000000</rsc_memory_bound>
<rsc_disk_bound>50000000.000000</rsc_disk_bound>
<computation_deadline>1355534489.000000</computation_deadline>
</app_init_data>
INFO: gpu_type set in init_data.xml to ATI
INFO: gpu_device_num set in init_data.xml to 0
Boinc requested ATI gpu device number0
Unzipping input images ../../projects/www.worldcommunitygrid.org/X0930083420661200702091641_X0930083420661200702091641.zip
Processing jobdescription
Number of Images defined in image list is 2
ERROR: VerifyGPU.cpp:65 Unknown
12:23:11 (10756): called boinc_finish

</stderr_txt>
]]>
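
When comparing many logs like this one, it can help to pull out just the `ERROR:` line and note whether the app got as far as finding an OpenCL platform. The following is a minimal sketch that relies only on the message shapes visible in this thread; `ERR_RE` and `summarize` are hypothetical names, not part of BOINC or the HCC app.

```python
import re

# Shape seen in the logs here: "ERROR: <source file>:<line> <message>"
ERR_RE = re.compile(r"^ERROR: (?P<file>\S+):(?P<line>\d+) (?P<msg>.*)$", re.M)


def summarize(stderr_text: str) -> dict:
    """Extract the first ERROR line and whether a compute platform was reported."""
    m = ERR_RE.search(stderr_text)
    error = (m.group("file"), int(m.group("line")), m.group("msg").strip()) if m else None
    return {
        "platform_found": "Found compute platform" in stderr_text,
        "error": error,
    }


# Excerpts copied from the logs posted in this thread.
LOG_231 = """Processing jobdescription
Number of Images defined in image list is 2
No protocol specified
Found compute platform Advanced Micro Devices, Inc.
Selecting this platform
ERROR: VerifyGPU.cpp:114 Device not found.
03:31:45 (22568): called boinc_finish
"""

LOG_232 = """Processing jobdescription
Number of Images defined in image list is 2
ERROR: VerifyGPU.cpp:65 Unknown
12:23:11 (10756): called boinc_finish
"""

if __name__ == "__main__":
    print(summarize(LOG_231))
    print(summarize(LOG_232))
```

On the logs posted so far, the code 231 case reports the AMD platform before the device lookup fails, while the code 232 case stops before any platform is listed.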
----------------------------------------

[Dec 8, 2012 7:23:04 PM]
wplachy
Senior Cruncher
Joined: Sep 4, 2007
Post Count: 423
Status: Offline
Re: Seeing Code 231 and 232 HCC-GPU 7.08 Errors

Yes, I too am getting the 232 errors (20 of them this morning) on an ATI 7750 running Xubuntu 11.10 and BOINC 7.0.27.
A typical log is below.

I recall getting one such error during the beta test out of 120+ results returned, but I'm getting 30% or more errors today.

...

I've completed 155 on the one machine today with 14 Invalid (code 232), about 9%. I just setup another Linux box and started processing HCC-GPU WUs on it to see if the problem carries over to the new box.

Looking at the sent times on the Invalids they came in 2 groups, 11 and then 3 and were all in sequence without any Valids in between.

I had 200 Betas that were all valid.
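
The two observations above (an error rate of about 9%, and failures arriving in unbroken groups) can be checked from an ordered list of outcomes. This is only a sketch: the totals (155 results, 14 errors, groups of 11 and 3) come from this post, but the positions of the groups in the hypothetical `outcomes` list are made up for illustration, and `error_runs` is a hypothetical helper.

```python
from itertools import groupby


def error_runs(outcomes):
    """Return the lengths of consecutive runs of 'Error' in sent order."""
    return [
        sum(1 for _ in group)
        for is_error, group in groupby(outcomes, key=lambda o: o == "Error")
        if is_error
    ]


# Hypothetical ordering: only the counts match the post, not the positions.
outcomes = ["Valid"] * 60 + ["Error"] * 11 + ["Valid"] * 50 + ["Error"] * 3 + ["Valid"] * 31

if __name__ == "__main__":
    errors = outcomes.count("Error")
    print(f"{errors} of {len(outcomes)} failed ({errors / len(outcomes):.1%})")
    print("consecutive error runs:", error_runs(outcomes))
```

Failures that cluster into a few long runs, rather than being scattered singly, suggest the card or driver was in a bad state for a stretch of time rather than failing at random per task.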
----------------------------------------
Bill P

[Dec 8, 2012 8:51:33 PM]
mmstick
Senior Cruncher
Joined: Aug 19, 2010
Post Count: 151
Status: Offline
Re: Seeing Code 231 and 232 HCC-GPU 7.08 Errors

Once these problems get resolved, it would be nice to see how it compares to running in Linux. I figure that the state of OpenCL drivers in Linux is still worse than in Windows, although Linux does deliver much faster computation on the CPU.
[Dec 10, 2012 11:49:34 PM]
kateiacy
Veteran Cruncher
USA
Joined: Jan 23, 2010
Post Count: 1027
Status: Offline
Re: Seeing Code 231 and 232 HCC-GPU 7.08 Errors

Bill P, do you have an update? Have you resolved the error problem on your first Linux machine, and what happened with the second one?

I'm still getting errors in batches. I'm trying to think of anything that changed on my machine between the almost error-free beta run and the start of the regular HCC GPU Linux WUs. There was at least one kernel update, and I was running a different mix of WCG CPU sciences. I've played with changing which CPU sciences are running, and that hasn't solved the problem for me. I may have to give up on running HCC GPU. :(
----------------------------------------

[Dec 11, 2012 11:31:41 PM]
wplachy
Senior Cruncher
Joined: Sep 4, 2007
Post Count: 423
Status: Offline
Re: Seeing Code 231 and 232 HCC-GPU 7.08 Errors

Bill P, do you have an update? Have you resolved the error problem on your first Linux machine, and what happened with the second one?

I'm still getting errors in batches. I'm trying to think of anything that changed on my machine between the almost error-free beta run and the start of the regular HCC GPU Linux WUs. There was at least one kernel update, and I was running a different mix of WCG CPU sciences. I've played with changing which CPU sciences are running, and that hasn't solved the problem for me. I may have to give up on running HCC GPU. :(

I have an update, although not one I like. I installed an NVidia (GTX 460) in a third Linux box, installed the "Additional" drivers, and it's been running without error for over 24 hrs. I swapped the failing AMD card (R6670) to a different Linux box and the problem followed it. I then replaced it with a GTX 465 and the errors stopped.

The FAQs point to a driver problem, which I found hard to believe since it wasn't a hard failure and only occurred about 20% of the time. Like you, I did 1 kernel update, and my Betas ran error free as well.

So short version is that NVidia runs fine on my Linux boxes and AMD has the 20-30% error rate. What I did was move my NVidia cards to my Linux boxes and put the AMDs in Windows boxes, where they are running fine.

The only thing I can think of is that the Linux AMD drivers have a problem.

OS: Ubuntu 10.4 & Mint 13
Drivers: "Additional Drivers - Current Version" as tested by Ubuntu & Mint
WU Mix: GFAM, HCC (CPU & GPU), SNTS, CEP2 & HPF
----------------------------------------
Bill P

----------------------------------------
[Edit 1 times, last edit by wplachy at Dec 12, 2012 2:20:52 AM]
[Dec 12, 2012 2:18:53 AM]
kateiacy
Veteran Cruncher
USA
Joined: Jan 23, 2010
Post Count: 1027
Status: Offline
Re: Seeing Code 231 and 232 HCC-GPU 7.08 Errors

Thanks for the update, Bill (although needless to say, it wasn't what I wanted to hear, either).

My setup: Xubuntu 12.04.1, BOINC 7.0.27, fglrx-updates driver

I have heard of people having trouble with the fglrx-updates driver; I guess I'll try stepping down to just fglrx and see if that helps.

POEM GPU WUs are still running just fine on this machine and they require OpenCL. So the problem seems to be specific to HCC GPU.
----------------------------------------

----------------------------------------
[Edit 1 times, last edit by kateiacy at Dec 12, 2012 3:08:00 AM]
[Dec 12, 2012 2:41:02 AM]
wplachy
Senior Cruncher
Joined: Sep 4, 2007
Post Count: 423
Status: Offline
Re: Seeing Code 231 and 232 HCC-GPU 7.08 Errors

...I wonder whether something went wrong with the kernel updates that the driver didn't kick in correctly. Maybe I'll try reinstalling the driver.
....

Please let me know if that helps you. I tried that b4 I started swapping cards around and it didn't make a difference.

Good luck!!
----------------------------------------
Bill P

[Dec 12, 2012 3:00:51 AM]
kateiacy
Veteran Cruncher
USA
Joined: Jan 23, 2010
Post Count: 1027
Status: Offline
Re: Seeing Code 231 and 232 HCC-GPU 7.08 Errors

...I wonder whether something went wrong with the kernel updates that the driver didn't kick in correctly. Maybe I'll try reinstalling the driver.
....

Please let me know if that helps you. I tried that b4 I started swapping cards around and it didn't make a difference.

Good luck!!


I just uninstalled fglrx-updates and installed fglrx. I'll let HCC run overnight and see what it's done by morning!
----------------------------------------

[Dec 12, 2012 3:34:11 AM]