Total posts in this thread: 30
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: getting a lot of 195 errors on Clean Energy

Coleslaw, OK on the hard disk strain, I understand it, but I'm not talking about laptops with cheap Hitachi disks; these are more or less powerful desktops. I'm also sure that it is a machine problem, since my other machines crunch CEP2 with no problems.

For the time being I excluded that machine from CEP WUs until I find something certain about it.
[Aug 13, 2011 1:03:17 PM]
Coleslaw
Veteran Cruncher
USA
Joined: Mar 29, 2007
Post Count: 1343
Re: getting a lot of 195 errors on Clean Energy

My example was merely a laptop. I have had similar problems on desktops as well. Swapping the hard drives fixed those systems. So please don't discard it based on my example.
[Aug 13, 2011 2:55:10 PM]
mclaver
Veteran Cruncher
Joined: Dec 19, 2005
Post Count: 566
Re: getting a lot of 195 errors on Clean Energy

I do not think this is a machine problem, although I have 20 machines and it is now only happening on 2. The two machines are relatively new: one is an i7 950 and one is an AMD 1100, with relatively new 320 GB Seagate disk drives. One is running Win 7 and one has Ubuntu. All my machines are now dedicated to CEP, but on these two machines I have received over 300 errors since yesterday. I have reset both machines to see if that fixes the problem. Here is the error I am getting.

Result Log

Result Name: E202933_ 266_ C.27.C21H10N2OS2Se.00461683.3.set1d06_ 0--
<core_client_version>6.10.59</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[20:12:20] Number of jobs = 16
[20:12:20] Starting job 0,CPU time has been restored to 0.000000.
[20:12:20] Starting new Job
Application exited with RC = 0x100
[ERROR] Failed to open either source or destination files while copying C.27.C21H10N2OS2Se.00461683.3.noopt.bp86.sto6g.n.sp/53.0 to C.27.C21H10N2OS2Se.00461683.3.noopt.bp86.sto6g.n.sp.53.0. Error: 2
[20:12:22] Finished Job #0
20:12:22 (21647): called boinc_finish

</stderr_txt>
]]>
[Aug 13, 2011 3:19:45 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: getting a lot of 195 errors on Clean Energy

mclaver, you may have hit the operational envelope limits. Remember, the default is 1 per machine, with the possibility to raise it in steps of 1 up to 16, and then unlimited. I suggest you slow down CEP2, to say nothing of overclocking, of course. 4 or more rattling away is very serious I/O.

Mind you, I have been running CEP2 error-free here on Ubuntu for a long, long time. My cache is set to mix with CW [10 CEP2 allowed on a 1 day cache], but at times, just when work is available in the feeder, it does happen that all 4 cores run this science concurrently, and it never fails then either. It's just that the efficiency at 4 starts to drop by multiple percentage points: at 3 of 4 cores I get 96-97%, at 4 I get 92-93%.
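To put rough numbers on that trade-off, here is a small Python sketch. It assumes "efficiency" means CPU time divided by elapsed time per task, and it uses the midpoints of the ranges quoted above; both are assumptions made for illustration, not measurements.

```python
# Rough throughput comparison for N concurrent CEP2 tasks.
# ASSUMPTION: efficiency = CPU time / elapsed time per task, using the
# midpoints of the quoted ranges (96-97% at 3 tasks, 92-93% at 4 tasks).

def effective_cores(concurrent: int, efficiency: float) -> float:
    """Core-equivalents of useful CEP2 work done by `concurrent` tasks."""
    return concurrent * efficiency

three = effective_cores(3, 0.965)
four = effective_cores(4, 0.925)

print(f"3 concurrent: {three:.2f} core-equivalents")
print(f"4 concurrent: {four:.2f} core-equivalents")
print(f"gain from the 4th task: {four - three:.2f}")
```

Under these assumptions the fourth task adds roughly 0.8 of a core rather than a full one. This counts CEP2 only; with a mixed cache the fourth core would be doing other work in the 3-task case, so the real comparison depends on what else is running.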

--//--
[Aug 13, 2011 3:32:47 PM]
Coleslaw
Veteran Cruncher
USA
Joined: Mar 29, 2007
Post Count: 1343
Re: getting a lot of 195 errors on Clean Energy

I have also found better results running a second hard drive, with the BOINC data directory on the spare drive, separate from the OS. Especially if you do other hard-drive-intensive tasks like BitTorrent.
[Aug 13, 2011 6:56:50 PM]
mclaver
Veteran Cruncher
Joined: Dec 19, 2005
Post Count: 566
Re: getting a lot of 195 errors on Clean Energy

Since I have over 30 years of contribution for every active project, and I had only 25 years for CEP2, I thought I would direct all of my processing to CEP2 and remove the restriction on the number of active tasks. I know the penalty I will pay in lost efficiency. I have 6 Ubuntu machines and 14 Windows machines of various flavors, and am only having issues on the AMD 1100. Not sure if there is something unique about that machine.

With my mix of machines I have 2, 4, 6, 8, and 12 active tasks with Hyper-Threading. Some machines have SSD, RAID 0 and RAID 10. I would think SSD and RAID would help with the I/O. The i7 990 and i7 980, with 12 active tasks, do not seem to have a problem, although I know I am paying a penalty.

If I set CEP2 for a maximum of 3, will that improve efficiency? Can I go higher with SSD or RAID with striping? I have a 20 Mb Internet connection, so that is not a problem.

What does error code 195 actually mean?

Result Log

Result Name: E202945_ 621_ A.27.C20H13N3OS2Si.351.4.set1d06_ 0--
<core_client_version>6.10.59</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[08:15:44] Number of jobs = 16
[08:15:44] Starting job 0,CPU time has been restored to 0.000000.
[08:15:44] Starting new Job
Application exited with RC = 0x100
[ERROR] Failed to open either source or destination files while copying A.27.C20H13N3OS2Si.351.4.noopt.bp86.sto6g.n.sp/53.0 to A.27.C20H13N3OS2Si.351.4.noopt.bp86.sto6g.n.sp.53.0. Error: 2
[08:15:48] Finished Job #0
08:15:48 (6189): called boinc_finish

</stderr_txt>
]]>
[Aug 14, 2011 10:41:21 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: getting a lot of 195 errors on Clean Energy

I don't know exactly, but it's without the minus sign, so it's something system related. "Failed to open either source or destination files while copying" suggests I/O problems. A CEP2 task has some 6600 files in the slot, so multiply that by the number of concurrent tasks and you'll know the potential strain.
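For what it's worth, the three numbers in "process exited with code 195 (0xc3, -61)" are the same byte written three ways: decimal, hex, and signed 8-bit. A minimal Python sketch of that conversion follows. The names in KNOWN_CODES are recalled from BOINC's lib/error_numbers.h and should be treated as an assumption to verify against the source for your client version, not as an authoritative list.

```python
# Decode an exit code the way the result log prints it:
# "process exited with code 195 (0xc3, -61)".

# ASSUMPTION: names recalled from BOINC's lib/error_numbers.h; verify them
# against the source for your client version before relying on them.
KNOWN_CODES = {
    193: "EXIT_SIGNAL",        # task was terminated by a signal, e.g. SIGSEGV
    195: "EXIT_CHILD_FAILED",  # a child process of the task reported failure
}

def to_signed_byte(code: int) -> int:
    """Reinterpret the low 8 bits of an exit code as a signed byte."""
    low = code & 0xFF
    return low - 256 if low >= 128 else low

def describe(code: int) -> str:
    """Format an exit code as decimal, hex and signed byte, plus a name if known."""
    name = KNOWN_CODES.get(code, "unknown")
    return f"{code} ({hex(code)}, {to_signed_byte(code)}) {name}"

if __name__ == "__main__":
    for code in (195, 193):
        print(describe(code))
```

If the names above are right, 195 would only say that something the task launched failed, which fits the "Application exited with RC = 0x100" line in the log; the copy error on the following line is the more specific clue.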

As I noted, my quad core Linux box at stock 2.4 GHz has 10 in cache, the rest CW, and that gives on average 3 concurrent. There's no hard guideline for what can be done with each device. Many discussions have gone before. Do visit the optimization sheet that cleanenergy published for the best settings.
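The file-count strain mentioned above is easy to tabulate. A quick Python sketch, taking the "some 6600 files" per slot figure at face value (it is an approximate number from this thread, not a measured one) and using task counts that have come up in the discussion:

```python
# Approximate number of files live in the BOINC slot directories at once.
FILES_PER_CEP2_SLOT = 6600  # approximate figure quoted in this thread

def files_in_flight(concurrent_tasks: int) -> int:
    """Total files across all slots for the given number of CEP2 tasks."""
    return FILES_PER_CEP2_SLOT * concurrent_tasks

for n in (1, 3, 4, 8, 12):
    print(f"{n:>2} concurrent tasks -> ~{files_in_flight(n):,} files")
```

At 12 concurrent tasks that is on the order of 79,000 files being created, copied and deleted on one disk, which gives a feel for why the per-machine limit exists.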

BTW, this project will run at least until the end of next year... no need to let it scream, the computers that is. The desired number of concurrent tasks is most easily achieved by setting a very low cache, then setting for instance 4 CEP2 for the Quad/HT device. 50% is a good number, and with the number of cores at your disposal you'll be there in no time.

--//--
[Aug 15, 2011 12:47:01 AM]
mclaver
Veteran Cruncher
Joined: Dec 19, 2005
Post Count: 566
Re: getting a lot of 195 errors on Clean Energy

It looks like my problem may not be CEP2. I changed the device profile for my AMD 1100 to not process CEP2. It looks like it is now only running HFCC, and those are all failing with a computation error.

It is an AMD 1100 running Ubuntu 11.04, BOINC 6.10.50. It has been running fine since 5/22/2011 and now it is getting computation errors. My other Ubuntu machines seem to be working fine.

Result Log

Result Name: HFCC_ target-7_ 00017205_ target-7_ 0000_ 0--
<core_client_version>6.10.59</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
INFO:[20:51:29] Start AutoGrid...

autogrid: autogrid4: Successful Completion.
INFO:[20:52:12] End AutoGrid...
Beginning AutoDock...
INFO: Setting num_generations: 27000
_maxGenSeenSoFar changed: 6750
About to enter main loop...(dockings already completed: 0)
Updating Best Energy for WU: 0.00
Finished Docking number 0
Updating Best Energy for WU: -4.70
Finished Docking number 1
Updating Best Energy for WU: -5.07
Finished Docking number 2
Updating Best Energy for WU: -5.27
Finished Docking number 3
Updating Best Energy for WU: -5.45
Finished Docking number 4
Updating Best Energy for WU: -5.46
Finished Docking number 5
Finished Docking number 6
SIGSEGV: segmentation violation
Stack trace (13 frames):
[0x80b3d1b]
[0x811ab78]
[0xf77f3400]
[0x80522f2]
[0x807158c]
[0x8058dcb]
[0x8059361]
[0x804ac45]
[0x808c898]
[0x80a5d69]
[0x80a6be3]
[0x811cc7a]
[0x8048131]

Exiting...

</stderr_txt>
]]>
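With two different failures now in this thread (the CEP2 copy error and this HFCC segfault), a small script can help sort saved result logs by symptom. This is only a sketch: the function name and the two symptom labels are hypothetical, the patterns are matched against the exact strings that appear in the logs posted here, and it is not a WCG or BOINC tool.

```python
import re

# Patterns taken from the log text posted in this thread.
EXIT_RE = re.compile(r"process exited with code (\d+)")
SYMPTOMS = {
    "file copy failed": re.compile(r"Failed to open either source or destination"),
    "segfault": re.compile(r"SIGSEGV"),
}

def triage(log_text: str) -> dict:
    """Return the exit code (or None) and the symptoms found in one result log."""
    match = EXIT_RE.search(log_text)
    found = [name for name, rx in SYMPTOMS.items() if rx.search(log_text)]
    return {"exit_code": int(match.group(1)) if match else None, "symptoms": found}

if __name__ == "__main__":
    sample = (
        "process exited with code 193 (0xc1, -63)\n"
        "Finished Docking number 6\n"
        "SIGSEGV: segmentation violation\n"
    )
    print(triage(sample))
```

A machine whose logs show both symptoms across unrelated sciences points more toward the host than toward any one project.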
[Aug 15, 2011 2:14:35 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: getting a lot of 195 errors on Clean Energy

? Same post with an 11 hour interval. You can still delete the duplicate, as I have now replied to the one above your 2nd last.

Well, on the device being to blame... if out of nowhere there is 1 report and no one else encounters it and posts on the forums within half a day, it's very likely a local thing. Diagnostics & de-dusting... the last thing first. Reseating memory and plugs might already do the trick. Even intermittent keyboards can do the strangest things to computers.

--//--
[Aug 15, 2011 1:39:39 PM]