Total posts in this thread: 5
This topic has been viewed 1642 times and has 4 replies
[AF>Amis des Lapins] Oncle Bob
Cruncher
Joined: Apr 20, 2013
Post Count: 5
CEP2 - error on linux since a few days

Hello,

For the past few days, all my CEP2 tasks have been failing with an error after running for a short while. I'm running Linux Mint 17.1 with the 7.6.2 client.

Here is one log:

<core_client_version>7.6.2</core_client_version>
<![CDATA[
<message>
process got signal 11
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[01:53:56] Number of jobs = 8
[01:53:56] Starting job 0,CPU time has been restored to 0.000000.
[01:53:56] Starting new Job
[01:53:56] Qink name = fldman
[01:53:57] Qink name = gesman
[01:53:59] Qink name = scfman
Parent was killed, exiting

</stderr_txt>
]]>
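
For reference, the signal number in the log above can be turned into a name with Python's standard `signal` module. This is only a small sketch; it assumes it is run on a Linux host like the one in this report, where signal 11 maps to SIGSEGV (segmentation fault).

```python
import signal


def signal_name(number: int) -> str:
    """Return the symbolic name of a signal number on the current host."""
    return signal.Signals(number).name


if __name__ == "__main__":
    # The BOINC log reports "process got signal 11".
    print(signal_name(11))
```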



Other projects (Mindmodeling...) seem to be OK.

I'll try some other WCG sub-projects.
----------------------------------------

[Jun 4, 2015 1:15:42 AM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7844
Re: CEP2 - error on linux since a few days

"Process got signal 11" (a segmentation fault) basically means your computer is too busy. If you are running 8 concurrent CEP2 tasks, you have probably overloaded your disk I/O, causing this error. Try cutting back to fewer CEP2 tasks, maybe 5 or 6 CEP2 and 2 or 3 of something else. CEP2 can be very disk intensive.
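
One way to cap the number of concurrent CEP2 tasks is BOINC's `app_config.xml`, which accepts a `max_concurrent` value per application and lives in the project's folder inside the BOINC data directory. Below is a minimal sketch that generates such a file. The app name `cep2` and the limit of 6 are assumptions for illustration, so check the app name your own client reports before using it.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, max_concurrent: int) -> str:
    """Build the text of an app_config.xml limiting one app's concurrency."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    # One tag per line keeps the file easy to read and edit by hand.
    return (
        "<app_config>\n"
        "  <app>\n"
        f"    <name>{app_name}</name>\n"
        f"    <max_concurrent>{max_concurrent}</max_concurrent>\n"
        "  </app>\n"
        "</app_config>\n"
    )


if __name__ == "__main__":
    # "cep2" and 6 are assumed values, not taken from the thread.
    text = build_app_config("cep2", 6)
    ET.fromstring(text)  # raises ParseError if the XML is malformed
    print(text)
```

After saving the output as `app_config.xml`, the client needs to re-read its config files (or be restarted) before the limit applies.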

Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
----------------------------------------
[Edit 1 times, last edit by Sgt.Joe at Jun 4, 2015 3:00:12 AM]
[Jun 4, 2015 2:59:16 AM]
[AF>Amis des Lapins] Oncle Bob
Cruncher
Joined: Apr 20, 2013
Post Count: 5
Re: CEP2 - error on linux since a few days

Hmm, that's a bit weird; I've been running CEP2 on all 24 threads for weeks (months?).

I'll check the HDD health. Thanks for your help.
----------------------------------------

[Jun 4, 2015 12:02:38 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: CEP2 - error on linux since a few days

24 threads concurrent... yup, that is a suspected root cause. The startup phase is particularly demanding: it unpacks about 6700 reference files, copies them to the task slot, and constructs the model from the given parameters. With BOINCTasks [3rd-party multi-client manager], you can observe the Elapsed time and CPU time side by side. With just a few concurrent tasks on my 4770, I see minutes pass before actual computing begins. The more tasks start concurrently, the tougher it becomes for storage to keep up. If you are in a position to configure a RAM drive of, say, 24-30 GB and run BOINC off that, you'd have a winner [a UPS helps to not lose anything if your area has wobbly grid power].
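
The Elapsed-versus-CPU-time comparison can be boiled down to a single ratio. Here is a rough sketch: the sample numbers and the 50% threshold are made up for illustration, and the function names are hypothetical. A task whose CPU time is far below its elapsed time is mostly waiting, plausibly on storage during the unpack phase.

```python
def cpu_efficiency(cpu_seconds: float, elapsed_seconds: float) -> float:
    """Fraction of wall-clock time the task actually spent on the CPU."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return cpu_seconds / elapsed_seconds


def looks_io_bound(cpu_seconds: float, elapsed_seconds: float,
                   threshold: float = 0.5) -> bool:
    """Flag a task whose CPU share falls below an assumed threshold."""
    return cpu_efficiency(cpu_seconds, elapsed_seconds) < threshold


if __name__ == "__main__":
    # Hypothetical readings: 180 s elapsed but only 12 s of CPU time.
    ratio = cpu_efficiency(12.0, 180.0)
    print(f"CPU share: {ratio:.1%}, I/O bound: {looks_io_bound(12.0, 180.0)}")
```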

A BOINC-exclusive partition on HD/SSD is also recommended, ideally an exclusive drive altogether, with large caching capacity.

Anyway, there are many threads in this forum on the do's and don'ts of optimizing ["Leave applications in memory while suspended" is a must-take option in the BOINC preferences].
[Jun 4, 2015 12:33:23 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: CEP2 - error on linux since a few days

My relatively new-ish SSD can handle 8 CEP2 WUs at the same time, 24/7. Beyond that, I don't know how things would go.
[Jun 4, 2015 12:59:04 PM]