| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 5
|
|
| Author |
|
|
[AF>Amis des Lapins] Oncle Bob
Cruncher Joined: Apr 20, 2013 Post Count: 5 Status: Offline Project Badges:
|
Hello,
For the past few days, all my CEP2 tasks have been ending in error after running for a short time. I'm running Linux Mint 17.1 with the 7.6.2 client. Here is one log:

<core_client_version>7.6.2</core_client_version>
<![CDATA[
<message>
process got signal 11
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[01:53:56] Number of jobs = 8
[01:53:56] Starting job 0,CPU time has been restored to 0.000000.
[01:53:56] Starting new Job
[01:53:56] Qink name = fldman
[01:53:57] Qink name = gesman
[01:53:59] Qink name = scfman
Parent was killed, exiting
</stderr_txt>
]]>

Other projects (Mindmodeling...) seem to be OK. I'll try some other WCG sub-projects. |
||
|
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7844 Status: Offline Project Badges:
|
The "process got signal 11" message (a segmentation fault) most likely means your computer is too busy. If you are running 8 concurrent CEP2 tasks, you have probably overloaded your disk I/O, which causes this error. Try cutting back to fewer CEP2 tasks, maybe 5 or 6 CEP2 and 2 or 3 of something else. CEP2 can be very disk intensive.
Cheers
Sgt. Joe
*Minnesota Crunchers* [Edited 1 time, last edit by Sgt.Joe at Jun 4, 2015 3:00:12 AM] |
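For anyone who wants to check what a "process got signal N" line refers to, here is a minimal Python sketch. It pulls the number out of the message text and looks it up in the standard `signal` module; on Linux, 11 maps to SIGSEGV (segmentation fault). The function name `decode_signal` is made up for this illustration and is not part of BOINC. The lookup only names the signal; it does not tell you why the task crashed.

```python
import re
import signal


def decode_signal(message):
    """Return (number, name) for a 'process got signal N' message, or None.

    decode_signal is a hypothetical helper for illustration only.
    """
    match = re.search(r"process got signal (\d+)", message)
    if match is None:
        return None
    number = int(match.group(1))
    try:
        # signal.Signals is the standard-library enum of signal numbers
        # known on the platform running this script.
        name = signal.Signals(number).name
    except ValueError:
        name = "unknown"
    return number, name


if __name__ == "__main__":
    print(decode_signal("<message> process got signal 11 </message>"))
```

Run it on the same kind of machine that produced the log, since signal numbering is platform dependent.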
||
|
|
[AF>Amis des Lapins] Oncle Bob
Cruncher Joined: Apr 20, 2013 Post Count: 5 Status: Offline Project Badges:
|
Hmm, that's a bit weird: I have been running CEP2 on all 24 threads for weeks (months?).
I'll check the HDD health. Thanks for your help. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
24 threads concurrently... yup, that is a suspected root cause. The startup phase in particular is highly demanding: the task unpacks about 6700 reference files, copies them to the task slot, and constructs the model from the given parameters. With BOINCTasks [a 3rd-party multi-client manager], you can observe the elapsed time and CPU time side by side. Even with just a few running concurrently on my 4770, I see minutes pass before actual computing begins. The more tasks that start concurrently, the tougher it becomes for storage to keep up. If you are in a position to configure a RAM drive of, say, 24-30 GB and run BOINC off that, you'd have a winner [a UPS helps to not lose anything if your area has wobbly grid power].
Also recommended is a BOINC-exclusive partition on the HDD/SSD, ideally an exclusive drive altogether with large caching capacity. Anyway, there are many threads in this forum on the do's and don'ts of optimizing ["Leave Application in Memory when suspended" is a must-take option in the BOINC preferences]. |
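One way to apply the "run fewer CEP2 tasks at once" advice from this thread is BOINC's optional `app_config.xml`, placed in the project's directory, whose `max_concurrent` element caps how many tasks of one application run at the same time. The Python sketch below only generates the text of such a file. The helper name `build_app_config` is hypothetical, and the application name `cep2` is an assumption; check the app names listed in your own `client_state.xml` before relying on it. The limit of 6 follows the "5 or 6 CEP2" suggestion earlier in the thread.

```python
from xml.sax.saxutils import escape


def build_app_config(app_name, max_concurrent):
    """Return app_config.xml text limiting one app to max_concurrent tasks.

    build_app_config is a hypothetical helper; app_name must match the
    application's short name as known to your BOINC client.
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    return (
        "<app_config>\n"
        "  <app>\n"
        f"    <name>{escape(app_name)}</name>\n"
        f"    <max_concurrent>{int(max_concurrent)}</max_concurrent>\n"
        "  </app>\n"
        "</app_config>\n"
    )


if __name__ == "__main__":
    # "cep2" is an assumed app name; 6 follows the suggestion in the thread.
    print(build_app_config("cep2", 6))
```

Save the output as `app_config.xml` in the World Community Grid project folder, then have the client re-read its config files (or restart it). With 24 threads, a cap like this also limits how many tasks can unpack their roughly 6700 files at the same moment.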
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
My relatively new SSD can handle 8 CEP2 WUs at the same time, 24/7. Beyond that, I don't know how things would go.
|
||
|
|
|