| World Community Grid Forums
Thread Status: Active Total posts in this thread: 9
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Most of the WCG tasks my home server runs fail, losing several hours of work each time. The error message is either "signal 11" or "too many exit(0)s". The host runs Ubuntu Server 13.04 x86_64 with 32-bit libraries installed for compatibility. Hardware: Atom D525 @ 1.8 GHz (2 cores, HT), 2 GB RAM. Memory and disk both check out fine, and tasks from all other projects (LHC, Einstein, SETI, Milkyway, Rosetta) finish as they should.
Should I just detach the host from WCG and put the time I'd otherwise lose to failed units toward other projects, or is there something I can do to fix this? |
||
|
|
Falconet
Master Cruncher Portugal Joined: Mar 9, 2009 Post Count: 3315 Status: Offline Project Badges:
|
Hi,
What WCG project are you running?
----------------------------------------
- AMD Ryzen 5 1600AF 6C/12T 3.2 GHz - 85W
- AMD Ryzen 5 2500U 4C/8T 2.0 GHz - 28W
- AMD Ryzen 7 7730U 8C/16T 3.0 GHz |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hey,
Thanks for the quick reply! I run whatever tasks WCG has to offer. The host has received and failed tasks from Human Proteome Folding, Clean Energy Project and Drug Search for Leishmaniasis. |
||
|
|
Falconet
Master Cruncher Portugal Joined: Mar 9, 2009 Post Count: 3315 Status: Offline Project Badges:
|
According to this FAQ, it could be a memory problem. Try running Memtest86. Also, I believe CEP2 may cause this error due to its I/O activity. So yes, check your memory and deselect CEP2. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Found the same FAQ and ran several passes of memtest86+ yesterday, no errors. I've also checked the 'Leave applications in memory while suspended' option as suggested. I'll try deselecting CEP2 and see if other tasks start returning normally.
Could CPU throttling have anything to do with bad results? I have it set to 65% at the moment. |
||
|
|
Falconet
Master Cruncher Portugal Joined: Mar 9, 2009 Post Count: 3315 Status: Offline Project Badges:
|
I believe so. Set it to 100%.
Also, if the "Suspend work if CPU usage is above XX % of cpu" option is enabled, set it to 0%, which disables it. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
In order to keep the temperatures at bay and leave Apache and Samba some room to operate, should I then decrease the number of cores BOINC can use?
|
||
|
|
Falconet
Master Cruncher Portugal Joined: Mar 9, 2009 Post Count: 3315 Status: Offline Project Badges:
|
Maybe. For Windows there is TThrottle, which is better for managing CPU (and GPU) temperatures. I don't know of any alternative for Linux. As for Apache and Samba, BOINC runs at idle priority, so if other applications need CPU time, BOINC will yield it to them. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Thanks for all the information! I removed the throttling and the suspend-on-CPU-usage setting, limited BOINC to 3 of the 4 available cores, and will see what happens. I'll post a follow-up once it completes a couple of WUs.
----------------------------------------
Edit: The changes seem to have fixed it. At least the first CEP2 unit after the changes was valid, and so was a unit of DSFL. [Edit 2 times, last edit by Former Member at Jun 8, 2013 2:50:45 PM] |
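For anyone who wants to apply the same settings on a headless server, here is a rough sketch in Python that builds a local preferences override. It is only an illustration, not what the poster actually did. The function name `build_prefs_override` is made up for this example. The tag names are, as far as I know, the ones the BOINC client reads from `global_prefs_override.xml`. Where that file lives depends on the install, so the output path is left to you.

```python
def build_prefs_override(cores_to_use: int, total_cores: int) -> str:
    """Return the text of a BOINC global_prefs_override.xml file.

    Hypothetical helper for illustration; it only builds a string and
    does not touch the BOINC data directory.
    """
    if not 0 < cores_to_use <= total_cores:
        raise ValueError("cores_to_use must be between 1 and total_cores")

    settings = {
        # Share of logical CPUs BOINC may use (3 of 4 gives 75.0).
        "max_ncpus_pct": f"{100.0 * cores_to_use / total_cores:.1f}",
        # 100 means no CPU-time throttling.
        "cpu_usage_limit": "100.0",
        # 0 turns off suspending when other programs use the CPU.
        "suspend_cpu_usage": "0.0",
        # Keep suspended tasks in memory, as the FAQ suggested.
        "leave_apps_in_memory": "1",
    }

    # One tag per line keeps the file easy to read and edit by hand.
    lines = ["<global_preferences>"]
    lines += [f"   <{tag}>{value}</{tag}>" for tag, value in settings.items()]
    lines.append("</global_preferences>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Atom D525: 2 cores with HT, so 4 logical CPUs; use 3 of them.
    print(build_prefs_override(cores_to_use=3, total_cores=4))
```

Save the output as `global_prefs_override.xml` in the BOINC data directory, then either restart the client or, I believe, run `boinccmd --read_global_prefs_override` to have it picked up. Setting the same values through the website preferences should work just as well.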
||
|
|
|