World Community Grid Forums
Thread Status: Active Total posts in this thread: 16
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I'm currently running Gentoo on a machine that otherwise seems to be rock stable. However, WCG running via BOINC dies after a while, and I end up with a lot of errors like:
2006-01-08 09:32:14 [World Community Grid] Resuming computation for result ed997_19_5 using rosetta version 421
2006-01-08 09:42:02 [World Community Grid] Unrecoverable error for result ed997_19_5 (process exited with code 131 (0x83))
2006-01-08 09:42:02 [World Community Grid] Unrecoverable error for result ed997_19_5 (process exited with code 131 (0x83))
2006-01-08 09:42:02 [---] request_reschedule_cpus: process exited

as well as

2006-01-08 08:11:47 [World Community Grid] Started download of el002_15_el002.psipred
SIGSEGV: segmentation violation
Stack trace (3 frames):
./boinc[0x80845b2]
/lib/libpthread.so.0[0x401635d9]
/lib/libc.so.6[0x4004ae38]
Exiting...

Can anyone suggest what the heck is going on here and how I can fix it? |
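One possible way to read the exit code in the log above, offered only as a sketch: shells conventionally report a process killed by signal N as exit status 128 + N. Whether the BOINC client follows that convention when it prints "process exited with code 131 (0x83)" is an assumption here, not something the log confirms.

```shell
# Assumption: the reported code follows the shell convention 128 + signal number.
# BOINC may encode its exit codes differently.
code=131
if [ "$code" -gt 128 ]; then
    sig=$((code - 128))
    # kill -l N prints the name of signal number N
    echo "exit code $code -> signal $sig ($(kill -l "$sig"))"
else
    echo "exit code $code is an ordinary exit status, not a signal"
fi
```

Under that assumption, 131 would correspond to signal 3, which is SIGQUIT on Linux. That is a different signal from the SIGSEGV shown in the client's own stack trace.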
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Knreed should be along to help with this. He's the Linux/WCG guru.
----------------------------------------[Edit 1 times, last edit by Former Member at Jan 9, 2006 12:43:34 PM] |
||
|
Alther
Former World Community Grid Tech United States of America Joined: Sep 30, 2004 Post Count: 414 Status: Offline |
What is your application stack limit? You can find out by typing: ulimit -s
----------------------------------------
Rick Alther
Former World Community Grid Developer |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The application stack limit is currently set to 8192.
|
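For reference, a minimal sketch of how the stack limit can be inspected in more detail. It assumes a POSIX-style shell such as bash, where ulimit -s reports the value in kilobytes; under that assumption, 8192 would mean 8 MiB. The variable names are illustrative only.

```shell
# Soft limit: what a newly started process actually gets.
soft=$(ulimit -S -s)
# Hard limit: the ceiling the soft limit can be raised to without root.
hard=$(ulimit -H -s)

echo "soft stack limit: $soft"
echo "hard stack limit: $hard"

# Convert the soft limit to MiB when it is numeric (assumes the unit is KB).
case "$soft" in
    unlimited) echo "soft stack limit is unlimited" ;;
    *)         echo "soft stack limit is about $(( $soft / 1024 )) MiB" ;;
esac
```

The limit only applies to processes started from the shell where it is set, so a BOINC client launched from an init script may see different values.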
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Here is what I found for SIGSEGV: http://www.wlug.org.nz/SIGSEGV
So I guess the place to start is with "Resuming computation . . ." How are your BOINC preferences set up? Do you suspend programs and keep them in memory? Are you running multiple projects? Also, could you describe your computer system: RAM, virtual memory partition? If you run ulimit -a to list all your current limits, do they look reasonable (no surprises)?
Reading other bulletin boards, people run into various bugs when resuming projects. So my main thrust here is to come up with a way to run BOINC projects on your computer that works, even though I do not know just what the bugs are.
mycrofth |
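A rough sketch of the ulimit -a check suggested above, assuming a POSIX-style shell. The helper show_limit is a hypothetical name, and the choice of which limits to single out is an assumption about what matters for a crashing compute process.

```shell
# Hypothetical helper: print one soft limit, given a label and a ulimit flag.
show_limit() {
    # $1 = human-readable label, $2 = ulimit flag (for example -s)
    printf '%-20s %s\n' "$1" "$(ulimit -S "$2")"
}

# Full listing of the current limits for this shell.
ulimit -a

# The limits most likely to matter here (an assumption, not a diagnosis).
show_limit "stack (-s)" -s
show_limit "virtual memory (-v)" -v
show_limit "data segment (-d)" -d
show_limit "core file (-c)" -c
```

A surprise would be something like a very small stack or virtual memory value; "unlimited" or large values are generally unremarkable.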
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The machine is a dual Xeon with 2GB RAM and a 2GB swap partition. Processes are not normally suspended (the machine acts as a server), and there's only one instance of BOINC running, hence only a single project at a time. The ulimit settings look reasonable as far as I can tell.
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I haven't attempted to run BOINC with any projects other than WCG - one step at a time please! ;)
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
???
Here is the list of error codes: http://boinc-doc.net/boinc-wiki/index.php?title=Error_Code But it leaves me perplexed. The only idea I can come up with is to wonder whether the CPU temperature is reasonable. Does anybody have any ideas?
mycrofth |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The machine is a Dell PowerEdge server with a remote monitoring card (DRAC). The card indicates that the temperatures, fans and voltages are all within their normal tolerances.
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I have since realized that I should have been looking at the 'Unrecoverable Error' messages at http://boinc-doc.net/boinc-wiki/index.php?title=Category:BOINC_Error_Message but that does not help since 0x81 is not listed.
|
||
|