World Community Grid Forums
Thread Status: Active Total posts in this thread: 20
rbm73
Cruncher USA Joined: Apr 1, 2011 Post Count: 28 Status: Offline Project Badges:
|
I am running Linux Mint 17.3 on an old 32-bit laptop. It does not have a lot of processing power, but it is dedicated to WCG and does produce results.
Except that all MIP tasks end in an error: received signal 11. The result files look like this:

<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
process got signal 11
</message>
<stderr_txt>
[2018- 1- 2 1:24:11:] :: BOINC:: Initializing ... ok.
[2018- 1- 2 1:24:11:] :: BOINC :: boinc_init()
INFO: result number = 0
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
command: ../../projects/www.worldcommunitygrid.org/wcgrid_mip1_rosetta_7.11_i686-pc-linux-gnu -in::file::zip MIP1_databasev2.zip @./MIP1_00041671.flags -out::file::silent result_silent.out -run:jran 1858648814 -nstruct 18 -out::level 100 -run::no_scorefile true
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Unpacking zip data: ../../projects/www.worldcommunitygrid.org/mip1.MIP1_databasev2.zip
Setting database description ...
Setting up checkpointing ...
abrelax ...
abrelax.run
Setting up folding (abrelax) ...
FoldContraints Constructer
set_default_options
In set_default_options
parent::set_default_options
ClassicAbinitio::set_default_options
just_smooth_cycles
bQuickTest
set_cycles
increas_cycles
ClassicAbinitio::set_cycles 10
</stderr_txt>
]]>

Is this a 32 vs 64 bit problem, or something else? Thanks.
||
|
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7846 Status: Recently Active Project Badges:
|
Signal 11 means your application is probably accessing memory which is not assigned to it. Run a memory check. Might be caused by bad memory. There are lots of explanations if you search the internet for Signal 11 errors.
----------------------------------------
Cheers
Sgt. Joe
*Minnesota Crunchers* |
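For readers who want to see what "signal 11" corresponds to on their own machine, here is a minimal Python sketch. It is not part of BOINC or the MIP application; the helper names `describe_signal` and `run_and_report` are made up for illustration, and the sketch assumes a POSIX system such as Linux, where signal 11 is normally SIGSEGV (a segmentation fault, meaning an invalid memory access).

```python
import signal
import subprocess
import sys
from typing import Sequence


def describe_signal(number: int) -> str:
    """Return the symbolic name for a signal number on this platform."""
    try:
        return signal.Signals(number).name
    except ValueError:
        # The number is not a signal known to this platform.
        return f"unknown signal {number}"


def run_and_report(argv: Sequence[str]) -> str:
    """Run a command and say whether it exited normally or was killed by a signal."""
    completed = subprocess.run(argv)
    code = completed.returncode
    if code < 0:
        # On POSIX, subprocess reports death by signal N as return code -N.
        return f"killed by {describe_signal(-code)}"
    return f"exited with status {code}"


if __name__ == "__main__":
    print(describe_signal(11))
    # Hypothetical crashing child: it simply sends SIGSEGV to itself.
    crash = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
    print(run_and_report([sys.executable, "-c", crash]))
```

This only shows how the operating system reports the crash. It says nothing about why the MIP task crashes, which could still be hardware, the 32-bit build, or something else.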
||
|
|
rbm73
Cruncher USA Joined: Apr 1, 2011 Post Count: 28 Status: Offline Project Badges:
|
Is your answer supposed to be informative, or just a "put down"?
I have over 3 years of run time and over 5700 results returned on this machine, and NO OTHER PROJECT has generated a Signal 11 error! I posted on the Microbiome Immunity Project forum because I am looking for a problem with that project's distributed tasks. If I wanted or needed a tutorial about Linux or Signal 11 errors, I would have found a Linux forum. So, what help can this forum provide to solve the problem with MIP tasks on 32-bit Linux Mint 17.3? Need I repeat that no other WCG project run on this hardware has encountered a single instance of this problem? BTW, memtest "likes" my Toshiba laptop memory...
||
|
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7846 Status: Recently Active Project Badges:
|
rbm73 wrote: Is your answer supposed to be informative, or just a "put down"?

My answer was in no way intended as a put down. It was intended to be helpful. If you did not find it helpful, I'm sorry. I have over 4 years and 15,000 results on MIP using 64-bit Linux Mint in several different iterations. I have not experienced any "signal 11" problems. (Windows gives errors if BOINC is not exited before a shutdown, but that is a different story.)

A perusal of potential answers to your question generally points to two main sources of the problem. One is the aforementioned memory problem, and the other is a programming problem. If this were a programming problem, I would expect numerous individuals to have experienced it. Since this is not the case, I would suspect some unknown hardware problem with your system. If all the memory tested well, that may not be the source of the problem. It may be a combination of factors working in conjunction that causes your system to choke only on this one project. (I have one system which cannot run Help Stop TB because it gets a fault which causes the system to reboot. So, I have eliminated that project from that machine. It runs all other projects flawlessly. Why? I don't know, but I wish I did, as all of the hardware seems to check out OK.)

Perhaps someone with more technical knowledge could help you diagnose your problem. Good luck.
Cheers
Sgt. Joe
*Minnesota Crunchers* |
||
|
|
andgra
Senior Cruncher Sweden Joined: Mar 15, 2014 Post Count: 195 Status: Offline Project Badges:
|
I'm tempted to agree with Sgt.Joe here.
As MIP requires quite a lot of RAM when executing, it could very well be a memory problem. I just switched to MIP after a long SCC run (my androids will have to take the last days to a 50y badge) and I have noticed heavy memory use, bad Linux performance, lower CPU temps and higher memory temps, especially on an old dual Xeon HP Proliant rack server with plenty of sensors here and there. My RAM temps on that machine are more than 10 degrees Celsius higher than when crunching SCC. Could you try running some intensive memory testing software to rule this out?
----------------------------------------
/andgra
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I'm running MIP on an old 32 bit laptop as well but with no problems. However, I'm running it on Windows 7 Pro.
|
||
|
|
rbm73
Cruncher USA Joined: Apr 1, 2011 Post Count: 28 Status: Offline Project Badges:
|
Following your suggestion, I have finished running 15+ hours straight of memory testing without a single error. It seems doubtful there is a memory issue.
I am not sure what else to try, perhaps several days of memory testing. I guess I can look at that after the current downloads finish. Unfortunately, there is very little available to monitor temperature on this laptop. Only an HDD sensor was found, so I can't tell much about the memory temperature.
----------------------------------------
[Edit 1 times, last edit by rbm73 at Jan 11, 2018 7:19:43 PM]
||
|
|
andgra
Senior Cruncher Sweden Joined: Mar 15, 2014 Post Count: 195 Status: Offline Project Badges:
|
Ok, maybe we can rule that out then.
Do you run more than one thread? One idea would be to reduce the number. Could you also post the initial lines of the log written when BOINC starts, so we get a better feel for your HW?
----------------------------------------
/andgra
||
|
|
KLiK
Master Cruncher Croatia Joined: Nov 13, 2006 Post Count: 3108 Status: Offline Project Badges:
|
Well, I'm not a computer science guy, but 2min with Google got me this:
- https://setiathome.berkeley.edu/forum_thread.php?id=43317
& this
- https://boinc.bakerlab.org/forum_thread.php?id=3702
So reset the project in BOINC & start from scratch. ;)
----------------------------------------
||
|
|
rbm73
Cruncher USA Joined: Apr 1, 2011 Post Count: 28 Status: Offline Project Badges:
|
I have just upgraded the Linux kernel. It was a normal security update as far as I know, hopefully related to the Intel chip fixes. I did a shutdown/restart and a reset on the WCG project. I haven't been offered any MIP tasks, so I guess I will have to specifically request them. Usually I get them even if they are not part of my project list. More later...