World Community Grid Forums
Category: Completed Research Forum: Microbiome Immunity Project Thread: MIP units error on Linux |
Thread Status: Active Total posts in this thread: 32
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
Bound to fail if you have no graphics front end, i.e. it needs launching from the BOINC GUI so it knows which job (slot)/PID it is to look at. As stated, it's not supposed to tear down the main task if the graphics go down.
|
||
|
katoda
Senior Cruncher Poland Joined: Apr 28, 2007 Post Count: 170 Status: Offline Project Badges: |
Meh, I've added the missing library and arrived at the same type of error that was reported when running wcgrid_mdds_gfx_prod_linux_64.x86.7.08 (
I think that without additional help from the techs/scientists and more meaningful logs we will not be able to find out what causes the MIP errors on some Linux machines. Given that the error rate is low (so the issue is not the highest priority), I doubt we can expect any investigation in the near future, since the WCG staff are surely busy with other, more important things. Pity, my Linux box is quite powerful and could contribute some nice crunching power to the project. [Edit 2 times, last edit by katoda at Aug 25, 2017 10:07:40 AM] |
||
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
On my Android, if a job goes, it's SIGSEGV in 99.999% of the cases, and has been since the get-go. 4 different VINA based sciences, and 1 going more often than the others, often in the middle of the night, when the tablet is truly just crunching on juice. (My Android carries a Notice asking me to help, very improbable for this one, at 650MB a pop in max memory needs, per job, not to speak of 50MB download max per job :(
Now there's SIGSEGV and your SIGSERV, which Google refuses to SERVe up, doubtlessly a typo. |
||
|
katoda
Senior Cruncher Poland Joined: Apr 28, 2007 Post Count: 170 Status: Offline Project Badges: |
doubtlessly a typo. Yep, a typo as big as the Eiffel Tower, thanks for pointing it out, already corrected :) |
||
|
Jean-David Beyer
Senior Cruncher USA Joined: Oct 2, 2007 Post Count: 335 Status: Offline Project Badges: |
I received two work units and they completed correctly on my Dell T7600 machine (one 4-core 64-bit processor installed and 8 GBytes RAM) running Red Hat Enterprise Linux Server release 6.9 (Santiago), up-to-date as of yesterday. |
||
|
RTorpey
Advanced Cruncher Joined: Aug 24, 2005 Post Count: 67 Status: Offline Project Badges: |
So far, I've processed about 250 wu on Ubuntu and 50 on Centos 6 without any errors.
|
||
|
katoda
Senior Cruncher Poland Joined: Apr 28, 2007 Post Count: 170 Status: Offline Project Badges: |
I'm not surprised that there are several people happily crunching on Linux. It does not seem like a general problem, rather a very rare and specific issue linked to my (and a few other people's) system setup. It could be the kernel version, incompatible or missing libraries, overly tight system security settings, anything. Without any hint about what to do and where to look, it will be pretty difficult to narrow down and eliminate the issue.
For the moment I plan to check from time to time with one or two MIP workunits whether the issue persists, and to crunch other WCG projects with my Linux box. [Edit 1 times, last edit by katoda at Sep 3, 2017 1:09:07 AM] |
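Of the possible causes listed above, missing libraries are the easiest to check mechanically. Below is a minimal sketch, assuming the output of `ldd` run against the science application binary has already been captured as text. The function name `missing_libraries` and the library names in the sample are made up for illustration; they are not taken from any actual MIP failure.

```python
import re
from typing import List


def missing_libraries(ldd_output: str) -> List[str]:
    """Return the shared libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # ldd prints an unresolved dependency as "libname.so.N => not found"
        match = re.match(r"\s*(\S+)\s+=>\s+not found", line)
        if match:
            missing.append(match.group(1))
    return missing


if __name__ == "__main__":
    # Hypothetical ldd output, for illustration only.
    sample = (
        "\tlinux-vdso.so.1 (0x00007ffd3a5f2000)\n"
        "\tlibexample.so.1 => not found\n"
        "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f2b1c000000)\n"
    )
    print(missing_libraries(sample))
```

On a real machine the text could be obtained with something like `subprocess.run(["ldd", path_to_binary], capture_output=True, text=True).stdout`. An empty result only rules out unresolved libraries; it says nothing about incompatible versions, the kernel, or security settings.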
||
|
ChristianVirtual
Advanced Cruncher Japan Joined: Jan 11, 2014 Post Count: 55 Status: Offline Project Badges: |
Got four WU on my CentOS-73-64-minimal/Intel i7-6700 and no issues;
Many more on Ubuntu 17.4/Ryzen
Active with WCG, GPUGrid, F@H
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Received this error:
MIP1_ 00003361_ 1915_ 0--
<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
[2017- 9-12 5: 7:59:] :: BOINC:: Initializing ... ok.
[2017- 9-12 5: 7:59:] :: BOINC :: boinc_init()
INFO: result number = 0
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
command: ../../projects/www.worldcommunitygrid.org/wcgrid_mip1_rosetta_7.11_x86_64-pc-linux-gnu -in::file::zip MIP1_databasev2.zip @./MIP1_00003361.flags -out::file::silent result_silent.out -run:jran 838417428 -nstruct 10 -out::level 100 -run::no_scorefile true
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Unpacking zip data: ../../projects/www.worldcommunitygrid.org/mip1.MIP1_databasev2.zip
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
set_shared_memory_fully_initialized ...
abrelax ... abrelax.run
Setting up folding (abrelax) ...
Beginning folding (abrelax) ...
BOINC:: Worker startup.
Sequence Length = 126
Starting work on structure: _0001
std::cerr: Exception was thrown: Cannot normalize xyzVector of length() zero
</stderr_txt>
]]>

And this error:
MIP1_ 00003528_ 0610_ 0--
<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
[2017- 9-13 10:33:26:] :: BOINC:: Initializing ... ok.
[2017- 9-13 10:33:26:] :: BOINC :: boinc_init()
INFO: result number = 0
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
command: ../../projects/www.worldcommunitygrid.org/wcgrid_mip1_rosetta_7.11_x86_64-pc-linux-gnu -in::file::zip MIP1_databasev2.zip @./MIP1_00003528.flags -out::file::silent result_silent.out -run:jran 514591630 -nstruct 9 -out::level 100 -run::no_scorefile true
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Unpacking zip data: ../../projects/www.worldcommunitygrid.org/mip1.MIP1_databasev2.zip
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
set_shared_memory_fully_initialized ...
abrelax ... abrelax.run
Setting up folding (abrelax) ...
ERROR: bad format for cen_rot_pair_ang_params.txt
ERROR:: Exit from: src/core/scoring/CenRotEnvPairPotential.cc line: 232
[0x2e6a9e1] [0x445f56d] [0x3151061] [0x31251f5] [0x33ee3f7] [0x3128519] [0x30ef686] [0x30f0a97] [0x30f2600] [0x311123a] [0x3112c1b] [0xfccc59] [0xfdc519] [0xfc65f8] [0xfa9616] [0xfb17c2] [0x411ed4] [0x4795614] [0x4795746] [0x9658e6]
BOINC:: Error reading and gzipping output datafile: default.out
10:33:31 (3669): called boinc_finish(1)
</stderr_txt>
]]>

Both on Ubuntu 17.04 [Edit 1 times, last edit by Doneske at Sep 13, 2017 8:45:46 PM] |
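Result logs like the two above are long, and the part that matters is usually the exit code plus one or two error lines. The following is a rough sketch for pulling those out, assuming the stderr text has been saved with its original line breaks. The function name `summarize_stderr` and the two patterns are illustrative guesses based only on the logs in this post; they are not a documented BOINC or Rosetta format.

```python
import re
from typing import List, Optional, Tuple

# Patterns inferred from the two logs posted above (an assumption, not a spec).
EXIT_RE = re.compile(r"process exited with code (\d+)")
ERROR_RE = re.compile(r"^(ERROR:.*|.*Exception was thrown:.*)$", re.MULTILINE)


def summarize_stderr(text: str) -> Tuple[Optional[int], List[str]]:
    """Return (exit code or None, list of lines that look like errors)."""
    exit_match = EXIT_RE.search(text)
    exit_code = int(exit_match.group(1)) if exit_match else None
    errors = [m.group(1).strip() for m in ERROR_RE.finditer(text)]
    return exit_code, errors


if __name__ == "__main__":
    # A few lines copied from the second log in this post.
    sample = "\n".join([
        "process exited with code 1 (0x1, -255)",
        "Setting up folding (abrelax) ...",
        "ERROR: bad format for cen_rot_pair_ang_params.txt",
        "ERROR:: Exit from: src/core/scoring/CenRotEnvPairPotential.cc line: 232",
        "BOINC:: Error reading and gzipping output datafile: default.out",
    ])
    code, errors = summarize_stderr(sample)
    print(code)
    for line in errors:
        print(line)
```

Run over a batch of failed results, a summary like this would at least show whether the same two messages (the xyzVector exception and the bad format for cen_rot_pair_ang_params.txt) keep recurring or whether each failure is different.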
||
|
PowerFactor
Ace Cruncher Joined: Dec 9, 2016 Post Count: 4016 Status: Offline Project Badges: |
I have 3 computers running WCG on Ubuntu 17.04 minimal install and I haven't had any MIP work unit problems yet. My computers have crunched 78 MIP WU's collectively.
[Edit 1 times, last edit by thepeacemaker7 at Sep 14, 2017 12:10:05 AM] |
||
|
|