Thread Status: Active
Total posts in this thread: 15
This topic has been viewed 4406 times and has 14 replies
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2346
Anyone else seeing signal 11 with MIP WU? (Autumn 2020)

I'm also seeing this on an Intel i5 chip (since people are mentioning AMD), whereas ARP1, HST1 and OPN1 run smoothly together on that laptop, without MIP1 disturbing the peace. The laptop runs Fedora 33 (Linux), which is the most recent version.

I've been seeing this since 2020-11-27 and stopped acquiring MIP1 on that device the same day.
I thought I'd try my luck again today, but to no avail. Maybe there's insufficient memory … (8 threads, 6 MB RAM, 6 MB L3-cache).

App    CpuTime Elapsed Claimed Granted ModTime    Exit Outc SentTime            ReceivedTime        Name
mip1 1.71 1.84 33.9 33.9 1600711070 0 1 2020-09-21T13:49:09 2020-09-21T17:57:44 MIP1_00320723_1625_0
mip1 2.79 3.05 57.6 57.6 1600722496 0 1 2020-09-21T16:01:26 2020-09-21T21:08:12 MIP1_00320308_7149_1
mip1 2.71 2.80 52.7 52.7 1600733281 0 1 2020-09-21T17:57:44 2020-09-22T00:07:57 MIP1_00320663_4852_0
mip1 3.51 3.62 69.7 69.7 1600746264 0 1 2020-09-21T21:08:12 2020-09-22T03:44:19 MIP1_00319276_0073_1
mip1 2.44 2.52 48.0 48.0 1600755392 0 1 2020-09-22T00:07:57 2020-09-22T06:16:28 MIP1_00320496_5182_0
mip1 1.01 1.05 20.1 20.1 1600759257 0 1 2020-09-22T03:44:19 2020-09-22T07:20:51 MIP1_00320656_0290_0
mip1 2.21 2.30 45.7 45.7 1600767621 0 1 2020-09-22T06:16:28 2020-09-22T09:40:14 MIP1_00319269_5910_1
mip1 3.28 3.49 70.6 70.6 1600780706 0 1 2020-09-22T07:20:51 2020-09-22T13:18:21 MIP1_00320663_3566_0
mip1 3.45 3.66 73.8 73.8 1600793958 0 1 2020-09-22T09:40:14 2020-09-22T16:59:11 MIP1_00320698_1593_0
mip1 3.47 3.75 74.5 74.5 1600807475 0 1 2020-09-22T13:18:22 2020-09-22T20:44:28 MIP1_00320484_0012_0
mip1 2.09 2.30 43.0 43.0 1601036930 0 1 2020-09-24T08:19:51 2020-09-25T12:28:47 MIP1_00320757_0634_0
mip1 0.78 0.79 23.3 0.0 1606477133 139 3 2020-11-24T01:08:08 2020-11-27T11:38:53 MIP1_00325397_0550_2
mip1 0.73 0.74 21.9 0.0 1606481934 139 3 2020-11-27T11:38:53 2020-11-27T12:58:54 MIP1_00325683_3029_0
mip1 0.73 0.74 22.0 0.0 1606485186 139 3 2020-11-27T12:58:54 2020-11-27T13:53:06 MIP1_00325875_0224_0
mip1 0.33 0.33 9.8 0.0 1606486856 139 3 2020-11-27T13:53:06 2020-11-27T14:20:56 MIP1_00325656_0450_0
mip1 0.46 0.47 13.9 0.0 1606492743 139 3 2020-11-27T15:18:07 2020-11-27T15:59:03 MIP1_00325700_2745_0
mip1 0.27 0.28 8.2 0.0 1606489392 139 3 2020-11-27T14:20:56 2020-11-27T15:03:12 MIP1_00325557_1154_0
mip1 0.15 0.15 4.4 0.0 1606490286 139 3 2020-11-27T15:03:12 2020-11-27T15:18:06 MIP1_00325980_0700_0
mip1 0.74 0.75 22.1 0.0 1606495526 139 3 2020-11-27T15:59:03 2020-11-27T16:45:26 MIP1_00325922_3436_0
mip1 0.20 0.20 5.9 0.0 1606496337 139 3 2020-11-27T16:45:26 2020-11-27T16:58:57 MIP1_00325849_0586_0
mip1 0.50 0.51 15.2 0.0 1606498718 139 3 2020-11-27T16:58:57 2020-11-27T17:38:38 MIP1_00325742_0523_0
mip1 0.74 0.76 22.6 0.0 1606658732 139 3 2020-11-27T17:38:38 2020-11-29T14:05:32 MIP1_00325628_2929_0
mip1 2.29 2.32 31.3 0.0 1610524034 139 3 2021-01-11T14:53:36 2021-01-13T07:47:14 MIP1_00328169_3759_0
mip1 1.14 1.15 15.5 0.0 1610528274 139 3 2021-01-13T07:47:14 2021-01-13T08:57:54 MIP1_00328234_8793_0
mip1 1.32 1.34 17.9 0.0 1610536408 139 3 2021-01-13T08:57:54 2021-01-13T11:13:28 MIP1_00328230_11045_0
App CpuTime Elapsed Claimed Granted ModTime Exit Outc SentTime ReceivedTime Name
n=11 CpuHours=28.68 Hrs/n=2.607
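
The Exit value 139 in the failed rows matches the "signal 11" in the thread title, provided the column follows the common shell convention of reporting 128 plus the signal number when a process is killed by a signal. That convention is an assumption here, since the thread does not say how the report derives the column. `describe_exit` is a hypothetical helper for illustration, not part of BOINC.

```python
import signal


def describe_exit(code: int) -> str:
    """Interpret a shell-style exit status (hypothetical helper).

    Assumes the convention that a status above 128 means the process
    was terminated by signal number (status - 128).
    """
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited with status {code}"


if __name__ == "__main__":
    for code in (0, 139):  # the two Exit values that occur in the table above
        print(code, "->", describe_exit(code))
```

Signal 11 is SIGSEGV, a segmentation fault, so under that reading every row with Exit 139 crashed on an invalid memory access rather than being aborted by the client.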

Project Name: Microbiome Immunity Project
Created: 01/09/2021 17:00:05
Name: MIP1_00328169_3759
Minimum Quorum: 1
Replication: 1
MIP1_00328169_3759_1-- Darwin 716 Valid 1/13/21 07:48:50 1/13/21 14:09:34 1.29 59.6 / 59.6
<core_client_version>7.16.14</core_client_version>
<![CDATA[
<stderr_txt>
[2021- 1-13 21:24: 0:] :: BOINC:: Initializing ... ok.
[2021- 1-13 21:24: 0:] :: BOINC :: boinc_init()
INFO: result number = 1
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
command: wcgrid_mip1_rosetta_7.16_x86_64-apple-darwin -in::file::zip MIP1_databasev2.zip @./MIP1_00328169.flags -out::file::silent result_silent.out -run:jran 1620261810 -nstruct 1 -out::level 100 -run::no_scorefile true
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Unpacking zip data: ../../projects/www.worldcommunitygrid.org/mip1.MIP1_databasev2.zip
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
set_shared_memory_fully_initialized ...
abrelax ...
abrelax.run
Setting up folding (abrelax) ...
Beginning folding (abrelax) ...
BOINC:: Worker startup.
Sequence Length = 467
Starting work on structure: _0001
Finished _0001 in 4614.59 seconds.
======================================================
DONE :: 1 structures in 4625.82 cpu seconds
======================================================
BOINC :: BOINC support services shutting down cleanly ...
22:44:38 (80464): called boinc_finish(0)

</stderr_txt>
]]>
MIP1_00328169_3759_0-- Linux Fedora 716 Error 1/11/21 14:53:36 1/13/21 07:47:14 2.29 31.3 / 0.0
<core_client_version>7.16.6</core_client_version>
<![CDATA[
<message>
process got signal 11</message>
<stderr_txt>
[2021- 1-11 18:21:19:] :: BOINC:: Initializing ... ok.
[2021- 1-11 18:21:19:] :: BOINC :: boinc_init()
INFO: result number = 0
BOINC:: Setting up shared resources ... ok.
BOINC:: Setting up semaphores ... ok.
BOINC:: Updating status ... ok.
BOINC:: Registering timer callback... ok.
BOINC:: Worker initialized successfully.
command: ../../projects/www.worldcommunitygrid.org/wcgrid_mip1_rosetta_7.16_x86_64-pc-linux-gnu -in::file::zip MIP1_databasev2.zip @./MIP1_00328169.flags -out::file::silent result_silent.out -run:jran 1620261810 -nstruct 1 -out::level 100 -run::no_scorefile true
Registering options..
Registered extra options.
Initializing broker options ...
Registered extra options.
Initializing core...
Initializing options.... ok
Options::initialize()
Options::adding_options()
Options::initialize() Check specs.
Options::initialize() End reached
Loaded options.... ok
Processed options.... ok
Initializing random generators... ok
Initialization complete.
Setting WU description ...
Unpacking zip data: ../../projects/www.worldcommunitygrid.org/mip1.MIP1_databasev2.zip
Setting database description ...
Setting up checkpointing ...
Setting up graphics native ...
set_shared_memory_fully_initialized ...
abrelax ...
abrelax.run
Setting up folding (abrelax) ...
Beginning folding (abrelax) ...
BOINC:: Worker startup.
Sequence Length = 467
Starting work on structure: _0001

</stderr_txt>
]]>
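
The visible difference between the two logs is that the Darwin run reports "Finished _0001" while the Fedora run stops right after "Starting work on structure: _0001". A small sketch of how that comparison could be automated follows; the function name and the regular expressions are assumptions based only on the two log excerpts in this post, not on any documented Rosetta log format.

```python
import re
from typing import List

# Patterns inferred from the stderr excerpts above (an assumption).
START = re.compile(r"Starting work on structure: (\S+)")
DONE = re.compile(r"Finished (\S+) in ([\d.]+) seconds")


def unfinished_structures(stderr_text: str) -> List[str]:
    """Return structures that were started but never reported as finished."""
    started = START.findall(stderr_text)
    finished = {name for name, _seconds in DONE.findall(stderr_text)}
    return [name for name in started if name not in finished]


# Excerpts copied from the two logs above.
darwin_log = """Starting work on structure: _0001
Finished _0001 in 4614.59 seconds.
"""
linux_log = """Sequence Length = 467
Starting work on structure: _0001
"""

print(unfinished_structures(darwin_log))  # []
print(unfinished_structures(linux_log))   # ['_0001']
```

This only shows where the Linux task stopped logging; it says nothing about why the process received signal 11.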

PS: I have no problem dropping MIP1 from that device, because it has been merrily crunching away on ARP1, HST1 and OPN1 since … well, for as long as these three have existed.
[Jan 13, 2021 4:52:33 PM]
sterl
Cruncher
Joined: Jul 23, 2013
Post Count: 5
Re: Anyone else seeing signal 11 with MIP WU? (Autumn 2020)

I have noticed similar occurrences. I have an older AMD CPU running Linux Mint 20.1, processing only 2 WUs at a time. Some units work fine, others do not.

There seem to be two issues with those that do not:
1/ Some units do not pick up where they left off after a reboot; they just start processing again at 0hrs,0mins.

2/ The other issue is when a unit has been processing for several hours and the remaining time just keeps increasing. The only way to overcome this is a reboot, after which that work unit restarts at 0hrs,0mins.

I am not sure whether this is the same on other projects; maybe I will give it a try on others for a few days and see how it goes.
[Jan 28, 2021 8:20:17 AM]
Bryn Mawr
Senior Cruncher
Joined: Dec 26, 2018
Post Count: 385
Re: Anyone else seeing signal 11 with MIP WU? (Autumn 2020)

> I have noticed similar occurrences. I have an older AMD CPU running Linux Mint 20.1, processing only 2 WUs at a time. Some units work fine, others do not.
>
> There seem to be two issues with those that do not:
> 1/ Some units do not pick up where they left off after a reboot; they just start processing again at 0hrs,0mins.
>
> 2/ The other issue is when a unit has been processing for several hours and the remaining time just keeps increasing. The only way to overcome this is a reboot, after which that work unit restarts at 0hrs,0mins.
>
> I am not sure whether this is the same on other projects; maybe I will give it a try on others for a few days and see how it goes.


I have seen both of these symptoms in Rosetta where WUs only checkpoint at the end of a “decoy”.

Occasionally you get a WU where the first decoy runs for a very long time. Any restart whilst it is running causes it to begin at the beguine, and when it gets to within 10 minutes of the estimated time, the remaining time just sticks at 10 minutes, because it does not know how long the decoy will take and therefore what percentage is actually complete.

On the other hand, if the remaining time is actively increasing rather than staying stable, could it be in a loop?
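
The behaviour described in this post can be put into a toy model. This is only a sketch of the description above, not code from BOINC or Rosetta: the function names, the 10-minute floor as a parameter, and the sample numbers are all assumptions made for illustration.

```python
def resume_point(elapsed_s: float, decoy_end_times_s: list) -> float:
    """Elapsed time a task falls back to after a restart, assuming
    checkpoints are written only when a decoy finishes."""
    completed = [t for t in decoy_end_times_s if t <= elapsed_s]
    return max(completed, default=0.0)


def displayed_remaining(elapsed_s: float, estimate_s: float,
                        floor_s: float = 600.0) -> float:
    """Remaining time shown to the user, assuming the display never drops
    below a fixed floor while the current decoy is still running."""
    return max(estimate_s - elapsed_s, floor_s)


# Hypothetical task: estimated at 2 hours, but the first decoy needs 5 hours.
estimate = 2 * 3600
first_decoy_end = 5 * 3600

# Restart after 3 hours: no decoy has finished yet, so the task restarts from 0.
print(resume_point(3 * 3600, [first_decoy_end]))   # 0.0
# After 3 hours the estimate is long exceeded, so the display sticks at 600 s.
print(displayed_remaining(3 * 3600, estimate))     # 600.0
```

In this model the remaining time can stall but never grows, so it would not explain a remaining time that actively increases.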
[Jan 28, 2021 9:49:35 AM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7847
Re: Anyone else seeing signal 11 with MIP WU? (Autumn 2020)

> 8 threads, 6 MB RAM, 6 MB L3-cache

I have a feeling you meant 6 GB RAM.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Jan 28, 2021 3:18:00 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2346
Re: Anyone else seeing signal 11 with MIP WU? (Autumn 2020)

>> 8 threads, 6 MB RAM, 6 MB L3-cache
>
> I have a feeling you meant 6 GB RAM.
> Cheers

Of course it is 6 GB RAM!
Thanks very much for spotting this error in my documentation that I copied to the forum here, Sgt.Joe!
[Jan 28, 2021 6:59:12 PM]