Total posts in this thread: 13
bWildered1
Cruncher
Joined: Oct 3, 2020
Post Count: 6
Computation Error - Insufficient RAM?

Hi,
I have an Orange Pi One (4-core 64-bit ARM, 512 MB RAM) running Debian 9 and crunching for World Community Grid via BOINC.
This can run BOINC projects such as Universe@Home using all 4 cores, but can only run 3 Covid-19 tasks simultaneously, although it can run 3 Covid-19 tasks along with a single Universe@Home task to employ all 4 cores. When a 4th Covid-19 task starts it immediately produces a 'Computation Error'.

I suspect there is insufficient memory (RAM) to run 4 simultaneous Covid-19 tasks.

To be clear, the error only happens when trying to start a 4th Covid-19 task with 3 already running.

Does the Covid-19 project - or the WCG framework - check for available RAM before starting a new task?

Take care all, and keep safe.

M
[Nov 29, 2020 3:56:26 PM]
Seth Karlinsey
Cruncher
Joined: Apr 19, 2020
Post Count: 15
Re: Computation Error - Insufficient RAM?

Can you paste the output from the task? You can get it from Result Status > Error. I have a few low-RAM hosts; BOINC should wait for memory when you are running low.
[Nov 29, 2020 6:02:13 PM]
bWildered1
Cruncher
Joined: Oct 3, 2020
Post Count: 6
Re: Computation Error - Insufficient RAM?

Task 'A'
Result Log

Result Name: OPN1_ 0023822_ 01105_ 0--
<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
process got signal 11
</message>
<stderr_txt>

</stderr_txt>
]]>

Task 'B'
Result Log

Result Name: OPN1_ 0023822_ 01451_ 1--
<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
process got signal 11
</message>
<stderr_txt>

</stderr_txt>
]]>
[Nov 29, 2020 7:15:17 PM]
Composer
Cruncher
Joined: May 28, 2014
Post Count: 29
Re: Computation Error - Insufficient RAM?

If I remember correctly (and it is quite possible that I do not), 'process got signal 11' means that the process has received an instruction to stop computation and exit, but it does not provide much more detail than that. It's sort of a catch-all that covers everything not covered by other errors. Again, I could be totally wrong about this, so if I am, someone please feel free to correct me.
[Dec 11, 2020 3:22:06 PM]
bWildered1
Cruncher
Joined: Oct 3, 2020
Post Count: 6
Re: Computation Error - Insufficient RAM?

Hi Composer,

this link suggests it is likely to be a memory related error
https://serverfault.com/questions/67504/what-can-cause-a-signal-11
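For anyone who wants to check the signal number for themselves, here is a minimal Python sketch, assuming a Linux (POSIX) host with Python 3. It only shows that signal 11 is SIGSEGV and how a process killed by it is reported to its parent; it says nothing about why the Covid-19 task received it.

```python
import signal
import subprocess
import sys

# On Linux, signal number 11 is SIGSEGV (segmentation violation).
signal_name = signal.Signals(11).name
print(signal_name)

# Start a child Python process that sends SIGSEGV to itself,
# standing in for a science application that crashes.
child = subprocess.run(
    [sys.executable, "-c",
     "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
)

# On POSIX, a return code of -N means "terminated by signal N".
print(child.returncode)
```

A parent process such as the BOINC client sees only the signal number, which would explain why the result log says nothing more than 'process got signal 11'.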

Incidentally, there was about 98 MB of RAM free (of 512 MB total) just before the 4th Covid-19 task started.

That seems to me to be low, too low to start another Covid-19 task, because the other 3 Covid-19 tasks and the other Linux processes are consuming 414 MB between them.

With 4 Universe@Home tasks running, the available free memory is reported as 341 MB (of 512 MB).

This hints at a memory problem.
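The free-memory check can be scripted. The following is a rough Python sketch, assuming a Linux host where /proc/meminfo is readable. The function names and the per-task figure are hypothetical and only for illustration; the real footprint of a Covid-19 task would have to be measured on the machine itself.

```python
import os

def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict (most values are in kB)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields

def room_for_another_task(meminfo_kb, task_mb):
    """Return True if available memory covers one more task of task_mb MB.

    task_mb is a hypothetical per-task estimate, not a published figure.
    """
    # Prefer MemAvailable (the kernel's own estimate); fall back to MemFree.
    available_kb = meminfo_kb.get("MemAvailable", meminfo_kb.get("MemFree", 0))
    return available_kb >= task_mb * 1024

if os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    # 100 MB is an assumed placeholder for one task's working set.
    print("Room for another 100 MB task:", room_for_another_task(info, 100))
```

Note that 'free' and 'available' are not the same thing on Linux: memory used for caches can be reclaimed, so MemAvailable is usually the more useful number when deciding whether one more task will fit.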

Take care and keep safe

Mike
[Dec 11, 2020 4:00:38 PM]
Martin Schnellinger
Advanced Cruncher
Joined: Apr 29, 2007
Post Count: 128
Re: Computation Error - Insufficient RAM?

Hello,
I must admit that I am not a real computer and Linux specialist.
But I will try to help you, without warranty.

Let us assume that it is indeed a memory problem.

There are two variables in the BOINC settings.
One decides how much memory can be used when the machine is busy.
The other decides how much memory can be used when the machine is not busy.

Please set both variables to 100 per cent.
Additionally, please mark: Keep program in memory when paused.

Please note: I use the German language version, so I had to translate everything back into English. The descriptions in the BOINC settings might differ slightly from the words I have chosen.
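On a headless board the same three settings can also be written to a file instead of being clicked in the Manager. The Python sketch below builds the XML for BOINC's global_prefs_override.xml. It assumes the tag names ram_max_used_busy_pct, ram_max_used_idle_pct and leave_apps_in_memory used by the BOINC client; build_override is a hypothetical helper name, and the result should be compared against an existing override file before it is trusted.

```python
import xml.etree.ElementTree as ET

def build_override(busy_pct, idle_pct, leave_in_memory):
    """Build the text of a minimal global_prefs_override.xml (a sketch)."""
    root = ET.Element("global_preferences")
    # Memory the client may use while the computer is in use.
    ET.SubElement(root, "ram_max_used_busy_pct").text = f"{busy_pct:.6f}"
    # Memory the client may use while the computer is not in use.
    ET.SubElement(root, "ram_max_used_idle_pct").text = f"{idle_pct:.6f}"
    # Keep suspended tasks in memory (1) or remove them (0).
    ET.SubElement(root, "leave_apps_in_memory").text = "1" if leave_in_memory else "0"
    return ET.tostring(root, encoding="unicode")

# The values suggested in this post: both percentages at 100, tasks kept in memory.
xml_text = build_override(100, 100, True)
print(xml_text)
```

The file belongs in the BOINC data directory (on Debian that is usually /var/lib/boinc-client, but check your own installation), and the client has to re-read it, for example with boinccmd --read_global_prefs_override, before the new values take effect.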

Please give me feedback, whether the changes I proposed solved the problem or not.
Thank you.
M
----------------------------------------
[Edit 1 times, last edit by Martin Schnellinger at Dec 11, 2020 5:45:52 PM]
[Dec 11, 2020 5:40:15 PM]
bWildered1
Cruncher
Joined: Oct 3, 2020
Post Count: 6
Re: Computation Error - Insufficient RAM?

Hi Martin,

Thanks for the information.

For the record, the original BOINC memory settings - Options/Disk and Memory/Memory - were 50% for 'When computer is in use...' and 90% for 'When computer is not in use,...', and the 'Leave non-GPU tasks in memory while suspended' option was NOT selected.

I have now made the changes you suggested.

I have a number of BOINC tasks queued and will wait for these to complete before conducting any further tests; this will take about 4 days.

A silly question...
Should the 'Leave non-GPU tasks in memory while suspended' option really be selected?
Would this not interfere with the normal operation of the system by preventing the OS from using this RAM when it might need it most?
Would the OS not shuttle the contents of this memory to swap and reload when it can?

Thanks for the help Martin.

Take care and keep safe

Mike
[Dec 11, 2020 7:07:26 PM]
Martin Schnellinger
Advanced Cruncher
Joined: Apr 29, 2007
Post Count: 128
Re: Computation Error - Insufficient RAM?

Hello Mike,
I have had the option to leave tasks in memory selected for years now (and this on a fairly old computer), and nothing bad has happened.
There should be no reason to worry.
I wish you success and hope that the problem disappears.
Stay safe in these strange times, too.
Best wishes
Martin
[Dec 11, 2020 7:22:54 PM]
bWildered1
Cruncher
Joined: Oct 3, 2020
Post Count: 6
Re: Computation Error - Insufficient RAM?

Here is the result of the Dublin jury...

Starting Condition: BOINC no tasks (work units), clean start

Enable 1 of 4 CPUs
Acquire Covid-19 work units - 8 work units [wu] acquired and 1 wu automatically started
Enable 2 of 4 CPUs - 2nd wu starts OK
Enable 3 of 4 CPUs - 3rd wu starts OK
Enable 4 of 4 CPUs - 4th wu starts OK

Check back regularly -> 4 wu running OK

Check back next day -> 2 wu's have failed (process got signal 11). Looks like 1 work unit failed, BOINC started next wu in queue and that immediately failed too (a little after 6AM).

All but one (1) of the original four (4) work units has processed to completion; the one remaining wu is almost complete - should be finished around 21:00 GMT today.

The suggestions made by Martin seem to have helped.
The system can now run 4 Covid-19 work units simultaneously at least some of the time.
But not all of the time.

This hints that there may be a boundary (edge or threshold) issue, and that the system (BOINC/Covid-19 app) does not handle the issue properly.

But then, what do I know...

Take care all

M
----------------------------------------
[Edit 1 times, last edit by bWildered1 at Dec 16, 2020 5:06:11 PM]
[Dec 16, 2020 3:46:39 PM]
Martin Schnellinger
Advanced Cruncher
Joined: Apr 29, 2007
Post Count: 128
Re: Computation Error - Insufficient RAM?

Hello Mike,
my goal would be to get the problem really fixed.
Everything should be running fine all the time.

Please look here:
https://boinc.mundayweb.com/wiki/index.php?title=Process_got_signal_11

It says:
This error can be caused by the infamous segmentation error (SIGSEGV error), meaning that something broke. be it your memory, virtual memory (page file) or it's a bad batch of tasks.

Check your system with memtest86+, make your page file anew, keep a tab on whether you're the only one returning these errors, or that others have them at the same time as well. If you are the only one, it's something on your system, if there are lots more, it's a bad batch of work.

But it can also happen when you use a 64bit operating system, while the project you're getting the crashes on only gives out 32bit applications (such as Einstein does). Installing the ia32 library package will fix this problem. See also process exited with code 22 which is a similar error, on the application level.




Let us think positively and suppose that it is not a segmentation error.

If this is the case, you can check whether you are running a 64-bit or a 32-bit research application, taking into account the following information:

Does World Community Grid offer a 64 bit version of the BOINC software?
No. Our recommended BOINC agents are those of the 32 bit variety because even though the client is 32 bit, it will run 64 bit applications. The agent does not do anything that requires 64 bit execution at this time, thus recommending a version that works on both makes life easier. The science applications are available in both 32 bit and 64 bit. A 64 bit science application can run on 32 bit BOINC client. The opposite is also true, a 32 bit science application can run on a 64 bit BOINC client.

How do I know if my computer is running the 64-bit research application?

On a Windows machine, you can use the Windows task manager to view the process name. 64-bit research applications will end with "windows_intelx86_64", while 32-bit applications will end with "windows_intelx86"



On a Linux machine, you can find the PID of the research application (which will start with the name "wcg") and then execute the command "file -L /proc/PID/exe"

On OS X we only support 64 bit applications, therefore all World Community Grid tasks will be running a 64 bit application.
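For the Linux case, the check can be done for all research applications at once. This is a small Python sketch of the same idea as "file -L /proc/PID/exe", assuming a Linux host and that the process names start with "wcg" as the help text says. It reads the class byte of the ELF header (offset 4: 1 means 32-bit, 2 means 64-bit); elf_class and wcg_processes are hypothetical helper names.

```python
import os

def elf_class(header):
    """Return '32-bit', '64-bit', or None from the first bytes of a file."""
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None  # not an ELF executable
    # Byte 4 of the ELF header is the class: 1 = 32-bit, 2 = 64-bit.
    return {1: "32-bit", 2: "64-bit"}.get(header[4])

def wcg_processes(proc_root="/proc"):
    """Yield (pid, name, class) for processes whose name starts with 'wcg'."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are process directories
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                name = f.read().strip()
            if not name.startswith("wcg"):
                continue
            with open(os.path.join(proc_root, entry, "exe"), "rb") as f:
                yield int(entry), name, elf_class(f.read(5))
        except OSError:
            continue  # process exited, or we lack permission to read it

if os.path.isdir("/proc"):
    for pid, name, bits in wcg_processes():
        print(pid, name, bits)
```

Reading /proc/PID/exe of a process owned by the boinc user normally needs root or the same user, so the script may print nothing when run from an ordinary account.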


Maybe this new info helps.
Best wishes
Martin
[Dec 17, 2020 7:14:21 AM]