World Community Grid - View Thread - What is supposed to happen when 'Leave applications in memory while suspended' is set?

Thread: What is supposed to happen when 'Leave applications in memory while suspended' is set?

Thread Status: Active
Total posts in this thread: 7


Jean-David Beyer
Senior Cruncher
USA
Joined: Oct 2, 2007
Post Count: 339
Status: Offline

What is supposed to happen when 'Leave applications in memory while suspended' is set?

Many projects list this among their requirements. But what is supposed to happen when it is set? I do enable the option when requested, but I do not understand what effect it is expected to have.

In today's multiprogramming environments, with demand paging and automatic memory management, if the OS suspends a process to allow another to run and there is not enough RAM, the memory manager writes the least recently used pages out to swap. The only way to stop this, in Linux anyway, is with the mlock(), mlock2(), and munlock() system calls. And to lock more than the small RLIMIT_MEMLOCK allowance, the process invoking them must be privileged. Since boinc processes are not privileged, what does leaving suspended processes in memory accomplish?
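As a side note on the mlock() point: on Linux, the amount of memory a process may lock without privilege is governed by the RLIMIT_MEMLOCK resource limit. A minimal sketch for inspecting that limit, assuming a Unix-like system where Python's standard `resource` module is available (the helper names `memlock_limits` and `describe` are made up for illustration):

```python
import resource

def memlock_limits():
    """Return the (soft, hard) RLIMIT_MEMLOCK limits in bytes."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)

def describe(limit):
    """Render a limit value; RLIM_INFINITY means no limit is imposed."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{limit} bytes"

if __name__ == "__main__":
    soft, hard = memlock_limits()
    print("soft limit:", describe(soft))
    print("hard limit:", describe(hard))
```

The values printed depend on the distribution and on how the process was started, so no particular output is assumed here.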

----------------------------------------

[Nov 9, 2019 5:07:14 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: What is supposed to happen when 'Leave applications in memory while suspended' is set?

Running workunits save "checkpoints" from time to time. So when you reboot your computer, they do not resume at the point where they stopped before the reboot, but at the last checkpoint they saved.

Sometimes, workunits are suspended. Either manually by the user, or by BOINC, e.g. to first compute another workunit with a nearing deadline. This has nothing to do with the OS suspending the process for a short time!

Now what happens if a workunit is suspended and resumed later? If 'Leave applications in memory while suspended' is NOT checked, the behaviour is the same as with a reboot: the workunit resumes from the latest saved checkpoint, losing the work done since then.
If the option is checked, on the other hand, the process remains in memory and later resumes exactly where it stopped; no work is lost.

I would suggest always checking that option unless you have a good reason not to. One such reason would be running projects that consume large amounts of memory on a machine that is short of it.
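The difference between the two cases can be put into a toy model. This is only a sketch of the behaviour described in this post, not BOINC's actual code; the function names, the step counter, and the fixed checkpoint interval are all assumptions made for illustration:

```python
def resume_point(progress, checkpoint_interval, leave_in_memory):
    """Return the step at which a suspended task picks up again.

    progress            -- steps completed when the task was suspended
    checkpoint_interval -- a checkpoint is written every this many steps
    leave_in_memory     -- True models the option being checked
    """
    if leave_in_memory:
        # The process stayed in memory, so it continues where it stopped.
        return progress
    # The process was removed from memory, so it restarts from the
    # last checkpoint that was written to disk.
    return (progress // checkpoint_interval) * checkpoint_interval

def lost_work(progress, checkpoint_interval, leave_in_memory):
    """Steps that must be recomputed after the task is resumed."""
    return progress - resume_point(progress, checkpoint_interval, leave_in_memory)

if __name__ == "__main__":
    # Task suspended at step 250, with checkpoints every 100 steps.
    print(lost_work(250, 100, leave_in_memory=True))
    print(lost_work(250, 100, leave_in_memory=False))
```

With the option checked nothing is recomputed; with it unchecked the task falls back to step 200 and redoes 50 steps. The longer the gap between checkpoints, the larger that loss can be.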

[Nov 9, 2019 7:43:51 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: What is supposed to happen when 'Leave applications in memory while suspended' is set?

Hi Jean-David,

I'm far from the most technical person on the forums, but my first point would be to note that it is possible to, and some do (including me), run with swapping off, as swapping also takes system resources and slows things down. I prefer to run with enough memory to do what I want, and try not to overload the system.

My second point is that BOINC does application scheduling of its own. I don't run multiple projects (I'm counting WCG as a single project, and I regard all the 'projects' that WCG runs as sub-projects), but when BOINC switches between projects, it may choose to save system resources by unloading the one being preempted. (There may be other situations; I don't know much in this area.) I'm sure it does its best not to lose too much processing time, but with things like ARP that run for many hours between checkpoints, I feel sure that processing would be lost unless 'Leave applications in memory' (LAIM) was set. That makes it a good idea to recommend its use in such circumstances.

I'm sure others can provide more/better information, but I hope this helps.

[Nov 9, 2019 7:53:22 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: What is supposed to happen when 'Leave applications in memory while suspended' is set?

The setting is global: it is set once and applies to all projects on a client.

Project swapping is in principle only done at a checkpoint, unless a high-priority job pops up. Those are the moments when you want to keep a job in memory, particularly if the checkpoints are many hours apart; otherwise the job resumes from the previous checkpoint. That is why WCG recommends it for ARP1.

The exception is the first checkpoint. A running task is kept in memory until it reaches the first checkpoint.

[Nov 9, 2019 9:10:59 PM]

Jean-David Beyer
Senior Cruncher
USA
Joined: Oct 2, 2007
Post Count: 339
Status: Offline

Re: What is supposed to happen when 'Leave applications in memory while suspended' is set?

Project swapping is in principle only done at checkpoint unless a high priority job pops up. It's those moments when you want to keep a job in memory, particularly if the intervals are many hours apart, else the job resumes from the previous checkpoint, the reason why WCG recommends it for ARP1.

Project swapping under control of the boinc client may follow the rules you suggest. But the Linux process scheduler follows its own rules, as does its memory management. Since the kernel can act at any time, and has no idea about any boinc checkpointing, this parameter to the boinc client makes little difference to what really happens.

At some point, some Linux process may need more and more RAM, and to get it, some boinc tasks may not only be suspended but also have their RAM swapped out. Later, when such a task runs again, its swapped-out pages are swapped back in as needed. And the running process has no way of knowing that this has happened.

So what practical difference does this make?

----------------------------------------

[Nov 10, 2019 8:24:42 PM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: What is supposed to happen when 'Leave applications in memory while suspended' is set?

So what practical difference does this make?

If LAIM is off, BOINC will remove the suspended task from memory.
If swapping is off, none of what you describe happens.
Edited to add:
Which means that, when a WU that has been removed from memory restarts, it will HAVE to restart from the last checkpoint. This is not good if an ARP1 WU has been running for four hours since the previous checkpoint -- four hours of work lost!

----------------------------------------
[Edit 1 times, last edit by Former Member at Nov 10, 2019 8:35:21 PM]

[Nov 10, 2019 8:30:46 PM]

hchc
Veteran Cruncher
USA
Joined: Aug 15, 2006
Post Count: 865
Status: Offline

Re: What is supposed to happen when 'Leave applications in memory while suspended' is set?

Jean-David Beyer said:

So what practical difference does this make?

I think you're confusing swapping with this setting. The two have nothing to do with each other. LAIM has everything to do with checkpointing: if a task is suspended for any reason with LAIM off, or if the system is rebooted or shut down, the work unit resumes from the most recent checkpoint, losing all work done since then (which could be hours of work).

If LAIM is enabled, however, when a task is suspended and then resumed, instead of resuming from the most recent checkpoint (from disk), it resumes exactly where it left off.

I can't think of any reason to have it disabled. It's a good setting.

----------------------------------------

i5-7500 (Kaby Lake, 4C/4T) @ 3.4 GHz
i5-4590 (Haswell, 4C/4T) @ 3.3 GHz
i5-3570 (Ivy Bridge, 4C/4T) @ 3.4 GHz

[Nov 11, 2019 2:53:33 AM]
