Thread Status: Active
Total posts in this thread: 16
This topic has been viewed 2725 times and has 15 replies
leloft
Cruncher
Joined: Jun 8, 2017
Post Count: 23
Manually control order of workunits

Hello. My work cache contains more work than can be done before the deadlines. Specifically, it contains 48 ARP units with estimated run times of 72 hrs; 21 of these are being worked on, with estimated times to completion of between approx. 1 and 21 hrs, and the remaining ones have deadlines of approx. 96 hrs. However, the work cache also contains 23 OPN units (est. times of 6h, deadlines 46h). I have already manually aborted 3 (est. 72hr) ARP units 48h ahead of their deadlines in the hope that they will be picked up as stragglers.
Is there any way that I can manually prioritise the processing of the ARP units above the OPN ones to minimise the number of work units that have to be aborted?
I'm not sure how this overload happened, although I have been having problems with the device profiles not being enforced.
* edit: I have set 'no more work' via boinccmd until the backlog is cleared and set memory use to 100%.
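
As a rough check on the figures above, the following sketch estimates how many threads would have to run continuously to finish a batch before its deadline. The function name `threads_needed` is hypothetical. It assumes BOINC's run-time estimates are accurate, that every unit in a batch shares the same deadline, and that 27 ARP units are still waiting (48 minus the 21 in progress).

```shell
#!/bin/sh
# threads_needed UNITS HOURS_PER_UNIT HOURS_TO_DEADLINE
# Prints how many threads must run continuously so that UNITS tasks,
# each taking HOURS_PER_UNIT on one thread, finish before the deadline.
threads_needed() {
    # Whole tasks a single thread can complete before the deadline.
    per_thread=$(( $3 / $2 ))
    if [ "$per_thread" -eq 0 ]; then
        echo "impossible"
        return 1
    fi
    # Ceiling division: threads needed to cover all units.
    echo $(( ($1 + per_thread - 1) / per_thread ))
}

# 27 waiting ARP units, 72 h each, roughly 96 h to the deadline
threads_needed 27 72 96   # prints 27
# 23 OPN units, 6 h each, 46 h to the deadline
threads_needed 23 6 46    # prints 4
```

Under these assumptions each thread can finish only one 72-hour ARP unit inside a 96-hour window, so the waiting units alone would need 27 threads.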
----------------------------------------
[Edit 1 times, last edit by leloft at Aug 10, 2021 9:09:00 AM]
[Aug 10, 2021 8:55:17 AM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2089
Re: Manually control order of workunits

Is there any way that I can manually prioritise the processing of the ARP units above the OPN ones to minimise the number of work units that have to be aborted?

Yes.
You could make use of the file app_config.xml and limit the number of concurrent OPN-tasks, like this:
$ cat > app_config.xml <<'EOF'
<app_config>
  <app>
    <name>opn1</name>
    <max_concurrent>5</max_concurrent>
  </app>
</app_config>
EOF
(Setting 5 OPN1-tasks as the limit, as an example.)

Put the file app_config.xml into BOINC's subdirectory projects/www.worldcommunitygrid.org/ and force BOINC to re-read the config files, e.g. with the following command:
boinccmd --read_cc_config

Just for fun, if you have installed the file correctly, try running this command:
boinccmd --get_app_config http://www.worldcommunitygrid.org
(It will show BOINC's understanding of the file's contents.)
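
The steps above could be wrapped in a small script along the following lines. This is only a sketch: the function name `write_app_config` is made up for illustration, and the data directory `/var/lib/boinc-client` is an assumption (it is the usual location for Debian packages; other installations differ). Note that it overwrites any existing app_config.xml in the project directory.

```shell
#!/bin/sh
# write_app_config PROJECT_DIR APP_NAME LIMIT
# Writes an app_config.xml into PROJECT_DIR that caps the number of
# concurrently running tasks of APP_NAME at LIMIT.
# Any existing app_config.xml in PROJECT_DIR is overwritten.
write_app_config() {
    cat > "$1/app_config.xml" <<EOF
<app_config>
  <app>
    <name>$2</name>
    <max_concurrent>$3</max_concurrent>
  </app>
</app_config>
EOF
}

# Assumed data directory; set BOINC_DIR to match your installation.
BOINC_DIR="${BOINC_DIR:-/var/lib/boinc-client}"
PROJECT_DIR="$BOINC_DIR/projects/www.worldcommunitygrid.org"

# Only act if the project directory exists and is writable.
if [ -d "$PROJECT_DIR" ] && [ -w "$PROJECT_DIR" ]; then
    write_app_config "$PROJECT_DIR" opn1 5
    boinccmd --read_cc_config   # ask the client to re-read its config files
fi
```

If the project directory is missing or not writable, the script does nothing, so it is safe to try before adjusting the path.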
----------------------------------------
[Edit 1 times, last edit by adriverhoef at Aug 10, 2021 10:21:01 AM]
[Aug 10, 2021 10:00:40 AM]
leloft
Cruncher
Joined: Jun 8, 2017
Post Count: 23
Re: Manually control order of workunits

Thank you. That was far more straightforward than I had hoped! Very clear and helpful instructions.
[Aug 10, 2021 10:31:13 AM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12146
Re: Manually control order of workunits

leloft

Going forward, to prevent a recurrence of the problem, you should amend the cache limits in your Device Profiles. Currently, ARP units are readily available (within an hour) and OPN & MCM are instantly available. There is no need to hold more than a few spares in excess of the numbers being crunched.

For instance, on an 8-thread machine you could have app_config.xml set to crunch 4 ARP, 3 OPN & 2 MCM. The recommended maximum for ARP is half of the threads, and letting the limits add up to 1 more than the total number of threads (here 4 + 3 + 2 = 9) allows for shortages of any one project.

Then the profile could be set to a maximum of 5 ARP, 4 OPN & 3 MCM so there is always 1 spare of each to allow for the time between completing a unit and the next one being downloaded.

For a different number of threads available, scale those figures up or down in proportion.
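
One possible way to scale those figures is sketched below. The function name `wcg_limits` and the exact split between OPN and MCM are assumptions for illustration: ARP gets half the threads, the three limits add up to the thread count plus one, and OPN takes the larger share of what is left. Other splits are equally valid.

```shell
#!/bin/sh
# wcg_limits THREADS
# Prints "ARP OPN MCM" max_concurrent values for a machine with THREADS
# threads: ARP gets half the threads and the three limits total THREADS + 1.
wcg_limits() {
    arp=$(( $1 / 2 ))
    rest=$(( $1 + 1 - arp ))
    opn=$(( (rest + 1) / 2 ))   # OPN takes the larger half of the remainder
    mcm=$(( rest - opn ))
    echo "$arp $opn $mcm"
}

wcg_limits 8   # prints: 4 3 2
wcg_limits 4   # prints: 2 2 1
```

The profile limits would then be each value plus one, for example 5, 4 and 3 on an 8-thread machine.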

The more you hold in cache, the less likely your machine is to be considered 'reliable' by WCG. It also slows down the production of new ARP units and delays the validation of your wingman's units.

If you have more than one machine then app_config.xml should be installed on each machine. The max_concurrent can be different on each machine or the same, but stick to the maximum of 50% of threads for ARP. You can have different profiles on different machines or use one for all machines.

Mike
[Aug 10, 2021 12:13:04 PM]
leloft
Cruncher
Joined: Jun 8, 2017
Post Count: 23
Re: Manually control order of workunits


If you have more than one machine then app_config.xml should be installed on each machine. The max_concurrent can be different on each machine or the same, but stick to the maximum of 50% of threads for ARP. You can have different profiles on different machines or use one for all machines.

Thank you. I have set up app_config.xml on the three machines that are using arp1, using the parameters you suggest for each.
Many thanks
[Aug 10, 2021 2:01:18 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12146
Re: Manually control order of workunits

I should have mentioned that you have to activate app_config.xml on each machine by clicking on Options and then Read Config Files each time you make a change.

Mike
[Aug 10, 2021 2:20:33 PM]
hiimebm
Senior Cruncher
United States
Joined: Oct 19, 2014
Post Count: 305
Re: Manually control order of workunits

app_config.xml would work but is not necessary here since, as mentioned, you can control the maximum number of workunits from each individual project on your Device Profiles page. You may also want to set the queue to "0" days in the Manager software itself.
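
For reference, the Manager stores its local work-buffer settings in global_prefs_override.xml in the BOINC data directory, so the same change could be scripted roughly as follows. The function name `write_zero_cache` and the data directory path are assumptions for illustration, and the sketch overwrites any existing override file, so any other local preferences in it would be lost.

```shell
#!/bin/sh
# write_zero_cache DATA_DIR
# Writes a global_prefs_override.xml that sets both work-buffer
# preferences to 0 days. Overwrites any existing override file.
write_zero_cache() {
    cat > "$1/global_prefs_override.xml" <<EOF
<global_preferences>
  <work_buf_min_days>0</work_buf_min_days>
  <work_buf_additional_days>0</work_buf_additional_days>
</global_preferences>
EOF
}

# Assumed data directory; set BOINC_DIR to match your installation.
BOINC_DIR="${BOINC_DIR:-/var/lib/boinc-client}"

# Only act if the data directory exists and is writable.
if [ -d "$BOINC_DIR" ] && [ -w "$BOINC_DIR" ]; then
    write_zero_cache "$BOINC_DIR"
    boinccmd --read_global_prefs_override   # make the client pick up the change
fi
```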
[Aug 10, 2021 2:23:59 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12146
Re: Manually control order of workunits

Actually, app_config.xml is necessary because the other projects are so much shorter and there would be an imbalance if you hold a spare or spares in cache. ARP would hog the machine to the limit of its cache most of the time.

Mike
[Aug 11, 2021 1:38:49 AM]
leloft
Cruncher
Joined: Jun 8, 2017
Post Count: 23
Re: Manually control order of workunits

I have taken and implemented all the advice given over the last few days. I have just had to manually abort over a hundred ARP units that were sent to 2 (4-core) machines in the last few hours. Over 50 of them were downloaded even after I issued nomorework and updated the project. I was able to prevent the download of more work by suspending network activity. I have changed the profile of both machines to default (no ARP) while they chew their way through a few hundred OPN/MCM units.
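
For anyone wanting to repeat those steps from the command line, they might look roughly like the sketch below. The helper names `run` and `stop_fetching` are hypothetical, and the project URL is assumed to match the one the client has registered. By default the script only prints the commands; set RUN=1 to execute them.

```shell
#!/bin/sh
# Assumed project URL; it must match the URL shown by the BOINC client.
WCG_URL="http://www.worldcommunitygrid.org"

# Dry-run helper: prints the command unless RUN=1 is set in the environment.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

stop_fetching() {
    run boinccmd --project "$WCG_URL" nomorework   # stop requesting new tasks
    run boinccmd --project "$WCG_URL" update       # contact the project server
    run boinccmd --set_network_mode never          # suspend all network activity
}

stop_fetching
```

To undo this later, the matching commands are `boinccmd --set_network_mode auto` and `boinccmd --project URL allowmorework`.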

I am at a loss. How could this possibly have happened? The shared profile was set to 2 ARP, 2 OPN and 1 MCM, with a work cache of 1 day.

This is also a heads-up to the project admins: there are a hundred or so ARP units that have just been aborted. According to my results status, several of them appear to have been processed, but I only aborted the waiting and downloading ones.

I'd very much appreciate hearing from someone who is running a Debian Buster build of BOINC 7.16.16!
[Aug 13, 2021 2:03:32 PM]
sam6861
Advanced Cruncher
Joined: Mar 31, 2020
Post Count: 107
Re: Manually control order of workunits

There may be some BOINC bugs with the use of max_concurrent that can cause constant fetches of up to 1000 tasks.
https://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,43530
It is probably best to avoid BOINC's per-app max_concurrent setting, which has some bugs.

On this website, go to Settings, then Device Manager, choose a profile, and scroll down to the project limits. Check every device and profile, since some might be set to unlimited. After making changes, press Save.
[Aug 13, 2021 5:11:38 PM]