Total posts in this thread: 3520
This topic has been viewed 4352526 times and has 3519 replies
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12564
Re: Work Available

Thanks, Kevin

Generations up to 079 have progressed in the last 4 days by 29,550 units out of 34,815 units returned, so 85% of total returns. We seem to be cracking the laggards.

We have moved on a generation in the 4 days, and the total outstanding units to complete generation 086 is 191,727, compared with 191,544 to complete generation 086 4 days before.

Mike.
[Aug 11, 2021 1:20:48 AM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2218
Re: Work Available

Crystal Pellet posted:
All last 8 ARP-tasks received were from wingman errors from the types:
couldn't start app: Can't get shared memory segment name: shmget() failed
or
couldn't start app: CreateProcess() failed - A required privilege is not held by the client.

Seeing a lot of them at the moment, where this one takes the cake:

workunit 776245489:
ARP1_0008692_087_2--  Linux          -    In Progress  8/11/21 10:55:54  8/19/21 10:55:54  0.00    0.0 / 0.0
ARP1_0008692_087_3--  Linux  Fedora  -    In Progress  8/11/21 10:47:04  8/15/21 22:47:04  0.00    0.0 / 0.0
ARP1_0008692_087_0--  MSWin  10      727  Error        8/11/21 10:40:34  8/11/21 10:42:38  0.00  503.2 / 0.0
ARP1_0008692_087_1--  MSWin  10      727  Error        8/11/21 10:32:23  8/11/21 10:36:44  0.00    0.0 / 0.0
---------------------------------------------------------------------------------------------------------------------------------------
Details:
Project Name: Africa Rainfall Project
Created: 08/11/2021 07:01:16
Name: ARP1_0008692_087
Minimum Quorum: 2
Replication: 2
ARP1_0008692_087_2-- Linux - In Progress 8/11/21 10:55:54 8/19/21 10:55:54 0.00 0.0 / 0.0
ARP1_0008692_087_3-- Linux Fedora - In Progress 8/11/21 10:47:04 8/15/21 22:47:04 0.00 0.0 / 0.0
ARP1_0008692_087_0-- MSWin 10 727 Error 8/11/21 10:40:34 8/11/21 10:42:38 0.00 503.2 / 0.0
<core_client_version>7.14.3</core_client_version>
<![CDATA[
<message>
couldn't start app: CreateProcess() failed - A required privilege is not held by the client.
(0x522)</message>
]]>
ARP1_0008692_087_1-- MSWin 10 727 Error 8/11/21 10:32:23 8/11/21 10:36:44 0.00 0.0 / 0.0
<core_client_version>7.14.3</core_client_version>
<![CDATA[
<message>
couldn't start app: Can't get shared memory segment name: shmget() failed</message>
]]>
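
The two failure reasons in the details above can be pulled out of the result log mechanically. This is only a minimal sketch in Python, assuming the log text is available as a string in exactly the form quoted; the helper name `parse_start_failure` is hypothetical and not part of BOINC. The code 0x522 is decimal 1314, which Windows defines as ERROR_PRIVILEGE_NOT_HELD, so it agrees with the message text.

```python
import re

# Hypothetical helper (not a BOINC API): extracts the failure reason and the
# optional hexadecimal error code from a result log snippet as quoted above.
MESSAGE_RE = re.compile(r"<message>\s*(.*?)\s*</message>", re.DOTALL)
CODE_RE = re.compile(r"\((0x[0-9a-fA-F]+)\)")

def parse_start_failure(log_text):
    """Return (reason, error_code); error_code is an int or None."""
    match = MESSAGE_RE.search(log_text)
    if match is None:
        return None, None
    body = match.group(1)
    code_match = CODE_RE.search(body)
    code = int(code_match.group(1), 16) if code_match else None
    # Drop the "(0x...)" suffix and collapse line breaks into single spaces.
    reason = " ".join(CODE_RE.sub("", body).split())
    return reason, code

privilege_log = """<core_client_version>7.14.3</core_client_version>
<![CDATA[
<message>
couldn't start app: CreateProcess() failed - A required privilege is not held by the client.
(0x522)</message>
]]>"""

shmem_log = """<core_client_version>7.14.3</core_client_version>
<![CDATA[
<message>
couldn't start app: Can't get shared memory segment name: shmget() failed</message>
]]>"""

print(parse_start_failure(privilege_log))
print(parse_start_failure(shmem_log))
```

The first call yields the error code as an integer (1314); the second has no code in the log, so the code comes back as `None`.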

[Aug 11, 2021 11:10:48 AM]
Acibant
Advanced Cruncher
USA
Joined: Apr 15, 2020
Post Count: 126
Re: Work Available

As best I can tell with some searching, the shared memory segment error should have been fixed in an older version but could still occur if more tasks are somehow started than there are cores/threads available. Citation.

The "required privilege is not held by the client" error seems to stem from a service install of BOINC where the account running the service no longer has the appropriate rights. Citation.

Unfortunately, the people running those clients would have to examine their own configurations and post them here for us to have any hope of resolving the issues.
[Aug 11, 2021 1:52:07 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12564
Re: Work Available

I see that generation 080 has joined the ranks of the stragglers.

I haven't been getting many stragglers. I have mostly been getting re-sends classified as priority. The problem is that the one normal unit I have keeps being pushed back by new priority cases, so it will take at least 3 days to be returned, and counting. It will infringe the 'reliable' status.

Mike
[Aug 11, 2021 7:26:55 PM]
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1330
Re: Work Available

I'm running 8 at a time. All 8 again shorter deadlines.
3 because they are stragglers: generation 075, 077 and 080
and 5 because of wingmen errors.
To keep your reliability, only request new ARP work when the running tasks are almost finished and, if necessary, push them to the front of the queue by suspending OPNs etc.
[Aug 12, 2021 8:15:12 AM]
MJH333
Senior Cruncher
England
Joined: Apr 3, 2021
Post Count: 274
Re: Work Available

Hello Mike
I’ve got my first 087: ARP1_0005102_087

Cheers,
Mark
[Aug 12, 2021 9:23:00 AM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12564
Re: Work Available

Thank you, Mark

087 indicates we are at about 47.5%, but for this month I am again assuming 2 generations behind to allow for the stragglers, so 46.4%.
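
The two percentages can be reproduced with a small Python sketch. The total of 183 generations is an assumption: the post does not state it, but it is the figure consistent with both numbers quoted. The function name `progress_percent` is illustrative only.

```python
# Assumption: 183 generations in total. The post does not give this figure;
# it is the value that reproduces both percentages quoted above.
TOTAL_GENERATIONS = 183

def progress_percent(generation, lag=0, total=TOTAL_GENERATIONS):
    """Percentage complete, optionally discounted by `lag` generations."""
    return 100.0 * (generation - lag) / total

print(f"{progress_percent(87):.1f}%")         # 47.5%, the headline figure
print(f"{progress_percent(87, lag=2):.1f}%")  # 46.4%, allowing for stragglers
```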

The latest interval is 4.66806 days and the 10-interval average is down to 2.94800 days. The end date forecast would have been May 2022, but, based on Kevin Reed's data on the stragglers, I expect it to be about October or November 2022.
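
The unadjusted forecast appears to be a straight extrapolation of the 10-interval average over the remaining generations. A rough Python sketch of that arithmetic follows, again assuming 183 generations in total (not stated in the post) and taking the date of this post as the starting point. It does not model the straggler adjustment that moves the estimate to October or November 2022.

```python
from datetime import datetime, timedelta

TOTAL_GENERATIONS = 183          # assumption, not stated in the post
CURRENT_GENERATION = 87          # latest generation reported
AVERAGE_INTERVAL_DAYS = 2.94800  # 10-interval average quoted above

def forecast_end(start, current, total, interval_days):
    """Extrapolate the finish date from a constant interval per generation."""
    remaining = total - current
    return start + timedelta(days=remaining * interval_days)

start = datetime(2021, 8, 12)  # date of the post, used as the starting point
end = forecast_end(start, CURRENT_GENERATION, TOTAL_GENERATIONS,
                   AVERAGE_INTERVAL_DAYS)
print(end.date())
```

With these inputs, 96 remaining generations at 2.948 days each come to roughly 283 days, which lands in May 2022.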

I would expect the next generation to start about 16 August.

Mike
[Aug 12, 2021 10:17:05 AM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12564
Re: Work Available

Crystal Pellet

Thanks for the reply.

Your suggestion would work but it would require me to keep changing my cache limits up and down every time a unit was about to finish, including overnight.

I have an 8-thread machine so am running 4 ARP using app_config.xml and holding just 1 spare by having a limit of 5 in my profile.
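
For anyone wanting to try the same setup, here is a hedged Python sketch that generates such a file. `max_concurrent` is a standard app_config.xml element; the application short name `arp1` is an assumption and should be checked against the names the client reports. The helper `build_app_config` is hypothetical. The limit of 5 tasks is set in the device profile on the website, not in this file.

```python
import xml.etree.ElementTree as ET

def build_app_config(limits):
    """Build an <app_config> tree from {app_short_name: max_concurrent}."""
    root = ET.Element("app_config")
    for app_name, max_concurrent in limits.items():
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = app_name
        ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    return root

# Assumption: "arp1" is the short name of the ARP application.
root = build_app_config({"arp1": 4})
if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
    ET.indent(root)
print(ET.tostring(root, encoding="unicode"))
```

The printed XML would be saved as app_config.xml in the project's directory under the BOINC data folder, after which the client has to re-read its config files.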

Occasionally, I get one with the standard deadline, so that becomes my spare. Then I get a whole series of priority cases, mostly resends. Each time I finish a unit my spare starts up and then stops again when a new priority unit arrives. It has now amassed 22 minutes and 35 seconds in bursts of, say, 2 minutes every 6 hours or so over the last nearly 3 days. In another day it will be higher in the pecking order than new priority units and will then complete. By the time it gets going I will have had it for 4 days and it will take another day, so 5 days total.

Mine is a reliable machine, almost never producing errors, but this unit will infringe the 'reliability' test of WCG. It is simply an effect of the catch-up exercise on the stragglers.

I think I will put up with it if it gets demoted. It would soon get promoted again.

Mike
[Aug 12, 2021 11:07:06 AM]
leloft
Cruncher
Joined: Jun 8, 2017
Post Count: 23
Re: Work Available

Crystal Pellet posted:
I'm running 8 at a time. All 8 again shorter deadlines.
3 because they are stragglers: generation 075, 077 and 080
and 5 because of wingmen errors.
To keep your reliability, only request new ARP-work, when the running ones are almost ready and evt. push them to the front of the queue by suspending OPN's etc.

I have been managing a heavily overloaded work cache through a pro-active use of app_config.xml as trying to do it through device profiles alone just doesn't work. Some work units have developed estimated times that are very close or equal to the deadline.
I have suspended 12 units with the most favourable 'est to deadline' times so that only 12 of the 24 remaining are running; these 12 now utilise all 24 cores to different extents and seem to be reducing the 'est' times significantly: certainly, they appear to progress visibly faster in boinctui. This has led me to consider a new strategy:
If the device profile were set to maintain a small (0.5d) cache with a number of spare units and CPU availability set at 100%, and app_config were set to keep a core (or more) free (e.g. ARP 12/24, OPN 6/24, MCM 5 (or fewer)/24) for BOINC to use for 'on-demand' parallel processing of sub-tasks, is it possible that more total work could get done per unit time? If so, would 'optimal' cache and app_config values look something like these?
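
As a quick check of how many threads those caps leave free, here is a small Python sketch. The function name `free_threads` is illustrative, and the sketch only does the arithmetic; it says nothing about whether a free core actually raises throughput.

```python
def free_threads(total_threads, caps):
    """Threads left over once every per-project cap is fully used."""
    used = sum(caps.values())
    if used > total_threads:
        raise ValueError(
            f"caps total {used}, but only {total_threads} threads exist")
    return total_threads - used

# Caps taken from the post; MCM is an upper bound, so 5 is the worst case.
caps = {"ARP": 12, "OPN": 6, "MCM": 5}
spare = free_threads(24, caps)
print(spare)  # 1
```

Lowering the MCM cap below 5 frees one more thread for each unit it is reduced by.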
Many thanks
[Aug 12, 2021 11:15:53 AM]
maeax
Advanced Cruncher
Joined: May 2, 2007
Post Count: 142
Re: Work Available

I have a device profile (e.g. "work") with all the projects I use.
ARP is set to two tasks, all the others to unlimited.
No app_config. Everything is running well.
A second profile, without ARP, is for the other computer.
Edit: That comes to about 60 days of WCG work every day.
----------------------------------------
AMD Ryzen Threadripper PRO 3995WX 64-Cores/ AMD Radeon (TM) Pro W6600. OS Win11pro
----------------------------------------
[Edit 1 times, last edit by maeax at Aug 12, 2021 12:48:10 PM]
[Aug 12, 2021 12:24:38 PM]