Thread Status: Active
Total posts in this thread: 3520
This topic has been viewed 4353546 times and has 3519 replies
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 1114
Status: Offline
Re: Work Available

Wow, new ARPs. That is interesting. I haven't gotten any in a while, but I'll keep hoping.
[Dec 25, 2022 5:11:45 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12564
Status: Offline
Re: Work Available

According to ....generations.txt there were 22 extreme units validated in the last 24 hours, averaging 20 hours to complete.

Mike
[Dec 26, 2022 1:46:50 AM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12564
Status: Offline
Re: Work Available

Sunday Report - Xmas special

There has been a lack of data for the last 5 weeks due to a pause and then a data publication stoppage. About 34,000 units were validated in those 5 weeks, an average of under 1,000 per day.

Assuming that a full generation 182 will be the last, there are 1,698,665 units still outstanding, so my forecast end date would now be 17 October 2027. However, we are still coming out of testing, so we should finish well before then.
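
The forecast above is a simple rate extrapolation, and a minimal sketch of that arithmetic follows. It uses only the rounded figures quoted in the post (about 34,000 units in 5 weeks, 1,698,665 outstanding) and assumes the report date is Sunday 25 December 2022. Mike's exact inputs and method are not given, so the result lands in October 2027 but need not match his date to the day.

```python
from datetime import date, timedelta
from math import ceil

# Figures quoted in the post; the validated count is rounded ("about 34,000").
validated_units = 34_000
period_days = 5 * 7
outstanding_units = 1_698_665

# Assumption: the report date is Sunday 25 December 2022.
report_date = date(2022, 12, 25)

# Average validation rate over the period, then days needed at that rate.
daily_rate = validated_units / period_days
days_remaining = ceil(outstanding_units / daily_rate)
forecast_end = report_date + timedelta(days=days_remaining)

print(f"average rate: {daily_rate:.0f} units/day")
print(f"days remaining: {days_remaining}")
print(f"forecast end: {forecast_end.isoformat()}")
```

Because the rate sits in the denominator, small changes in the validated count move the end date by days or weeks, which is why a rounded input gives only an approximate match.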

The definitions of normal, accelerated & extreme have moved on to generations 142, 132 & 127, respectively.

There are 38 Extremes and 23 Accelerated units, although the numbers in their generations are 1,465 & 4,381 due to lack of movement/change of definition.

Some extremes have been released but I haven't seen them myself.

Mike
[Dec 26, 2022 2:07:41 AM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2218
Status: Recently Active
Re: Work Available

"There are 38 Extremes and 23 Accelerated units, although the numbers in their generations are 1,465 & 4,381 due to lack of movement/change of definition."

Mike, could you explain what the file state.txt is trying to make clear?
+--------------+----------------+-------------+
| number_units | max_generation | type        |
+--------------+----------------+-------------+
|           23 |            132 | Accelerated |
|           38 |            127 | Extreme     |
|        29125 |            142 | Normal      |
+--------------+----------------+-------------+

You say: "There are 38 Extremes and there are 1465 in their generations". When I count the number of Extremes in generations.txt, they (1+1+1+2+1+2+2+4+2+2+7+2+2+4+3+4+1+5+4+2+9+4+1+4+5+2+3+7+20+96+519+743) add(ed) up to 1465 (yesterday). What is that number of 38 then?
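
The per-generation counts quoted above can be re-added mechanically. This is only a check of the arithmetic in the post; the list is copied from the sum as written, and the variable names are illustrative.

```python
# Per-generation Extreme counts, copied in order from the sum quoted above.
extreme_counts = [
    1, 1, 1, 2, 1, 2, 2, 4, 2, 2, 7, 2, 2, 4, 3, 4,
    1, 5, 4, 2, 9, 4, 1, 4, 5, 2, 3, 7, 20, 96, 519, 743,
]

total_extremes = sum(extreme_counts)
print(f"{len(extreme_counts)} generations, {total_extremes} units")
```

The total agrees with the 1,465 figure from generations.txt, so the open question is only what the 38 in state.txt counts.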
[Dec 26, 2022 1:19:48 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12564
Status: Offline
Re: Work Available

adri

My guess is that the lower number is those that are "registered" (my quotes and terminology) due to lack of movement of the rest.

For example, when the classification of a generation moves on, I presume the units in that newly classified generation do not get registered as the new class until they have been distributed.

This is just a guess as so little information as to what is going on at Krembil is disseminated by them.

IBM used to clarify some of my queries when they were running things and those 3 reports were established as a consequence of my questions.

"The blind leading the blind" seems appropriate.

Mike
[Dec 26, 2022 6:19:30 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 1065
Status: Offline
Re: Work Available

Mike, Adri;

Firstly, regarding the counts given in state.txt -- up until 31st October 2022 (inclusive) the counts seemed to include every grid cell, whereas from 1st November onwards the counts only sum to 29186...

On 31st October 2022 state.txt was as follows:
+--------------+----------------+-------------+
| number_units | max_generation | type        |
+--------------+----------------+-------------+
|          102 |            124 | Extreme     |
|          168 |            130 | Accelerated |
|        35339 |            140 | Normal      |
+--------------+----------------+-------------+

whilst on 1st November it was as follows:
+--------------+----------------+-------------+
| number_units | max_generation | type        |
+--------------+----------------+-------------+
|           47 |            124 | Extreme     |
|           20 |            130 | Accelerated |
|        29119 |            140 | Normal      |
+--------------+----------------+-------------+

with 6423 cells no longer included.
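
As a rough check of those totals, the two snapshots can be parsed and compared. The sketch below assumes state.txt is exactly the three-column ASCII table shown above; `parse_state` is a hypothetical helper written for this illustration, not anything published by the project.

```python
def parse_state(text: str) -> dict[str, int]:
    """Return {type: number_units} from a state.txt-style ASCII table."""
    counts = {}
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Data rows have three cells and a numeric first column;
        # border and header rows are skipped.
        if len(cells) == 3 and cells[0].isdigit():
            counts[cells[2]] = int(cells[0])
    return counts


# Snapshots as quoted in the post.
OCT_31 = """
| number_units | max_generation | type |
| 102 | 124 | Extreme |
| 168 | 130 | Accelerated |
| 35339 | 140 | Normal |
"""
NOV_1 = """
| number_units | max_generation | type |
| 47 | 124 | Extreme |
| 20 | 130 | Accelerated |
| 29119 | 140 | Normal |
"""

before = parse_state(OCT_31)
after = parse_state(NOV_1)
total_before = sum(before.values())
total_after = sum(after.values())
dropped = {kind: before[kind] - after[kind] for kind in before}

print(total_before, total_after, total_before - total_after)
print(dropped)
```

The per-type differences show that most of the missing cells came out of the Normal row, with smaller drops in Extreme and Accelerated.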

It doesn't directly answer the question as to what is being counted, but (assuming that some aspects of the generation management take place on the non-BOINC database) it suggests that there may be some issue in the connection between the two databases, as often mentioned in web-site related contexts. It would be interesting to see the queries used to generate the three reports [for a user-based forensic effort], but I don't think we're likely to get lucky in that matter :-)

And on that note, Mike remarked:
"This is just a guess as so little information as to what is going on at Krembil is disseminated by them."
I rather suspect that as far as Igor's team are concerned this is part of the "IBM set it up and we're leaving it alone unless it is an obvious source of system failure" section which will [slowly] get resolved as they sort out the more critical aspects of the inter-database communication problems. I wonder how much of the stuff that worked fine in an all-IBM environment was taken for granted when documenting systems; do we know how much worthwhile documentation was provided? Mike's "blind leading the blind" assessment may be nearer the truth than one might hope would be the case...

----

On the subject of Adri's one-off work unit -- do we officially know that the work generation for each category is effectively separated out? There's plenty of evidence in the current generation shifts and average completion times to suggest that there's a constant [but small] trickle of Extreme units and that the same may be true for Accelerated units...

I actually looked at the work units surrounding Adri's oddity - there was one other ARP1 unit there, and everything else appeared to be MCM1. Given the reasonably swift turn-around being reported for Extremes at present, it must happen once or twice a day -- perhaps it's regulated by the number of units that actually get assimilated and purged on a given day?

The above may also apply to Accelerated cells, but it would appear that Normal cells may not get another batch issued until all stragglers from the last batch have returned. If anyone can tell us what the real situation is regarding triggers for new work, I'd love to know!

If there is actually a trickle of work going out for those categories the odds against any of the regular ARP1 posters getting one would be pretty remote, I suspect -- sorry, unixchick :-( -- so we'd be none the wiser about the new work...

Cheers - Al.
[Dec 26, 2022 10:40:32 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2218
Status: Recently Active
Re: Work Available

Al, that is an exc-Al-lent observation, showing these two tables of state.txt from the recent past, not older than two months. On one day, the total number of workunits (or grid cells) in state.txt suddenly dropped from 35609 to 29186 ...
"with 6423 cells no longer included."

Al suggested:
"it would appear that Normal cells may not get another batch issued until all stragglers from the last batch have returned."

Apart from that I've 'caught' one new Normal task - and nothing else, ARP1-wise - this morning:
workunit 236797124
App: Africa Rainfall Project
Workunit: ARP1_0030835_135
Created: 2022-12-27T06:48:43
Quorum: 2
Replication: 2

ARP1_0030835_135_0 Linux Debian In Progress 2022-12-27T07:00:23 2023-01-02T07:00:23
ARP1_0030835_135_1 Fedora Linux In Progress 2022-12-27T07:00:21 2023-01-02T07:00:21

(A quick search of the 500 workunits before and after that workunit ID, a span of only 3 minutes, showed no other ARP1 tasks.)
The workunit shown above was generated this morning, so this is probably only a tiny trickle, a casual passer-by.

The other machine of mine is still processing 12 ARP1-tasks on all 12 of its threads. :-D They are all resends from generation 135.
----------------------------------------
[Edit 2 times, last edit by adriverhoef at Dec 27, 2022 12:50:50 PM]
[Dec 27, 2022 12:45:09 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12564
Status: Offline
Re: Work Available

I noticed when units started to flow a couple of weeks ago that they were releasing in generation order starting with 134, then 135 next day, then 136, etc.
I guess it was their way of staggering the flow. I doubt they were being returned that quickly.

Mike
[Dec 27, 2022 2:27:07 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2218
Status: Recently Active
Re: Work Available

That's an interesting viewpoint, Mike. I want to share mine also, if you don't mind.
Looking from 1 November onwards, I see the first tasks appear on 4 November. First a few Extremes (116, 110, 119, 124, 108, 123, 117, 123), then a few 130 and 131, followed by 132 on one machine and, twelve hours later, 133 to 139. Then resends, the last one on 18 November.
After a long wait and tapping Krembil on the shoulder ('can we get more?'), there were two resends on 8 December, 132 and 133 the next day, and 135 the day after. In the days that followed I saw 136 to 140, then resends.
On 19 December I started again with 134 and 135. Then resends. And, on 24 December, this sudden 117, and on 27 December, one new 135.

There were two questions that I wanted to ask Cyclops. One was what query generates state.txt, to shed some light on the matter; the other was what is happening to the many generations that seem to be 'stuck'. I have saved generations.txt from 30 November 2022 and selected the columns 'generation' and 'num_units_currently_on_generation':
014 | 1
016 | 1
017 | 1
098 | 2
099 | 1
100 | 2
101 | 2
102 | 4
103 | 2
104 | 2
105 | 7
106 | 2
107 | 2
108 | 4
109 | 3
110 | 4
Their values (num_units…) haven't changed in four weeks.
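
One way to see that these values are unchanged is to compare them with the per-generation Extreme counts summed earlier in the thread (the 25 December figures that add up to 1465). The sketch below assumes the terms of that sum are listed in ascending generation order, which the thread does not state explicitly; the variable names are illustrative.

```python
# generation -> num_units_currently_on_generation, from the 30 November snapshot.
stuck_nov_30 = {
    14: 1, 16: 1, 17: 1, 98: 2, 99: 1, 100: 2, 101: 2, 102: 4,
    103: 2, 104: 2, 105: 7, 106: 2, 107: 2, 108: 4, 109: 3, 110: 4,
}

# Terms of the 25 December sum quoted earlier in the thread.
# Assumption: they are in ascending generation order.
dec_25_terms = [
    1, 1, 1, 2, 1, 2, 2, 4, 2, 2, 7, 2, 2, 4, 3, 4,
    1, 5, 4, 2, 9, 4, 1, 4, 5, 2, 3, 7, 20, 96, 519, 743,
]

# Compare the snapshot with the same number of leading terms from the sum.
unchanged = list(stuck_nov_30.values()) == dec_25_terms[:len(stuck_nov_30)]
stuck_units = sum(stuck_nov_30.values())

print(f"unchanged since 30 November: {unchanged}")
print(f"units on stuck generations: {stuck_units} of {sum(dec_25_terms)}")
```

Under that ordering assumption, the sixteen lowest generations carry the same counts in both snapshots.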
[Dec 27, 2022 3:35:36 PM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 1114
Status: Offline
Re: Work Available

I think when ARP ran into trouble and things weren't working for them, they had a tech at IBM to help them. They reformulated the WUs to be more granular and restarted them from a certain generation, or started them all over. I'm guessing they don't have that level of expertise or attention anymore. They might have to abandon the difficult areas.

We know that ARP has had a drive/tape space issue, so they might not have had time to look at the difficult WUs to manage them. I hope they can restart/rewind the ones that need that once the holidays are done, and they have the time to do that.
[Dec 27, 2022 4:17:53 PM]