| World Community Grid Forums
Thread Status: Active Total posts in this thread: 3596
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 1304 Status: Offline Project Badges:
|
Just got this one: 15222_146 https://www.worldcommunitygrid.org/contribution/workunit/726182397
Hopefully in 5 hours I'll provide a valid WU to this one, and it won't need to be added to a list.
Edit: I got a valid. No need to put it on the list; it moves to the next generation.
[Edit 1 times, last edit by Unixchick at Jun 15, 2025 4:08:42 AM] |
||
|
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12594 Status: Offline Project Badges:
|
gj82854
I believe that to be wall clock time. Mike |
||
|
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12594 Status: Offline Project Badges:
|
Unixchick
That looks headed for stuck status. Please let me know the outcome. Mike |
||
|
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1327 Status: Offline Project Badges:
|
"Is that wall clock time or CPU time? A lot can impact the wall clock time such as suspensions due to higher priority tasks etc."

As Mike says, it is wall clock time; it ought to be the difference between the earliest recorded task "sent" time for the WU and the time at which a successful validation marks a canonical result, but I don't think we were ever told exactly what queries underpin the three reports Kevin made for us...

As for reasons for late[r] returns: yes, task suspensions may play a part, but I suspect other reasons (such as over-subscription?) are equally important, if not more so. A more minor (but still relevant) issue might also be folks running too many ARP1 threads at once, which is likely to push their actual run times way out beyond the expected whenever they have "full" buffers.

Most CPUs from the last 8 or more years should be able to run a single ARP1 task in well under a day (desktop) or around a day (server or laptop) so with decent configuration (and [perhaps] no interest in badge-hunting[*1]), there's no possible excuse for missed deadlines, and the number of tasks taking 4+ days to complete without any error retries might drop too! However, that assumes that users aren't in "fire and forget" mode...

Several of us have pondered how these factors could be alleviated so that fast-turnaround systems don't sit waiting hours for work whilst the other [delayed] results are still stuck out there. The unanswered question[*2].

It might be interesting to set up a separate thread to discuss this specific aspect of work return rates, especially if things don't improve considerably when the new work generation and distribution techniques [mentioned in the WCG Operational Status page] and the improved(?) hardware come into play...

Cheers - Al

*1 A few higher-badge hunters may well run far too many CPU threads on ARP1 at the same time, whilst some others may well effectively overload their buffers :-( -- however, I fear that if badges here were based on results returned rather than CPU time consumed there would not be so many people willing to run long-duration tasks such as ARP1, so perhaps a better solution might be to increase the number of badges made available by using shorter intervals after the 10 year badge? [Edited to change this paragraph.]

*2 Though not the one Charles Ives had in mind :-)

[Edit 1 times, last edit by alanb1951 at Jun 15, 2025 6:22:31 AM] |
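The wall clock measurement described in this post (earliest "sent" time for the WU to the time a validation marks a canonical result) can be sketched roughly as below. This is only an illustration of that definition, not the query behind Kevin's reports, which we were never told. The function name `wu_turnaround` and all timestamps are hypothetical.

```python
from datetime import datetime, timedelta

def wu_turnaround(sent_times, validated_at):
    """Wall clock turnaround for one WU: validation time minus the earliest 'sent' time."""
    return validated_at - min(sent_times)

# Hypothetical timestamps, made up for illustration only.
sent = [
    datetime(2025, 6, 1, 8, 0),   # first copy sent
    datetime(2025, 6, 1, 9, 30),  # wingman copy sent
    datetime(2025, 6, 5, 12, 0),  # resend after a missed deadline
]
validated = datetime(2025, 6, 6, 10, 0)

elapsed = wu_turnaround(sent, validated)
print(elapsed)
```

Note that a single late return or resend stretches this figure for the whole WU, however quickly the other copies came back.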
||
|
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1327 Status: Offline Project Badges:
|
Regarding Unixchick's latest "near miss"...
I presume that Apple Silicon users are running the Intel app using Rosetta? (Please confirm/deny...)

If that's the case, there are at least two possibilities for mismatched results; one is the simple case of different versions of the CPU handling rounding and near-zero processing in different ways, whilst the other is that if there's not an exact match between an AMD/Intel floating point instruction and the Apple version there may be different alternatives chosen for different Apple CPUs... It would be interesting to know for certain but, unless a code guru who is very familiar with Rosetta chimes in, speculation is all we've got!

However, if it is just down to calculation differences, it is likely that simply re-submitting the cell as a new WU will salvage the ones that failed completely [without needing a time step reset :-)] provided that the tasks don't go Darwin again...

Cheers - Al.

P.S. On a related note, it would appear that Rosetta support will still be around for O/S versions up to macOS 27. Subsequent versions will effectively be "native mode only" (with possible exceptions for some old games!) -- Looks like that gives WCG long enough to sort out MAM1 and MCM1, but I doubt they'll bother with ARP1... |
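As a generic illustration of how instruction choice alone can change a floating point result, here is a small sketch comparing a multiply followed by an add (two roundings) with an emulated fused multiply-add (one rounding). It says nothing about what Rosetta or ARP1 actually do; the values are chosen purely to expose the effect, and the fused step is emulated with exact fractions rather than a hardware instruction.

```python
from fractions import Fraction

# Values picked so that the exact product a*b = 1 - 2**-60 is not representable as a double.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

# Two instructions: the product is rounded to a double first, then the sum is rounded.
separate = a * b + c

# Emulated fused multiply-add: exact arithmetic, rounded only once at the end.
fused = float(Fraction(a) * Fraction(b) + Fraction(c))

print(separate, fused)
```

Here `separate` comes out as exactly 0.0 while `fused` is -2**-60, so two machines that are both "correct" can disagree, and in an iterative model such a difference can grow from step to step.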
||
|
|
TonyEllis
Senior Cruncher Australia Joined: Jul 9, 2008 Post Count: 286 Status: Offline Project Badges:
|
alanb1951 scribed...
"Most CPUs from the last 8 or more years should be able to run a single ARP1 task in well under a day (desktop)"
...and in some cases even older. The 13-year-old Intel i7 3770 (8 CPU - 4 core) here is currently running 3 ARP and 5 MCM WUs. Average CPU time is 19.53 hours and elapsed time 20.19 hours for ARP WUs. OS is Fedora 42, boinc 7.20.2 x86_64-pc-linux-gnu.
----------------------------------------
Run Time Stats https://grassmere-productions.no-ip.biz/
|
||
|
|
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2346 Status: Offline Project Badges:
|
For the record, the workunit in which Unixchick participated:
Result name | OS type | Status | Sent time | Due / Return time | CPUtime/Elapsed | Claimed/Granted
[Copied from Workunit Status, generated by wcgformat (using these options: -o)] |
||
|
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12594 Status: Offline Project Badges:
|
Well done, Unixchick, you validated the first one.
Mike |
||
|
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12594 Status: Offline Project Badges:
|
Sunday Report
Normal units have re-classified to generations 139 - 148.
Only 1,954 units in generations up to and including 144 have remained stuck and have probably been joined by 13 from generation 145.
There are 3,855 normal units in generation 146 and 28,289 in generation 147 which are the current generations. We are now 21% of the way through generation 147. There are now 1,498 units held in generation 148.
21,466 units have validated in the week, but there are 1,305,757 units to go. Based on the last 5 weeks, we would complete ARP1 in December 2026. However, improvements seem to be coming until July, but we are getting close to where the stuck units will hold the completion up.
Mike |
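For anyone wanting to play with the projection, here is a rough sketch of the arithmetic using only the two figures in this report. It is not Mike's method: his December 2026 estimate uses the last 5 weeks, and those weekly figures are not given here, so this single-week version lands on a different (earlier) date. The report date of June 15, 2025 is an assumption taken from the edit timestamps earlier in the thread.

```python
from datetime import date, timedelta

remaining = 1_305_757            # units still to go (from the report)
validated_last_week = 21_466     # units validated in the week (from the report)
report_date = date(2025, 6, 15)  # assumption: the Sunday of this report

# Weeks needed if every future week matched this one week's rate.
weeks_left = remaining / validated_last_week
projected = report_date + timedelta(weeks=weeks_left)

print(f"{weeks_left:.1f} weeks -> {projected:%B %Y}")
```

The gap between this result and the 5-week figure shows how sensitive the completion date is to the weekly rate, before stuck units are even considered.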
||
|
|
MarkH
Advanced Cruncher United States of America Joined: May 16, 2020 Post Count: 66 Status: Offline Project Badges:
|
Not sure if this contributes to the discussion, Mike, but I'm running ARP1_0023925_146_2 at the moment. So from your info it's a normal unit, wingman 2.
By the way, what's happening in July? Some kind of system work, or does the world end again?
----------------------------------------
That science of the people, by the people, for the people, shall not perish from the Earth.
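For anyone tallying task names like the one in this post, a small parsing sketch follows. The field meanings (project, unit number, generation, copy/wingman number) are an assumption inferred from how names are discussed in this thread, not an official naming specification, and `parse_task` is a hypothetical helper.

```python
import re

# Assumed layout: PROJECT_UNIT_GENERATION_COPY, e.g. ARP1_0023925_146_2
TASK_RE = re.compile(
    r"^(?P<project>[A-Z0-9]+)_(?P<unit>\d+)_(?P<generation>\d+)_(?P<copy>\d+)$"
)

def parse_task(name):
    """Split a task name into its assumed fields; raise ValueError if it does not fit."""
    m = TASK_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognised task name: {name!r}")
    return {
        "project": m["project"],
        "unit": m["unit"],                    # kept as text to preserve leading zeros
        "generation": int(m["generation"]),
        "copy": int(m["copy"]),
    }

info = parse_task("ARP1_0023925_146_2")
print(info)
```

With the generation pulled out as a number, it can be compared directly against the generation ranges in the Sunday Report.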
|
||
|
|
|