Thread Status: Active
Total posts in this thread: 3593
Pages: 360
This topic has been viewed 5747681 times and has 3592 replies
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 1293
Re: Work Available

Another one to add to the list. This one is "too late":
0003468_148
https://www.worldcommunitygrid.org/contribution/workunit/737134676
[Jul 8, 2025 2:24:30 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12594
Re: Work Available

ARP are flowing again.

Mike smile
[Jul 8, 2025 9:52:02 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 2173
Re: Work Available

Mike.Gibson wrote:
    ARP are flowing again.

    Mike smile

Well, that might be a bit of an overstatement. So far, it's a bit of a light drizzle rather than a flowing stream... wink

Ralf cool
[Jul 8, 2025 11:57:48 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12594
Re: Work Available

4+2 is a raging torrent compared with the Atacama that was last week.

Mike wink
[Jul 9, 2025 12:34:39 AM]
MJH333
Senior Cruncher
England
Joined: Apr 3, 2021
Post Count: 300
Re: Work Available

Mike.Gibson wrote:
    4+2 is a raging torrent compared with the Atacama that was last week.

    Mike wink

Agreed!
I’ve hit my project limits for ARP1, which is the first time in months that this has happened (or that I’ve got anywhere near them). Let’s hope this is not just a one-off!
Cheers,
Mark
[Jul 9, 2025 7:41:21 AM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 2173
Re: Work Available

Mike.Gibson wrote:
    4+2 is a raging torrent compared with the Atacama that was last week.

    Mike wink

Got a light summer rain in the evening, just enough to get a couple more pipes wet... wink


Ralf cool
[Jul 9, 2025 3:30:29 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2346
Re: Work Available

I found a remarkable workunit:

The task with suffix _1 below was aborted by the project 4 days after being distributed, after reaching 2 checkpoints, with the reason "no longer usable".
The task with suffix _2 was distributed before the original deadline (2025-07-09T16:50) of the tasks with suffixes _0 and _1 had passed.

workunit 738775281
ARP1_0014328_148_0  Linux Fedora         P. Validation  2025-07-03T16:50:27  2025-07-04T23:32:45  7.74/7.80  683.3/0.0
ARP1_0014328_148_1  Linux Ubuntu Server  Aborted        2025-07-03T16:50:32  2025-07-07T17:34:05  2.62/2.63  0.0/0.0
ARP1_0014328_148_2  Linux Ubuntu         In Progress    2025-07-09T05:19:12  2025-07-12T05:19:12  0.00/0.00  0.0/0.0
Details: ----------------------------------------------------------------------------------------------------------------
ARP1_0014328_148_0  Linux Fedora  P. Validation   2025-07-03T16:50:27  2025-07-04T23:32:45  7.74/7.80  683.3/0.0
Sent Time: 2025-07-03T16:50:27+0000
Due Time: 2025-07-09T16:50:27+0000
Returned: 2025-07-04T23:32:45+0000
Logfile:
<core_client_version>8.0.2</core_client_version>
<stderr_txt>
INFO: Initializing
INFO: No state to restore. Start from the beginning.
Starting WRFMain
[18:32:27] INFO: Checkpoint taken at 2019-04-23_06:00:00
[19:41:09] INFO: Checkpoint taken at 2019-04-23_12:00:00
[20:38:35] INFO: Checkpoint taken at 2019-04-23_18:00:00
[21:21:44] INFO: Checkpoint taken at 2019-04-24_00:00:00
[22:23:34] INFO: Checkpoint taken at 2019-04-24_06:00:00
[23:38:35] INFO: Checkpoint taken at 2019-04-24_12:00:00
[00:39:19] INFO: Checkpoint taken at 2019-04-24_18:00:00
[01:25:19] INFO: Checkpoint taken at 2019-04-25_00:00:00
INFO: Simulation complete compressing output.
01:26:30 (685683): called boinc_finish(0)

</stderr_txt>
ARP1_0014328_148_1  Linux Ubuntu Server  Aborted  2025-07-03T16:50:32  2025-07-07T17:34:05  2.62/2.63  0.0/0.0
Sent Time: 2025-07-03T16:50:32+0000
Due Time: 2025-07-09T16:50:32+0000
Returned: 2025-07-07T17:34:05+0000
Logfile:
<core_client_version>7.16.6</core_client_version>
<message>
aborted by project - no longer usable</message>
<stderr_txt>
INFO: Initializing
INFO: No state to restore. Start from the beginning.
Starting WRFMain
[04:40:23] INFO: Checkpoint taken at 2019-04-23_06:00:00
INFO: Initializing
Starting WRFMain
[03:20:58] INFO: Checkpoint taken at 2019-04-23_12:00:00

</stderr_txt>
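
For anyone who wants to check the timing, here is a minimal sketch of the arithmetic using the timestamps listed above. It assumes the send time shown for _2 in the summary table is in UTC, like the times in the detail sections; the variable names are only illustrative.

```python
from datetime import datetime

# Timestamp layout used in the detail sections, e.g. 2025-07-03T16:50:32+0000
FMT = "%Y-%m-%dT%H:%M:%S%z"

sent_1 = datetime.strptime("2025-07-03T16:50:32+0000", FMT)      # _1 sent
returned_1 = datetime.strptime("2025-07-07T17:34:05+0000", FMT)  # _1 aborted/returned
due_0 = datetime.strptime("2025-07-09T16:50:27+0000", FMT)       # original deadline of _0
# Assumption: the table entry for _2 is UTC as well.
sent_2 = datetime.strptime("2025-07-09T05:19:12+0000", FMT)      # _2 sent

abort_delay = returned_1 - sent_1  # how long _1 was out before the abort
margin = due_0 - sent_2            # how far ahead of the original deadline _2 went out

print(abort_delay)  # 4 days, 0:43:33
print(margin)       # 11:31:15
```

So _1 was out for just over 4 days before the abort, and _2 was sent roughly eleven and a half hours before the original deadline.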

Adri
[Jul 9, 2025 6:28:22 PM]
phytell
Cruncher
Joined: Sep 8, 2014
Post Count: 39
Re: Work Available

If you're going to put a significant amount of effort into identifying stuck units, it may be worth noting that BOINC keeps a job log of all completed tasks, which could be processed to find the highest generation each unit has reached. However, I suspect that even if we combined the logs from everyone on this thread, we'd still be missing a significant share of the units. It might be possible to identify those stuck in the first few generations, but anything more ambitious than that would be struggling against a lot of missing data.
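
As a rough sketch of what that processing might look like (not a tested tool), the Python below scans job log lines and records the highest generation seen per unit. It rests on two assumptions: that each job log line carries the task name after an `nm` token, and that ARP1 task names follow the pattern `ARP1_<unit>_<generation>_<replica>`, as in ARP1_0014328_148_0. The function name and the sample lines are made up for illustration.

```python
import re
from typing import Dict, Iterable

# Assumed task-name layout: ARP1_<unit>_<generation>_<replica>
TASK_NAME = re.compile(r"^ARP1_(\d+)_(\d+)_(\d+)$")


def highest_generations(lines: Iterable[str]) -> Dict[str, int]:
    """Return the highest generation seen in the log for each ARP1 unit."""
    highest: Dict[str, int] = {}
    for line in lines:
        tokens = line.split()
        # Assumed line layout: "<timestamp> ue .. ct .. fe .. nm <name> et .. es .."
        if "nm" not in tokens:
            continue
        name_index = tokens.index("nm") + 1
        if name_index >= len(tokens):
            continue  # truncated line, no name after "nm"
        match = TASK_NAME.match(tokens[name_index])
        if match is None:
            continue  # not an ARP1 task
        unit, generation = match.group(1), int(match.group(2))
        highest[unit] = max(generation, highest.get(unit, -1))
    return highest


# Made-up log lines for illustration only; not real results.
sample_log = [
    "1751900000 ue 30000.0 ct 27000.0 fe 1000000000000 nm ARP1_0000001_010_0 et 27500.0 es 0",
    "1751990000 ue 30000.0 ct 26500.0 fe 1000000000000 nm ARP1_0000001_012_1 et 27100.0 es 0",
    "1752000000 ue 30000.0 ct 28000.0 fe 1000000000000 nm ARP1_0000002_003_0 et 28300.0 es 0",
    "1752010000 ue 500.0 ct 450.0 fe 1000000000 nm OTHER_task_0 et 460.0 es 0",
]

result = highest_generations(sample_log)
print(result)  # {'0000001': 12, '0000002': 3}
```

To run it on real data you would pass an open file instead of the sample list, for example the job log file in the BOINC data directory (its exact name and location depend on the installation). Merging logs from several people would then just mean taking the maximum per unit across everyone's results.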
[Jul 9, 2025 11:45:27 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 1317
Re: Work Available

Using the job log is a reasonable suggestion except for WUs like those recent Darwin ones that couldn't achieve validation. They'd be in the job log because the client thinks they are "success" tasks...

The best way to track stuck units would have been for folks to always announce problem tasks; we have seen some of that both in the [distant] past and recently, but I suspect that a lot of tasks have either been "lost" by the system or failed without anyone mentioning them here (often because the user is in "fire and forget" mode and never checks!). We only know about the Darwin issue because Unixchick has been flagging up tasks with validation issues, and I rather suspect there are [many?] more we don't know about!

As for whether the whole thing is an exercise in futility, that might depend on how many of the users who handle (say) 100+ ARP1 tasks a day would provide data. (My average of 15 or so returns a day would hardly be a drop in the ocean, but I would always report any WU that got stuck!)

Cheers - Al.
----------------------------------------
[Edit 1 times, last edit by alanb1951 at Jul 10, 2025 2:14:13 AM]
[Jul 10, 2025 2:11:26 AM]
catchercradle
Senior Cruncher
England
Joined: Jan 16, 2009
Post Count: 167
Re: Work Available

Interesting.
With just one machine, even though with VMs it sometimes pretends to be up to 4 machines, I struggle to get as many as 15 tasks a day. I check my results most days and have been crunching ARP since it came out, and I have yet to notice a stuck work unit, be that on native Linux, Linux in a VM or Windows in a VM. Do we have any figure for the percentage of stuck units?
[Jul 10, 2025 1:23:53 PM]