Thread Status: Active
Total posts in this thread: 35
This topic has been viewed 5225 times and has 34 replies
MarkH
Advanced Cruncher
United States of America
Joined: May 16, 2020
Post Count: 66
Re: At least one task ending in Error, Error Log apparently showing successful result

Thanks to everyone who replied. Since my message I've run a few dozen ARPs without incident, even though I didn't change anything on the computer or in the software. Maybe the failed ones got hit by cosmic rays.
----------------------------------------
That science of the people, by the people, for the people, shall not perish from the Earth.
[May 16, 2025 2:13:17 AM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 1316
Re: At least one task ending in Error, Error Log apparently showing successful result

I've just checked my latest returns, and I note that I now have three tasks with Validation errors. I doubt it's an issue specific to my system, as wingmen seem to be having the same problem! I can confirm that my errors were "Validate error" and the others show the same characteristics -- an apparently viable result with an unexplained Error status -- so they presumably suffered the same fate.

Here are the WU IDs, names and the current status for my three cases:

WU 750849345  -- ARP1_0001267_150
_0 and _1 failed (Linux), _2 and _3 running (Linux)

WU 751059325 -- ARP1_0002984_150
_0 and _1 failed (Windows), _2 and _3 failed (Linux), _4 and _5 running (Windows)

WU 749954354 -- ARP1_0026315_149
_0 and _1 failed (Linux), _3 failed and _2 and _4 running (Windows)

Elsewhere, adriverhoef has reported similar for three different WUs, so there's something odd going on somewhere...

Cheers - Al.
[Aug 3, 2025 4:11:32 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2346
At least TWO tasks ending in Error, Error Log apparently showing successful result

So far today (and the day isn't over yet), 19 workunits with TWO or more Validation errors:
workunit 750700147
ARP1_0000393_150_0  Linux Ubuntu  Error  2025-07-31T19:28:27  2025-08-02T12:17:21   10.48/10.50
ARP1_0000393_150_1 Fedora Linux Error 2025-07-31T19:28:41 2025-08-02T23:08:24 6.48/6.52
ARP1_0000393_150_2 This account Error 2025-08-02T23:10:52 2025-08-03T11:26:11 11.72/12.18
ARP1_0000393_150_3 MSWin 11 Error 2025-08-02T23:10:52 2025-08-03T09:15:38 5.01/5.01
ARP1_0000393_150_4 MSWin 11 Valid 2025-08-03T11:26:37 2025-08-03T17:15:34 4.93/5.45
ARP1_0000393_150_5 MSWin 11 Valid 2025-08-03T11:26:40 2025-08-04T08:11:03 8.61/8.65
workunit 750884956
ARP1_0033794_149_0  Linux Ubuntu  Error  2025-08-01T03:43:39  2025-08-02T01:55:49   11.20/11.27
ARP1_0033794_149_1 Linux Ubuntu Error 2025-08-01T03:43:51 2025-08-03T03:54:19 27.96/28.01
ARP1_0033794_149_2 MSWin 11 Valid 2025-08-03T03:54:41 2025-08-04T03:13:38 12.67/13.16
ARP1_0033794_149_3 MSWin 10 Valid 2025-08-03T03:54:42 2025-08-07T06:09:57 14.49/14.81
ARP1_0033794_149_4 MSWin 11 Inval 2025-08-07T01:07:55 2025-08-07T13:21:21 12.17/12.17
workunit 750387046
ARP1_0031990_149_0  Linux Ubuntu  Error  2025-07-31T06:00:41  2025-08-01T19:22:43    6.84/6.87
ARP1_0031990_149_1 Linux Fedora Error 2025-07-31T06:00:45 2025-08-03T04:37:00 6.44/6.48
ARP1_0031990_149_2 MSWin 11 Valid 2025-08-03T04:37:38 2025-08-04T07:50:33 11.08/11.08
ARP1_0031990_149_3 MSWin 11 Valid 2025-08-03T04:37:40 2025-08-03T17:01:02 8.65/12.31
workunit 750849353
ARP1_0033547_149_0  Linux Ubuntu  Error  2025-08-01T02:10:21  2025-08-01T14:58:51    8.56/8.57
ARP1_0033547_149_1 Fedora Linux Error 2025-08-01T02:10:09 2025-08-03T05:36:01 6.39/6.43
ARP1_0033547_149_2 MSWin 10 Valid 2025-08-03T05:36:33 2025-08-03T17:38:19 11.45/11.62
ARP1_0033547_149_3 MSWin 11 Valid 2025-08-03T05:36:34 2025-08-03T18:56:05 6.24/6.24
ARP1_0033547_149_4 MSWin 11 Valid 2025-08-03T18:56:43 2025-08-04T14:08:31 9.95/10.26
ARP1_0033547_149_5 MSWin 10 Valid 2025-08-03T18:56:38 2025-08-04T02:41:22 7.69/7.69
workunit 750472354
ARP1_0033325_149_0  Linux Debian  Error  2025-07-31T09:39:01  2025-08-01T03:22:07    7.55/7.55
ARP1_0033325_149_1 Linux Fedora Error 2025-07-31T09:39:00 2025-08-03T06:44:13 6.62/6.66
ARP1_0033325_149_2 MSWin 11 Valid 2025-08-03T06:44:49 2025-08-03T20:13:06 10.22/10.45
ARP1_0033325_149_3 MSWin Server Valid 2025-08-03T06:44:50 2025-08-04T19:17:21 20.36/20.40
workunit 749756108
ARP1_0030953_149_0  MSWin 11      Error  2025-07-30T04:19:03  2025-08-03T00:51:52   29.69/77.38
ARP1_0030953_149_1 MSWin 10 Error 2025-07-30T04:19:08 2025-08-01T00:47:06 12.33/12.33
ARP1_0030953_149_2 Linux Fedora Error 2025-08-03T00:52:29 2025-08-03T07:37:30 6.58/6.63
ARP1_0030953_149_3 Linux openSU Error 2025-08-03T00:52:30 2025-08-03T05:05:50 4.05/4.05
ARP1_0030953_149_4 Ubuntu Valid 2025-08-03T07:38:07 2025-08-04T15:17:46 11.47/11.76
ARP1_0030953_149_5 MSWin 10 Valid 2025-08-03T07:38:05 2025-08-04T02:27:46 13.54/13.61
workunit 750632570
ARP1_0034825_149_0  Linux Ubuntu  Error  2025-07-31T16:32:52  2025-08-03T00:55:20   32.03/32.17
ARP1_0034825_149_1 Linux Ubuntu Error 2025-07-31T16:32:54 2025-08-02T06:45:36 28.42/30.45
ARP1_0034825_149_2 Fedora Linux Error 2025-08-03T00:56:03 2025-08-03T07:25:29 6.31/6.35
ARP1_0034825_149_3 Linux Debian Error 2025-08-03T00:56:03 2025-08-03T07:32:40 6.02/6.02
ARP1_0034825_149_4 MSWin 10 Valid 2025-08-03T07:33:00 2025-08-04T17:44:21 30.52/34.13
ARP1_0034825_149_5 Ubuntu Valid 2025-08-03T07:33:00 2025-08-04T13:26:14 16.25/16.25
workunit 750633731
ARP1_0034708_149_0  MSWin 11      Error  2025-07-31T16:41:28  2025-08-01T20:54:01   12.77/19.70
ARP1_0034708_149_1 MSWin 11 Error 2025-07-31T16:41:29 2025-08-02T23:25:13 10.49/10.68
ARP1_0034708_149_2 Linux Ubuntu Error 2025-08-02T23:25:54 2025-08-03T08:21:40 8.80/8.83
ARP1_0034708_149_3 Fedora Linux Error 2025-08-02T23:25:56 2025-08-03T05:59:55 6.35/6.39
ARP1_0034708_149_4 MSWin 11 InPrg 2025-08-03T08:22:18 2025-08-06T08:22:18 0.00/0.00
ARP1_0034708_149_5 MSWin 10 Error 2025-08-03T08:22:17 2025-08-03T08:27:01 0.00/0.00
workunit 750954220
ARP1_0001770_150_0  MSWin 11      Error  2025-08-01T06:36:16  2025-08-03T00:02:08   14.05/28.38
ARP1_0001770_150_1 MSWin 10 Error 2025-08-01T06:36:15 2025-08-02T23:30:08 13.81/17.96
ARP1_0001770_150_2 Linux Ubuntu Error 2025-08-03T00:02:27 2025-08-03T09:17:06 9.12/9.15
ARP1_0001770_150_3 Linux Fedora Error 2025-08-03T00:02:31 2025-08-03T07:02:25 6.81/6.86
ARP1_0001770_150_4 Linux Ubuntu Valid 2025-08-03T09:17:36 2025-08-03T17:46:56 8.17/8.37
ARP1_0001770_150_5 Linux openSU Valid 2025-08-03T09:17:30 2025-08-04T06:29:37 8.51/8.53
workunit 750442625
ARP1_0033497_149_0  Linux Fedora  Error  2025-07-31T08:24:27  2025-08-03T10:20:44    6.53/6.57
ARP1_0033497_149_1 Linux Arch Error 2025-07-31T08:24:30 2025-08-01T15:07:46 10.83/11.13
ARP1_0033497_149_2 Linux Ubuntu Valid 2025-08-03T10:21:04 2025-08-03T16:29:42 5.99/6.00
ARP1_0033497_149_3 Linux Ubuntu Valid 2025-08-03T10:21:11 2025-08-04T03:16:56 9.01/9.06
workunit 751103565
ARP1_0003249_150_0  Linux Debian  Error  2025-08-01T13:04:48  2025-08-02T06:10:49   10.17/10.19
ARP1_0003249_150_1 Linux Ubuntu Error 2025-08-01T13:04:51 2025-08-03T04:22:56 19.51/19.52
ARP1_0003249_150_2 Fedora Linux Error 2025-08-03T04:23:33 2025-08-03T11:28:57 6.90/6.94
ARP1_0003249_150_3 Linux Ubuntu Error 2025-08-03T04:23:22 2025-08-03T10:33:56 5.86/5.97
ARP1_0003249_150_4 MSWin 10 Valid 2025-08-03T11:29:16 2025-08-05T00:18:01 27.67/27.97
ARP1_0003249_150_5 MSWin 10 Valid 2025-08-03T11:29:17 2025-08-03T18:54:28 7.38/7.38
workunit 750976341
ARP1_0002097_150_0  Linux Debian  Error  2025-08-01T07:33:15  2025-08-03T01:57:45   10.98/10.99
ARP1_0002097_150_1 Fedora Linux Error 2025-08-01T07:33:07 2025-08-03T12:40:06 6.78/6.82
ARP1_0002097_150_2 MSWin 11 Error 2025-08-03T12:40:29 2025-08-03T21:36:55 7.83/7.87
ARP1_0002097_150_3 MSWin 11 Valid 2025-08-03T12:40:29 2025-08-04T17:05:31 5.58/16.37
ARP1_0002097_150_4 MSWin 10 Valid 2025-08-03T21:37:00 2025-08-05T01:48:52 21.89/22.07
workunit 750676907
ARP1_0033714_149_0  Linux Ubuntu  Error  2025-07-31T18:25:21  2025-08-02T14:43:28    7.92/7.97
ARP1_0033714_149_1 Linux Fedora Error 2025-07-31T18:25:18 2025-08-03T13:18:13 6.52/6.56
ARP1_0033714_149_2 MSWin 11 Valid 2025-08-03T13:18:45 2025-08-05T09:54:09 14.16/22.16
ARP1_0033714_149_3 MSWin 10 Valid 2025-08-03T13:18:46 2025-08-04T15:17:24 18.52/18.99
workunit 750448393
ARP1_0033651_149_0  Linux Fedora  Error  2025-07-31T08:39:14  2025-08-03T13:32:52    6.45/6.50
ARP1_0033651_149_1 Linux Fedora Error 2025-07-31T08:39:29 2025-08-02T03:57:15 7.81/7.92
ARP1_0033651_149_2 MSWin 11 Valid 2025-08-03T13:33:22 2025-08-05T09:51:19 11.83/12.50
ARP1_0033651_149_3 MSWin 11 Valid 2025-08-03T13:33:24 2025-08-03T23:30:29 9.69/9.85
workunit 750907223
ARP1_0035489_149_0  Fedora Linux  Error  2025-08-01T04:41:15  2025-08-03T13:33:32    6.31/6.35
ARP1_0035489_149_1 Linux openSU Error 2025-08-01T04:41:15 2025-08-01T16:17:33 9.95/10.06
ARP1_0035489_149_2 Linux Valid 2025-08-03T13:33:57 2025-08-06T14:22:12 26.36/27.95
ARP1_0035489_149_3 Fedora Linux Valid 2025-08-03T13:34:01 2025-08-04T17:59:43 13.48/14.75
ARP1_0035489_149_4 Linux Valid 2025-08-06T13:34:19 2025-08-07T01:09:51 11.47/11.51
workunit 750407477
ARP1_0033054_149_0  Linux GNOME   Error  2025-07-31T06:55:11  2025-08-03T16:31:16   19.07/19.08
ARP1_0033054_149_1 Fedora Linux Error 2025-07-31T06:55:19 2025-08-02T14:13:24 6.40/6.44
ARP1_0033054_149_2 MSWin 11 Valid 2025-08-03T16:31:46 2025-08-04T07:10:36 7.36/7.36
ARP1_0033054_149_3 MSWin 11 Valid 2025-08-03T16:31:45 2025-08-06T08:52:06 35.05/44.91
workunit 750133541
ARP1_0032459_149_0  Darwin        Error  2025-07-30T19:30:17  2025-08-02T19:07:57    8.69/9.87
ARP1_0032459_149_1 Darwin Error 2025-07-30T19:30:13 2025-08-03T00:52:58 9.86/11.49
ARP1_0032459_149_2 Linux Debian Error 2025-08-03T00:53:32 2025-08-03T17:51:30 7.88/7.88
ARP1_0032459_149_3 Linux Ubuntu Error 2025-08-03T00:53:48 2025-08-03T15:02:36 12.45/12.51
ARP1_0032459_149_4 MSWin 10 Valid 2025-08-03T17:51:47 2025-08-05T19:33:57 9.13/9.13
ARP1_0032459_149_5 MSWin 10 Valid 2025-08-03T17:51:46 2025-08-06T13:29:06 30.12/32.01
workunit 749576011
ARP1_0027763_149_0  Linux Ubuntu  Error  2025-07-29T20:43:46  2025-08-03T08:33:16   24.53/24.53
ARP1_0027763_149_1 Linux Ubuntu Error 2025-07-29T20:43:49 2025-07-30T03:45:38 6.95/6.96
ARP1_0027763_149_2 Linux Error 2025-08-03T08:33:47 2025-08-03T22:17:28 13.56/13.65
ARP1_0027763_149_3 Fedora Linux Error 2025-08-03T08:33:47 2025-08-03T17:48:45 9.03/9.11
ARP1_0027763_149_4 Linuxmint Valid 2025-08-03T22:18:00 2025-08-04T19:38:17 7.56/7.61
ARP1_0027763_149_5 Linux Debian NoRep 2025-08-03T22:18:03 2025-08-06T22:18:03 0.00/0.00
ARP1_0027763_149_6 Linux Ubuntu Valid 2025-08-06T22:18:13 2025-08-07T16:10:39 17.34/17.68
workunit 750596009
ARP1_0032219_149_0  Linux Fedora  Error  2025-07-31T14:59:47  2025-08-03T23:34:15    6.49/6.54
ARP1_0032219_149_1 Linux Ubuntu Error 2025-07-31T14:59:39 2025-08-02T05:07:27 7.76/7.84
ARP1_0032219_149_2 MSWin 11 Valid 2025-08-03T23:34:37 2025-08-04T08:20:00 8.70/8.72
ARP1_0032219_149_3 MSWin 10 Valid 2025-08-03T23:34:38 2025-08-05T02:45:57 26.60/26.75
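For anyone who wants to tally a listing like the one above rather than eyeball it, here is a minimal Python sketch. It assumes each row is laid out as task name, platform label, status, two timestamps and a final `x/y` pair of numbers; the field names `cpu` and `elapsed`, the `os_family` buckets and the function names are my own guesses for illustration, not anything WCG defines.

```python
import re
from collections import Counter

# One row of the listing: task name, platform label, status, two timestamps, "x/y".
# The status codes are the ones that appear in the listing above.
ROW = re.compile(
    r"^(?P<wu>ARP1_\d+_\d+)_(?P<replica>\d+)\s+"
    r"(?P<os>.+?)\s+"
    r"(?P<status>Error|Valid|Inval|InPrg|NoRep)\s+"
    r"(?P<sent>\S+)\s+(?P<returned>\S+)\s+"
    r"(?P<cpu>[\d.]+)/(?P<elapsed>[\d.]+)$"
)

def os_family(label):
    """Coarse platform bucket (assumption: these rules are a guess at the labels)."""
    if label.startswith("MSWin"):
        return "Windows"
    if label == "Darwin":
        return "macOS"
    if label == "This account":
        return "unknown"
    return "Linux"  # Ubuntu, Fedora, Debian, openSU, Arch, Linuxmint, ...

def summarize(text):
    """Count rows per (platform family, status); non-matching lines are skipped."""
    counts = Counter()
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            counts[(os_family(m["os"]), m["status"])] += 1
    return counts

SAMPLE = """\
workunit 750884956
ARP1_0033794_149_0  Linux Ubuntu  Error  2025-08-01T03:43:39  2025-08-02T01:55:49   11.20/11.27
ARP1_0033794_149_1 Linux Ubuntu Error 2025-08-01T03:43:51 2025-08-03T03:54:19 27.96/28.01
ARP1_0033794_149_2 MSWin 11 Valid 2025-08-03T03:54:41 2025-08-04T03:13:38 12.67/13.16
ARP1_0033794_149_3 MSWin 10 Valid 2025-08-03T03:54:42 2025-08-07T06:09:57 14.49/14.81
ARP1_0033794_149_4 MSWin 11 Inval 2025-08-07T01:07:55 2025-08-07T13:21:21 12.17/12.17
"""

if __name__ == "__main__":
    for (family, status), n in sorted(summarize(SAMPLE).items()):
        print(family, status, n)
```

Pasting the whole listing into `SAMPLE` should give a per-platform breakdown of Error and Valid counts, which may make any pattern easier to see.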

Adri
----------------------------------------
[Edit 12 times, last edit by adriverhoef at Aug 8, 2025 8:38:51 AM]
[Aug 3, 2025 4:48:47 PM]
geophi
Advanced Cruncher
U.S.
Joined: Sep 3, 2007
Post Count: 113
Re: At least TWO tasks ending in Error, Error Log apparently showing successful result

Mine from when my linux PC downloaded a retry yesterday

WU -- ARP1_0034000_149
_0 and _1 errored (Windows), _2 and _3 errored (Linux), _4 and _5 running (Linux)

All errors are validation errors.
----------------------------------------
[Edit 1 times, last edit by geophi at Aug 3, 2025 5:22:50 PM]
[Aug 3, 2025 5:21:12 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 2173
Re: At least TWO tasks ending in Error, Error Log apparently showing successful result

As the MCM1 well has run dry, I noticed an ARP1 _3 resend which immediately errored out, after the initial tasks had ended in errors following quite some runtime. I got a subsequent _3 resend on another host, where it started immediately because MCM1 had run out there as well, and that one seems to be progressing normally.
https://www.worldcommunitygrid.org/contribution/workunit/750930364

The error on this quickly aborted WU states "can't get input file"...

Ralf
----------------------------------------
[Edit 1 times, last edit by TPCBF at Aug 3, 2025 5:54:35 PM]
[Aug 3, 2025 5:52:42 PM]
Hans Sveen
Veteran Cruncher
Norge
Joined: Feb 18, 2008
Post Count: 981
Re: At least TWO tasks ending in Error, Error Log apparently showing successful result

Hi!
Is Krembil trying out not using HR? See this WU: ARP1_002459_149

https://www.worldcommunitygrid.org/contribution/workunit/747428315

_3 is mine!

And a strange one: ARP1_0025673_149

Link: https://www.worldcommunitygrid.org/contribution/workunit/748243394

I thought that anonymous platform did not work or should not be allowed??

Hans S.
----------------------------------------
[Edit 1 times, last edit by Hans Sveen at Aug 3, 2025 7:59:39 PM]
[Aug 3, 2025 7:54:00 PM]
MJH333
Senior Cruncher
England
Joined: Apr 3, 2021
Post Count: 300
Re: At least TWO tasks ending in Error, Error Log apparently showing successful result

I have similar:
https://www.worldcommunitygrid.org/contribution/workunit/750378338 (ARP1_0033245_149).
Cheers,
Mark
[Aug 3, 2025 8:48:53 PM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 1293
Re: At least TWO tasks ending in Error, Error Log apparently showing successful result

looks like I got an ERROR ARP too...
https://www.worldcommunitygrid.org/contribution/workunit/750976340

This next one is interesting: it downloaded, started running, and was then aborted by the server.
https://www.worldcommunitygrid.org/contribution/workunit/749798059

I've never seen it abort a WU I'd already started. Honestly, I would have aborted it myself, as it looks weird.
----------------------------------------
[Edit 1 times, last edit by Unixchick at Aug 3, 2025 9:45:05 PM]
[Aug 3, 2025 9:40:05 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 1316
Re: At least TWO tasks ending in Error, Error Log apparently showing successful result

Wow! -- quite a large number of reports, some of which seemed to have had "happy endings" (though I wonder if some of the reported validations can be trusted under the circumstances)...

The catalogue of various mismatched tasks and strange retry issues being listed here and in the "Work available" thread is beginning to look like signs of either database malfunction or corruption, or of some server hardware malfunction.

In particular, it would seem really strange to have abandoned HR (homogeneous redundancy) for ARP1 given the strict data match criteria, and most especially to allow what appears to be an Anonymous Platform to run a task!

It looks as if retries for some WUs have been force-cancelled because the system has realized that it can't use them (it sends a slightly different Abort notice to the client). I've had one of these (see below) and it would explain what Unixchick reported just above, though it doesn't explain why those retries went out in the first place...

Here's the one I saw...

WU 749436789  -- ARP1_0028094
_0 and _1 Errored (Windows), _2 and _3 Valid (Windows), _4 and _5 Server Aborted (Linux)

_3 reported at 21:40:15 UTC; I asked for _5 at 21:40:48 UTC, and it downloaded and started a couple of minutes later. However, the next contact with WCG produced the following three relevant entries in the BOINC log (this is the Server Abort happening)...

Sun 03 Aug 2025 22:45:04 BST | World Community Grid | [error] garbage_collect(); still have active task for acked result ARP1_0028094_149_5; state 9
Sun 03 Aug 2025 22:45:05 BST | World Community Grid | [error] garbage_collect(); still have active task for acked result ARP1_0028094_149_5; state 5
Sun 03 Aug 2025 22:45:05 BST | World Community Grid | Computation for task ARP1_0028094_149_5 finished

Note that state 9 is "Waiting to be sent" and state 5 is "Invalid" (which suggests that validation of the two Windows results had happened at that point and it had realized the two Linux retries would be unusable!)
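If anyone wants to check their own BOINC event log for the same pattern, a rough Python sketch along these lines might help. It assumes log lines use the `timestamp | project | message` layout shown above; the function and pattern names are hypothetical, and it only extracts the numeric state codes without attempting to interpret them.

```python
import re

# Outer layout of an event-log line: "<timestamp> | <project> | <message>"
LOG_LINE = re.compile(r"^(?P<when>.+?) \| (?P<project>[^|]+?) \| (?P<message>.*)$")

# The garbage_collect() message quoted above, capturing task name and state code.
GC_MESSAGE = re.compile(
    r"still have active task for acked result (?P<task>[^;\s]+); state (?P<state>\d+)"
)

def find_gc_states(log_text):
    """Return (timestamp, task, state) for each garbage_collect() line found."""
    hits = []
    for line in log_text.splitlines():
        outer = LOG_LINE.match(line.strip())
        if not outer:
            continue
        inner = GC_MESSAGE.search(outer["message"])
        if inner:
            hits.append((outer["when"], inner["task"], int(inner["state"])))
    return hits

SAMPLE_LOG = """\
Sun 03 Aug 2025 22:45:04 BST | World Community Grid | [error] garbage_collect(); still have active task for acked result ARP1_0028094_149_5; state 9
Sun 03 Aug 2025 22:45:05 BST | World Community Grid | [error] garbage_collect(); still have active task for acked result ARP1_0028094_149_5; state 5
Sun 03 Aug 2025 22:45:05 BST | World Community Grid | Computation for task ARP1_0028094_149_5 finished
"""

if __name__ == "__main__":
    for when, task, state in find_gc_states(SAMPLE_LOG):
        print(when, task, state)
```

Run against a saved copy of the event log, this should list every task that was still active when the server acknowledged its result, which is the signature of the aborts described here.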

I do hope they'll tell us what has been going on, even if only via the Krembil WCG Operational Status tab...

Cheers - Al
[Aug 4, 2025 12:53:50 AM]
geophi
Advanced Cruncher
U.S.
Joined: Sep 3, 2007
Post Count: 113
Re: At least TWO tasks ending in Error, Error Log apparently showing successful result

Mine from when my linux PC downloaded a retry yesterday

WU -- ARP1_0034000_149
_0 and _1 errored (Windows), _2 and _3 errored (Linux), _4 and _5 running (Linux)

All errors are validation errors.

The last two Linux PCs to download from this work unit were seemingly arbitrarily awarded a status of Valid. Who knows what really happened here? (rhetorical question)
[Aug 4, 2025 4:18:30 PM]