Thread Status: Active
Total posts in this thread: 4
This topic has been viewed 1598 times and has 3 replies
Dr Who Fan
Cruncher
Joined: Mar 12, 2015
Post Count: 38
WORKUNIT (2 tasks) Pending Validation for 6+ days

As the title says, this ARP1 workunit (2 tasks) has been pending validation for 6+ days.
ARP1_0026012_139
Project name: Africa Rainfall Project
Created: May. 7, 2023 - 04:56 UTC
Name: ARP1_0026012_139
Minimum Quorum: 2
Replication: 2

ARP1_0026012_139_0 | Microsoft Windows 7 Ultimate N x64 Edition, Service Pack 1, (06.01.7601.00) | Pending Validation | 2023-05-07 05:28:03 UTC | 2023-05-12 13:06:58 UTC | 38.58 / 42.36 | 693.2 / 0
ARP1_0026012_139_1 | Microsoft Windows 8.1 Professional x64 Edition, (06.03.9600.00) | Pending Validation | 2023-05-07 05:10:02 UTC | 2023-05-11 01:19:10 UTC | 27.29 / 27.29 | 623.7 / 0
https://www.worldcommunitygrid.org/contribution/workunit/301797858

Is there a problem with the validators for ARP tasks?
In the many years I have been doing WCG I have never seen such a long backlog to validate a completed workunit.
[May 18, 2023 6:01:49 AM]
MJH333
Senior Cruncher
England
Joined: Apr 3, 2021
Post Count: 300
Re: WORKUNIT (2 tasks) Pending Validation for 6+ days

"Is there a problem with the validators for ARP tasks?"
Dr Who Fan,
Yes, ARP validation is taking place very slowly. I have about 60 tasks in this state (“Pval jail”).
There is some speculation about the cause in the ARP thread "Work available". See e.g. alanb1951’s post there.
Cheers,
Mark
----------------------------------------
[Edit 1 times, last edit by MJH333 at May 18, 2023 7:54:04 AM]
[May 18, 2023 7:52:03 AM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 1317
Re: WORKUNIT (2 tasks) Pending Validation for 6+ days

Hi, Dr Who Fan - welcome to the [current] world of ARP1 at WCG...
"Is there a problem with the validators for ARP tasks?"
As Mark said, yes! I suspect that there are various interlocked reasons for the current blockage, but it's nothing new [see below :-)] and the WCG team seems to be on the Red Queen's race track[1], so we don't get frequent feedback...
"In the many years I have been doing WCG I have never seen such a long backlog to validate a completed workunit."
I presume that means you've not been a frequent runner of ARP1 tasks over the last 9 months or so :-) -- there have been frequent occasions where quantities of tasks have ended up in "PVal jail", and on at least one occasion validated tasks weren't being fully assimilated (the latter probably indicating that the file deleter wasn't active at the time...)

I've been using the WCG APIs to track what happens to all the tasks for each ARP1 workunit I process since the "migration" so I've got lots of data (some of it not in particularly "user friendly" formats!) and have been watching tasks that take far too long to get a quorum and noting [most of the] tasks stuck in PVal jail[2]... I may not know why things are happening, but I sure as heck can see them happening :-)
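
For anyone who wants to keep a similar watch list, here is a minimal sketch of the "how long has this been stuck?" check. It does not call any WCG API; it assumes you have already recorded each task's name, status and return time somewhere, and the record layout, the function name `stuck_in_pval` and the 5-day threshold are all made up for illustration. The two sample rows are the tasks quoted in the first post, and the "now" value assumes the forum timestamp on that post is UTC, which may not be the case.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical record layout: (task name, status, time the result was returned, UTC).
# The two sample rows are the tasks quoted in the first post of this thread.
RECORDS = [
    ("ARP1_0026012_139_0", "Pending Validation", "2023-05-12 13:06:58"),
    ("ARP1_0026012_139_1", "Pending Validation", "2023-05-11 01:19:10"),
]


def stuck_in_pval(records, now, threshold=timedelta(days=5)):
    """Return (name, waiting time) for tasks pending validation longer than threshold."""
    stuck = []
    for name, status, returned in records:
        if status != "Pending Validation":
            continue
        returned_at = datetime.strptime(returned, "%Y-%m-%d %H:%M:%S").replace(
            tzinfo=timezone.utc
        )
        waiting = now - returned_at
        if waiting > threshold:
            stuck.append((name, waiting))
    # Longest wait first.
    return sorted(stuck, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Assumed "now": roughly when this thread was started (time zone treated as UTC).
    now = datetime(2023, 5, 18, 6, 1, 49, tzinfo=timezone.utc)
    for name, waiting in stuck_in_pval(RECORDS, now):
        print(f"{name}: waiting {waiting.days} days {waiting.seconds // 3600} hours")
```

Measuring from each task's own return time is a simplification; a workunit cannot validate until the quorum is complete, so the later of the two return times may be the fairer starting point.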

Looking as far back as the end of August 2022, quite a few WUs were stuck in PVal jail for well over a week... Here's one of my notes at the time:
ARP1_0032608_114 is an "Extreme" unit, and although all three results were returned within 24 hours (well inside the 36-hour deadline) they ended up in PVal jail and were still in that state 11+ days later...
This was a persistent theme through September as well; it seemed to have calmed down by early October...

Since then there have been long periods where no ARP1 work has been available for one reason or another (some explained, some unexplained...), and I didn't see any more PVal issues severe enough to bother noting them until after the March/April 2023 outage...

But now, I've started recording PVal issues again, and it seems to be at least as bad as any previous such event. Some WUs do get validated, but in my case the occasional retries I get to run are actually increasing the PVal count despite those escapees...

Cheers - Al.

[1] In Lewis Carroll’s Through the Looking-Glass, the Red Queen tells Alice, “Now, here, you see, it takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!” -- I prefer the shortened version, "running like **** just to stand still...", and I had times like that before I retired :-)

[2] Unfortunately, working out exactly when a work-unit validated is made complicated by the nature of those APIs -- one only lets me track my own results, but if I get lucky I can deduce the validation time; the other is the API underpinning the displays on the new-format web site so it doesn't return as much information about each result.
[May 18, 2023 11:49:41 AM]