World Community Grid Forums
Thread Status: Active Total posts in this thread: 81
bfmorse
Senior Cruncher US Joined: Jul 26, 2009 Post Count: 295 Status: Offline |
Thanks for the update TigerLily.
I was also wondering about the WUs listed as REJECTS. I have entries dating back to early November of last year, as shown below (color highlights on "STATUS" and "RETURN TIME" for clarity):

Results

| Result name | Device name | Status | Sent time | Time due | Return time | CPU time / Elapsed time | Claimed credit / Granted credit |
|---|---|---|---|---|---|---|---|
| MCM1_0211750_1649_2 | DESKTOP-6O7HS7C | Server Aborted | 2024-01-30 12:09:00 UTC | 2024-02-02 12:09:00 UTC | 2024-01-30 12:39:26 UTC | | |
| MCM1_0211671_9414_2 | DESKTOP-6O7HS7C | Server Aborted | 2024-01-29 16:07:37 UTC | 2024-02-01 16:07:37 UTC | 2024-01-29 16:32:01 UTC | | |
| MCM1_0211899_1292_1 | WCG2c04-GKJ31U9 | Error | 2024-01-26 11:18:44 UTC | 2024-02-01 11:18:44 UTC | 2024-01-27 04:41:30 UTC | 0.5 / 0.51 | 22.6 / 0 |
| MCM1_0211829_7963_0 | DESKTOP-76EJN7O | Error | 2024-01-25 20:36:44 UTC | 2024-01-31 20:36:44 UTC | 2024-01-26 0:05:43 UTC | 1.75 / 1.86 | 73.6 / 0 |
| MCM1_0209892_7418_0 | DESKTOP-CKVUFHC | User Aborted | 2023-12-28 08:37:58 UTC | 2024-01-03 08:37:58 UTC | 2023-12-28 10:06:03 UTC | | |
| MCM1_0209463_5138_1 | DESKTOP-76EJN7O | Error | 2023-12-18 05:19:36 UTC | 2023-12-24 05:19:36 UTC | 2023-12-18 16:19:06 UTC | 0.82 / 0.84 | 26.1 / 0 |
| MCM1_0207535_3291_1 | WCG3G06-S3VR7C3 | Error | 2023-11-05 17:45:36 UTC | 2023-11-11 17:45:36 UTC | 2023-11-05 18:06:00 UTC | 0 / 0 | 0.1 / 0 |
| MCM1_0207301_6747_1 | DESKTOP-UADN027 | Error | 2023-11-02 08:55:13 UTC | 2023-11-08 08:55:13 UTC | 2023-11-02 10:05:05 UTC | 0.78 / 0.79 | 20.8 / 0 |
| MCM1_0207301_6751_1 | DESKTOP-UADN027 | Error | 2023-11-02 08:55:13 UTC | 2023-11-08 08:55:13 UTC | 2023-11-02 10:05:05 UTC | 0.53 / 0.53 | 14.1 / 0 |
| MCM1_0207315_9710_1 | DESKTOP-UADN027 | Error | 2023-11-02 08:55:13 UTC | 2023-11-08 08:55:13 UTC | 2023-11-02 10:29:04 UTC | 1.23 / 1.24 | 32.8 / 0 |
| MCM1_0207315_9697_1 | DESKTOP-UADN027 | Error | 2023-11-02 08:55:13 UTC | 2023-11-08 08:55:13 UTC | 2023-11-02 10:29:04 UTC | 1.45 / 1.46 | 38.8 / 0 |
||
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 929 Status: Offline |
bfmorse - regarding those Errors and User Aborted tasks... None of the results from a WU will be removed until the workunit can be purged, so I reckon that's why those are still hanging around.
By the way, I run on such a short turn-around (small queues!) that I never see one of those errors where the return time is a nice neat 6 days after the sent time -- I presume those were what other BOINC systems would flag as "Not Started by Deadline" :-) Cheers - Al. |
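For anyone who wants to run the same check on their own result list, here is a rough Python sketch of the test Al describes: flag any result whose return time is a neat 6 days after its sent time. The three sample rows are copied from Bruce's list above. The function names and the one-minute tolerance are assumptions made for illustration, and reading such a match as "Not Started by Deadline" is Al's presumption rather than anything the server reports here.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

def parse_utc(stamp: str) -> datetime:
    """Parse a timestamp such as '2024-01-26 11:18:44 UTC'."""
    return datetime.strptime(stamp.replace(" UTC", ""), FMT)

def returned_at_deadline(sent: str, returned: str,
                         deadline_days: int = 6,
                         tolerance: timedelta = timedelta(minutes=1)) -> bool:
    """True if the result came back (almost) exactly deadline_days after it was sent.

    The tolerance is an assumed fudge factor, not a documented server value.
    """
    elapsed = parse_utc(returned) - parse_utc(sent)
    return abs(elapsed - timedelta(days=deadline_days)) <= tolerance

# (result name, sent time, return time) taken from the list earlier in the thread
rows = [
    ("MCM1_0211899_1292_1", "2024-01-26 11:18:44 UTC", "2024-01-27 04:41:30 UTC"),
    ("MCM1_0209463_5138_1", "2023-12-18 05:19:36 UTC", "2023-12-18 16:19:06 UTC"),
    ("MCM1_0207535_3291_1", "2023-11-05 17:45:36 UTC", "2023-11-11 17:45:36 UTC".replace("11-11 17:45:36", "11-05 18:06:00")),
]

# Names of results that look like they simply sat until the deadline
at_deadline = [name for name, sent, returned in rows
               if returned_at_deadline(sent, returned)]
print(at_deadline)
```

None of the sample rows match, since each of them was returned within a day of being sent.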
||
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 914 Status: Offline |
TigerLily had suggested that they might start running MCM assimilators on Monday. I haven't seen any motion in my own result list, but you guys can see more than I can. Anything happening?
|
||
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 929 Status: Offline |
> TigerLily had suggested that they might start running MCM assimilators on Monday. I haven't seen any motion in my own result list, but you guys can see more than I can. Anything happening?

I haven't seen any obvious signs of the other three [initial] assimilators having achieved anything regarding my results so far. This could be because all my relevant results that have been considered so far happen to be ones that'll get flagged up as problematic, but I somehow doubt that's the case (unless the volume of data problems is far greater than they were expecting...)

It might be that they fired one or more of the extras up and something went wrong -- I suspect there are some wrinkles that might not show up until working against the huge number of records in the production database. Then again, they might have been over-optimistic with their time scale and it simply hasn't happened yet.

Here's a snapshot of the state of play for my results at about 23:45 UTC on 2024-01-30, using the same script I used earlier in this thread. Presuming the delay before purge is still 24 hours, this should have counted everything marked as ready to purge over the previous 24 hours, and there's only stuff from the single assimilator...

Examined data collected on 2024-01-30

(In that 24 hour interval, about 180 results validated and about 50 were purged.)

Adri's larger dataset might produce a more definitive answer, but I fear it won't yet be positive :-(

Cheers - Al.

[Edit 1 times, last edit by alanb1951 at Jan 31, 2024 12:44:24 AM] |
||
|
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2145 Status: Offline |
Not yet. The tech team should still be monitoring the case, as it was mentioned that 48 hours were needed, if I'm understanding this correctly.
Adri |
||
|
bfmorse
Senior Cruncher US Joined: Jul 26, 2009 Post Count: 295 Status: Offline |
Alanb1951 - Thanks for your comments. I also run a small queue:
Workunit Cache Settings
Connect to network about every 0.06 days
Cache 0.0 extra days of work

Periodically, I get a lot of resends spread over my "farm". When home, I try to watch for them: I get those WUs to "running" by setting all others, as appropriate, to SUSPEND, then re-enable processing once all the resends are RUNNING. Since I list the WUs in "Time due" order, it is easy to spot those that may be delayed.

I just wish the system would process the WUs in "Time due" order, or at least allow us the option of choosing that processing order. If that option exists, I would appreciate knowing about it.

Thanks,
Bruce |
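To put those cache numbers in more familiar units, here is a small Python sketch that converts them to minutes. It is only arithmetic on the two values quoted in the post. The function and variable names are made up for illustration, and treating the sum of the two settings as the rough amount of work held is an assumption, not a description of the client's scheduler.

```python
def days_to_minutes(days: float) -> float:
    """Convert a BOINC-style 'days' setting to minutes."""
    return days * 24 * 60

connect_every_days = 0.06   # "Connect to network about every 0.06 days"
extra_days = 0.0            # "Cache 0.0 extra days of work"

# Assumed: the two settings together give a rough idea of how much work is held
buffer_minutes = days_to_minutes(connect_every_days + extra_days)
print(f"{buffer_minutes:.1f} minutes")  # prints "86.4 minutes"
```

So the queue described here amounts to a little under an hour and a half of work.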
||
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 929 Status: Offline |
Bruce,
As well as a small queue size, I also restrict the number of tasks that can be sent by capping the numbers in the WCG device profiles. I don't let any machine fetch more work than it can turn around in half a day... I initially used this as a defence mechanism because I also use max_concurrent options, which caused older clients to keep fetching unneeded work because of a bug!

Anyway, the end effect is that my clients never need to go into "panic mode" on WCG tasks; they return everything promptly (unless I have network problems!) so I don't ever need to resequence work... The down-side [if any] is that at system crisis times I run out of work, but I don't see that as a problem :-)

Cheers - Al.

P.S. Linux in use here; client 7.20.5 fixed the issue for me -- not sure about which Windows version actually fixed it... |
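The "no more than half a day of work" rule of thumb can be turned into a per-device task cap with a little arithmetic. The Python sketch below is one way to do that. The function name, the example thread count and the 1.5 hours per task are all hypothetical; only the 12-hour target comes from the post, and the result is a number you would enter in the device profile by hand, not something BOINC or WCG computes this way.

```python
import math

def task_cap(threads: int, hours_per_task: float,
             turnaround_hours: float = 12.0) -> int:
    """Largest number of tasks a machine can finish within turnaround_hours.

    Assumes every thread runs one task at a time and that tasks take
    roughly hours_per_task each (both simplifications).
    """
    if threads < 1 or hours_per_task <= 0:
        raise ValueError("threads must be >= 1 and hours_per_task > 0")
    return max(1, math.floor(threads * turnaround_hours / hours_per_task))

# Hypothetical machine: 8 threads, tasks averaging 1.5 hours each
cap = task_cap(threads=8, hours_per_task=1.5)
print(cap)  # prints 64
```

Passing `turnaround_hours=24` gives the looser one-day target some people prefer, and a cap equal to the thread count is the tightest setting that still keeps every thread busy.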
||
|
Grumpy Swede
Master Cruncher Svíþjóð Joined: Apr 10, 2020 Post Count: 2138 Status: Offline |
Still no purging of my Results list. It's Thursday here now, and TigerLily said "The new version of the assimilator should be up and running Monday"
But then he/she didn't say which Monday..... |
||
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7633 Status: Offline |
Bruce,
> Anyway, the end effect is that my clients never need to go into "panic mode" on WCG tasks; they return everything promptly (unless I have network problems!) so I don't ever need to resequence work... The down-side [if any] is that at system crisis times I run out of work, but I don't see that as a problem :-)

I agree with Alan on this. It is easy to cap the number of work units by the number in the profile. BOINC has a pretty good self-regulating mechanism to avoid going into panic mode if pretty much left alone, except in some really odd circumstances. My profiles are set to have a turnaround time of 24 hours or so. As Alan says, the downside is occasionally running out of work. With the really steady supplies for the moment, this has not been a problem.

Cheers
Sgt. Joe
*Minnesota Crunchers* |
||
|
bfmorse
Senior Cruncher US Joined: Jul 26, 2009 Post Count: 295 Status: Offline |
Thanks to all for your comments,
For the most part, I have adjusted the PROJECT LIMITS of MCM1 tasks (and others) so that they equal the number of threads (or less, depending on the WUs and the system's needs). Previously the limits were significantly higher, but the queue settings have remained unchanged and are described in a previous message above.

This reduction in PROJECT LIMITS appears to have eliminated my need to monitor my systems for RESENDS and tweak things to process those first.

Once the system starts processing additional projects, my queue may slightly increase. But I'll cross that bridge when I get to it.

Looking forward to having the quantity of my VALID WUs start dropping significantly; currently it is OVER 120,000.

I also have two sets of WUs PENDING VERIFICATION, each with two completed WUs and a third WAITING TO BE SENT. I can provide more info if needed.

Happy Crunching,
Bruce

Edit to correct typo.

[Edit 1 times, last edit by bfmorse at Feb 2, 2024 10:23:53 PM] |
||
|