World Community Grid Forums
Thread Status: Active Total posts in this thread: 159
bfmorse
Senior Cruncher US Joined: Jul 26, 2009 Post Count: 297 Status: Offline |
Waiting to be sent issue back?
ARP1_0034300_140
Project name: Africa Rainfall Project
Created: Nov. 9, 2024 - 18:29 UTC
Name: ARP1_0034300_140
Minimum Quorum: 2
Replication: 2

| Result name | OS type | OS version | Status | Sent time | Time due / Return time | CPU time / Elapsed time | Claimed credit / Granted credit |
|---|---|---|---|---|---|---|---|
| ARP1_0034300_140_0 | Linux Ubuntu | Ubuntu 20.04.6 LTS [5.4.0-196-generic\|libc 2.31] | Error | 2024-11-09 18:45:11 UTC | 2024-11-09 18:47:17 UTC | | |
| ARP1_0034300_140_1 | Microsoft Windows | 10 Core x64 Edition, (10.00.22631.00) | User Aborted | 2024-11-09 18:52:39 UTC | 2024-11-11 16:19:58 UTC | | |
| ARP1_0034300_140_2 | Microsoft Windows | 7 Home Basic x64 Edition, Service Pack 1, (06.01.7601.00) | Error | 2024-11-09 18:56:27 UTC | 2024-11-20 13:28:51 UTC | | 475.5 / 0 |
| ARP1_0034300_140_3 | Microsoft Windows | 10 Professional x64 Edition, (10.00.19045.00) | Error | 2024-11-16 10:48:36 UTC | 2024-11-19 10:48:45 UTC | | 475.5 / 0 |
| ARP1_0034300_140_4 | Microsoft Windows | 10 Professional x64 Edition, (10.00.19045.00) | No Reply | 2024-11-19 10:49:07 UTC | 2024-11-22 10:49:07 UTC | | |
| ARP1_0034300_140_5 | Microsoft Windows | 11 Core x64 Edition, (10.00.22631.00) | Pending Validation | 2024-11-20 13:29:05 UTC | 2024-11-20 23:28:30 UTC | 9.7 / 9.75 | 499.8 / 0 |
| ARP1_0034300_140_6 | | | Waiting to be sent | | | | |
||
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 952 Status: Offline |
And here is a Linux example (workunit 627144075), which also shows the [hopefully rare] downside of having given "time due" a boost.

I've removed some columns to fit the basic data on the screen...

Result Name   OS Type   Status   Sent time (UTC)   Time due/returned

My wingman monitoring script detected a couple of boosts to the _1 result's "time due", one of four days and one of three days. Judging by the deadline interval of 17 days, I missed an earlier one because I only had samples from 2024-11-08 onwards :-) I suspect that particular user had given up earlier (without aborting tasks or disconnecting, given that the result didn't "client abort"; it does happen...).

Perhaps subsequent boosts should only have been given if the interval wasn't already (say) 10 days? That said, changes to adjust the rate of ARP1 work release may well mean that the situation won't arise again (for which the WCG Tech Team will undoubtedly be very grateful!...)

Cheers - Al.

[Edit 1 times, last edit by alanb1951 at Nov 22, 2024 2:18:49 PM] |
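For anyone who wants to play with the idea, here is a minimal sketch of the two things mentioned in the post above: spotting boosts by comparing successive "time due" samples, and only granting a further boost while the sent-to-due interval is still under a cap. The function names, the 10-day cap and the timestamps are all assumptions made up for illustration; this is not the actual wingman monitoring script, nor how the WCG servers decide on boosts.

```python
from datetime import datetime, timedelta

# Hypothetical cap, taken from the "(say) 10 days" suggestion in the post.
MAX_INTERVAL = timedelta(days=10)


def detect_boosts(due_samples):
    """Return the positive jumps between successive "time due" samples."""
    return [
        later - earlier
        for earlier, later in zip(due_samples, due_samples[1:])
        if later > earlier
    ]


def capped_boost(sent, due, boost, max_interval=MAX_INTERVAL):
    """Apply a boost only if the sent-to-due interval is still below the cap."""
    if due - sent >= max_interval:
        return due  # already at or past the cap: no further boost
    return due + boost


# Made-up timestamps, not taken from workunit 627144075.
sent = datetime(2024, 11, 1, 12, 0)
samples = [
    sent + timedelta(days=6),
    sent + timedelta(days=10),  # a four-day boost
    sent + timedelta(days=13),  # a three-day boost
]
boosts = detect_boosts(samples)
print([b.days for b in boosts])  # prints [4, 3]
```

With the cap in place, the first boost (from 6 to 10 days) would still go through, but `capped_boost(sent, samples[1], timedelta(days=3))` would return the due time unchanged, because the interval has already reached 10 days.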
||
|
Elle
Cruncher Joined: Sep 30, 2013 Post Count: 2 Status: Offline |
Hello, I have completely stopped receiving new tasks from WCG for now. When can I safely resume?

[Edit 1 times, last edit by Elle at Nov 22, 2024 3:22:49 PM] |
||
|
maeax
Advanced Cruncher Joined: May 2, 2007 Post Count: 142 Status: Offline |
Win11 Pro: 7,000 tasks from Krembil without problems now (ARP1 and MCM1).
----------------------------------------
AMD Ryzen Threadripper PRO 3995WX 64-Cores/ AMD Radeon (TM) Pro W6600. OS Win11pro
|
||
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 1950 Status: Offline |
> All seems fine this morning, with two more ARP1 WUs probably finishing within the hour and nothing stuck on uploads or downloads. The only strange thing is that two WUs whose status shows as "no reply" are still running too, well beyond their normal processing time, about 4-5x slower than would be usual on those two hosts. And they are resends that errored before, so I'm not sure if there's something like a bad batch that is/was causing headaches too...
> Ralf

> @TPCBF What generation are those WUs from? Just wondering if they're in the Extreme range. If you recall, way back when IBM was managing this, some WUs in the extreme generations would fail/get stuck and they would have to halve the time slice for those grid squares for a period to get them through the problem.
> Cheers - Al.

Of those two machines I mentioned, the first one is in fact a 32-bit system (an old Windows Server 2003 that is still running one specific FileMaker Server application); the second one, however, is a newish 64-bit CAD/graphics system. So while that theory could possibly explain the vastly slower performance on the second system, it would not explain why this particular WU runs so much slower than all other 32-bit WUs on the 32-bit server...

Ralf |
||
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 1950 Status: Offline |
> Hello, I have completely stopped receiving new tasks from WCG for now. When can I safely resume?

Yesterday....

Ralf |
||
|
Elle
Cruncher Joined: Sep 30, 2013 Post Count: 2 Status: Offline |
OK, thank you.
|
||
|
danwat1234
Cruncher Joined: Apr 18, 2020 Post Count: 39 Status: Offline |
See, everything turned out fine, with a plan for long-term stability. The next time your (@everyone) machines dry up from WCG, add/resume Sidock, Rosetta, Denis in your BOINC; that is the easiest and best course for everyone to equalize load across biology projects. Or Folding. The goal is to efficiently pwn the available work units.
|
||
|
Link64
Advanced Cruncher Joined: Feb 19, 2021 Post Count: 129 Status: Offline |
> The next time your (@everyone) machines dry up from WCG, add/resume Sidock, Rosetta, Denis in your BOINC; that is the easiest and best course for everyone to equalize load across biology projects.

Or configure them (and/or other BOINC projects, since out of those three Sidock is the only one with reliable WU availability) as backup projects (0 resource share); then your BOINC client will automatically ask for work from them in case it runs out of WCG work. No babysitting of your machines required. |
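To make the backup-project behaviour a bit more concrete, here is a toy model of the decision described above: projects with a resource share of 0 are only asked for work once the regular projects have nothing queued. The `Project` class, its `queued_tasks` field and the function name are invented for this illustration; this is not BOINC's actual work-fetch code, just a rough sketch of the idea.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    resource_share: float  # 0 marks a backup project
    queued_tasks: int      # tasks currently on this machine (hypothetical field)


def projects_to_request_from(projects):
    """Toy model: use backup projects only when the regular ones have run dry."""
    regular = [p for p in projects if p.resource_share > 0]
    backup = [p for p in projects if p.resource_share == 0]
    if any(p.queued_tasks > 0 for p in regular):
        return [p.name for p in regular]
    # Nothing left from the regular projects: also ask the backups.
    return [p.name for p in regular + backup]


busy = [Project("WCG", 100, 3), Project("Sidock", 0, 0)]
dry = [Project("WCG", 100, 0), Project("Sidock", 0, 0)]
print(projects_to_request_from(busy))  # prints ['WCG']
print(projects_to_request_from(dry))   # prints ['WCG', 'Sidock']
```

The point of the 0 resource share is visible in the second call: the backup project only shows up once the WCG queue is empty, so nothing has to be added or resumed by hand.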
||
|
danwat1234
Cruncher Joined: Apr 18, 2020 Post Count: 39 Status: Offline |
@Link64 agreed!
November 26, 2024 https://www.cs.toronto.edu/~juris/jlab/wcg.html -- If they cannot relocate servers, the project will be down for nearly a month?

> IMPORTANT: We have been notified of an extended downtime at SHARCNET facility for construction lasting from December 9th, 2024 to January 3rd, 2025. There will be no power and no cooling during this time. We are exploring temporary migration to another site. We will provide an update on what downtime if any can be expected to start on December 9th, 2024.
>
> Overall, this upgrade should provide further improvements to the WCG capacity. Bandwidth has been improved thanks to hosting staff at SHARCNET. In addition, we have more and better hardware devoted to handling downloads and uploads, and a more competent load balancer.
>
> ARP1 will resume in limited quantities over the next few days. We will make an effort to focus on extremes as suggested in the forums and test the imposed rate limits on workunit production as well as total bandwidth of all clients and number of connections per client for ARP1 file transfer specifically.
>
> Forums were down earlier today for an extended period, they are back up now and we apologize for the slow |