| World Community Grid Forums
Thread Status: Active Total posts in this thread: 129
Garrulus glandarius
Advanced Cruncher Romania Joined: Apr 10, 2025 Post Count: 88 Status: Offline Project Badges:
|
Seems like we ran out of tasks again. Last ones I got before midnight UTC were all resends.
Edit: well, my crying helped, got 3 more resends, but nothing fresh.
[Edit 1 times, last edit by Lanius collurio at May 25, 2025 6:12:24 AM] |
||
|
|
catchercradle
Senior Cruncher England Joined: Jan 16, 2009 Post Count: 167 Status: Offline Project Badges:
|
Yep, not even resends here. Usually I still get MCM tasks on my phone even when there is nothing for the PC, but even that has run out now.
|
||
|
|
Garrulus glandarius
Advanced Cruncher Romania Joined: Apr 10, 2025 Post Count: 88 Status: Offline Project Badges:
|
We're back in business!
||
|
|
Martin Schnellinger
Advanced Cruncher Joined: Apr 29, 2007 Post Count: 128 Status: Offline Project Badges:
|
Hello friends,
I have been observing the work of my machine for two weeks. Is it possible that the lack of WUs always appears on weekends? I am in the German time zone here, and there is always a shortage of WUs on Saturdays and Sundays. What about that?
Best wishes,
MS |
||
|
|
chiara.p
World Community Grid Admin, Mapping Cancer Markers Scientist Joined: Jul 15, 2020 Post Count: 29 Status: Offline |
We have an update on the Lab page (https://www.cs.toronto.edu/~juris/jlab/wcg.html):
MCM1 Workunit Availability:
There is a recurring problem where older servers/VMs in our private cloud lose their DHCP lease on all interfaces and effectively go down. When this crash coincides with a second issue that renders the coordinators unable to recover their quorum, the quorum of coordinator nodes that accept jobs and assign them to workers ("the scheduler") stops accepting job submissions. Building on the work we did to generate ARP1 and MAM1 workunits locally on WCG servers, we are migrating the MCM1 workunit delivery to Kubernetes, which should permanently resolve the issue and increase workunit supply overall. Initially we had planned to complete this work after the release of MAM1 7.05, which has been a significant refactor, but given the frequency of failures we are moving it up and will complete the migration this week.
MAM1 7.05 - Why is it Taking so Long?
MAM1 is being refactored to use LibTorch and to also run on NVIDIA GPUs. LibTorch vastly simplifies the checkpointing logic, and should resolve the unsafe memory access and the checkpointing and resume-from-checkpoint crashes seen in previous beta releases, as well as significantly improve performance. |
||
|
|
Garrulus glandarius
Advanced Cruncher Romania Joined: Apr 10, 2025 Post Count: 88 Status: Offline Project Badges:
|
Am I the only one not receiving any work on any of my rigs? People seem to be regularly updating the highest task # in the dedicated thread, but I have been struggling to get tasks the past 1-2 days and got no tasks for over 12 hours. Several tablets and phones are running dry and I'm not receiving anything on laptops either.
||
|
|
MJH333
Senior Cruncher England Joined: Apr 3, 2021 Post Count: 300 Status: Offline Project Badges:
|
"Am I the only one not receiving any work on any of my rigs?"
You are not alone. Have a look at Unixchick’s unofficial status page thread, here. |
||
|
|
brian163
Cruncher USA Joined: Aug 11, 2007 Post Count: 10 Status: Offline Project Badges:
|
I've tried a number of update requests. I'm getting nothing. :-(
No status updates on the Lab page. :-( ¯\_(ツ)_/¯ |
||
|
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 1294 Status: Offline Project Badges:
|
This is a server issue. No fresh WUs are flowing at the moment. You might get a resend if you are lucky.
|
||
|
|
wildhagen
Veteran Cruncher The Netherlands Joined: Jun 5, 2009 Post Count: 1004 Status: Offline Project Badges:
|
Is it just me, or is there no work anymore for either MCM1 or ARP1? None of my machines have received any new workunits since yesterday evening.
|
||
|
|
|