World Community Grid Forums
Thread Status: Active | Total posts in this thread: 599
Grumpy Swede
Master Cruncher | Sweden | Joined: Apr 10, 2020 | Post Count: 2518 | Status: Offline
If no more work is being sent out, I will run dry late on Sunday.
----------------------------------------
[Edited 1 time, last edit by Grumpy Swede at Nov 7, 2025 3:11:30 PM]
TPCBF
Master Cruncher | USA | Joined: Jan 2, 2011 | Post Count: 2173 | Status: Offline
I'm getting the "tasks are committed to other platforms" message and I've run dry. Well, actually I get "Server can't open database", which more likely points to the real issue at hand...
Hans Sveen
Veteran Cruncher | Norway | Joined: Feb 18, 2008 | Post Count: 995 | Status: Offline
Now I get "feeder not running"
dylanht
World Community Grid Tech | Joined: Jul 1, 2021 | Post Count: 35 | Status: Offline
Sorry for not catching this sooner. The feeder indeed died due to a lost DB connection. The database crashed waiting for a mutex again; the container is set to auto-restart, and this time crash recovery did not get stuck on a bad block in the Ceph placement group due to the pg being in a peering state, so the database is up. Hosting did say that during the Ceph maintenance they are conducting, they expect some operations to be slow. I'm briefly investigating whether I should adjust the mariadb config to crash less aggressively and allow a bit more room for this lock contention to resolve on its own when it happens. I should then be able to restart the feeder this afternoon and bounce anything that died along with its database connection.
As for the PV jail growing even for new batches, I have mainly been working on that, plus a one-off "just trust the filesystem" validator/assimilator program to handle the backlog, several bugs I found while investigating, and some poorly thought-out timeouts and other ill-advised config. So hopefully we are getting close to reconciling the backlog.
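For context on what "crash less aggressively" could mean in practice, here is a minimal sketch, assuming the server is MariaDB with InnoDB. The variable names are standard MariaDB options, but the values are purely illustrative and not the project's actual settings:

```ini
# Hypothetical my.cnf fragment (illustrative values only, not the project's config).
[mariadb]
# MariaDB deliberately aborts the server if InnoDB waits on an internal
# semaphore/mutex for longer than this many seconds (default 600). Raising it
# gives transient storage stalls, e.g. slow Ceph operations during maintenance,
# more time to clear before the server crashes itself.
innodb_fatal_semaphore_wait_threshold = 1200

# A separate knob: how long a transaction waits on a row lock before giving up.
# This only affects ordinary transaction lock contention, not internal mutex waits.
innodb_lock_wait_timeout = 120
```

The trade-off is that a higher fatal-wait threshold lets the server hang longer before it gives up, so it is a balance between riding out slow storage and detecting a genuinely stuck database promptly.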
Boca Raton Community HS
Senior Cruncher | Joined: Aug 27, 2021 | Post Count: 209 | Status: Offline
dylanht - Thanks for working on this!
alanb1951
Veteran Cruncher | Joined: Jan 20, 2006 | Post Count: 1324 | Status: Recently Active
Prompted by yet another mention of Ceph maintenance, a note about very brief periodic download issues I have been seeing since 2025-10-29...:
These usually seem to happen some time between 16:30 and 17:00 UTC or 18:10 and 18:50 UTC, though the first manifestation was a single [larger] set on 2025-10-29, when my fastest system was offered 9 tasks between 15:52 and 15:55 and every one failed! I'm also seeing wingmen having the same issues, usually around the same time. I'm not too bothered by what amounts to less than 2% of a normal day's workload, but if there is an explanation...
Cheers - Al.
----------------------------------------
[Edited 1 time, last edit by alanb1951 at Nov 7, 2025 7:14:29 PM]
amsanity
Cruncher | Joined: Nov 6, 2025 | Post Count: 3 | Status: Offline
Hi, I'm new.
When can I expect new tasks for my machine to crunch? It has been idle since this morning :)
Hans Sveen
Veteran Cruncher | Norway | Joined: Feb 18, 2008 | Post Count: 995 | Status: Offline
Hi amsanity!
Please read the 3rd post above yours, from Dylan 😊
Greetings and welcome to the Forum!
Hans S.
shanoaice
Cruncher | Joined: Nov 4, 2025 | Post Count: 2 | Status: Offline
I'm also kind of new here (I used to just use Science United). I can't seem to change my data sharing settings: when I click Save on that page it shows an error, and DevTools shows that the API call to save the data sharing preference returns a 403.
TPCBF
Master Cruncher | USA | Joined: Jan 2, 2011 | Post Count: 2173 | Status: Offline
"And the wheels go round and round..."