Thread Status: Active
Total posts in this thread: 599
This topic has been viewed 46393 times and has 598 replies
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 2518
Status: Offline
Re: Project Status (First Post Updated)

If no more work is being sent out, I will run dry late on Sunday.
----------------------------------------
[Edit 1 times, last edit by Grumpy Swede at Nov 7, 2025 3:11:30 PM]
[Nov 7, 2025 3:10:52 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 2173
Status: Offline
Re: Project Status (First Post Updated)

I'm getting the "tasks are committed to other platforms" and I've run dry.
Well, I get "Server can't open database", which seems to indicate more likely the real issue at hand...
[Nov 7, 2025 4:46:16 PM]
Hans Sveen
Veteran Cruncher
Norge
Joined: Feb 18, 2008
Post Count: 995
Status: Offline
Re: Project Status (First Post Updated)

Now I get "feeder not running"
[Nov 7, 2025 5:51:18 PM]
dylanht
World Community Grid Tech
Joined: Jul 1, 2021
Post Count: 35
Status: Offline
Re: Project Status (First Post Updated)

Sorry for not catching this sooner. The feeder did indeed die due to a lost DB connection. The database crashed waiting for a mutex again. The container is set to auto-restart, and this time crash recovery did not get stuck on a bad block in the Ceph placement group due to the pg in a peering state, so the database is up. Hosting did say that they expect some operations to be slow during the Ceph maintenance they are conducting. I'm briefly investigating whether I should adjust the MariaDB config to crash less aggressively and allow a bit more room for this lock contention to resolve on its own when it happens. I should then be able to restart the feeder this afternoon and bounce anything that died with its database connection.

About PV jail growing even for new batches: I have mainly been working on this, on a one-off "just trust the filesystem" validator/assimilator program to handle the backlog, on several bugs I found while investigating, and on some poorly thought-out timeouts and other ill-advised config. So hopefully I am getting close to reconciling the backlog.
[Nov 7, 2025 6:18:16 PM]
Boca Raton Community HS
Senior Cruncher
Joined: Aug 27, 2021
Post Count: 209
Status: Offline
Re: Project Status (First Post Updated)

dylanht- Thanks for working on this!
[Nov 7, 2025 6:30:21 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 1324
Status: Recently Active
Re: Project Status (First Post Updated)

Prompted by yet another mention of Ceph maintenance, a note about very brief periodic download issues I have been seeing since 2025-10-29...:

These usually seem to happen some time between 16:30 and 17:00 UTC or 18:10 and 18:50 UTC, though the first manifestation was a single [larger] set on 2025-10-29 where my fastest system was offered 9 tasks between 15:52 and 15:55 and every one failed!

I'm also seeing wingmen having the same issues, usually around the same time.

I'm not too bothered by what amounts to less than 2% of a normal day's workload, but if there is an explanation...

Cheers - Al.
----------------------------------------
[Edit 1 times, last edit by alanb1951 at Nov 7, 2025 7:14:29 PM]
[Nov 7, 2025 7:11:48 PM]
amsanity
Cruncher
Joined: Nov 6, 2025
Post Count: 3
Status: Offline
Re: Project Status (First Post Updated)

Hi, I'm new.

When can I expect new tasks for my machine to crunch?
It has been idle since this morning :)
[Nov 7, 2025 7:43:14 PM]
Hans Sveen
Veteran Cruncher
Norge
Joined: Feb 18, 2008
Post Count: 995
Status: Offline
Re: Project Status (First Post Updated)

Hi amsanity!

Please read the 3rd post above yours, from Dylan 😊

Greetings and welcome to the Forum!

Hans S.
[Nov 7, 2025 7:58:06 PM]
shanoaice
Cruncher
Joined: Nov 4, 2025
Post Count: 2
Status: Offline
Re: Project Status (First Post Updated)

Also kind of new here (I used to just use Science United). I can't seem to change my data sharing settings. When I click Save on that page, it shows an error, and DevTools shows that the API call to save the data sharing preference returns a 403.
[Nov 7, 2025 8:03:43 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 2173
Status: Offline
Re: Project Status (First Post Updated)

"And the wheels go round and round..." :(
[Nov 7, 2025 8:09:57 PM]