Thread Status: Active
Total posts in this thread: 102
This topic has been viewed 4443 times and has 101 replies
PMH_UK
Veteran Cruncher
UK
Joined: Apr 26, 2007
Post Count: 769
Status: Offline
Re: Project Status (First Post Updated)

From https://www.cs.toronto.edu/~juris/jlab/wcg.html
"March 4, 2025

Services seem to be down. We are working on identifying and fixing the issue.
"
----------------------------------------
Paul.
[Mar 4, 2025 1:54:49 PM]
Hans Sveen
Veteran Cruncher
Norge
Joined: Feb 18, 2008
Post Count: 818
Status: Offline
Re: Project Status (First Post Updated)

New update:

https://www.cs.toronto.edu/~juris/jlab/wcg.html

BOINC db node crashed. Thus, all running BOINC services, API services and message queues that need to talk to db01 die similarly; the connection is closed, although the node itself is still running.
10:38 ET: Crash recovery starting now. We should be able to restart all the services soon.

12:21 pm ET: crash recovery successful; bounced all services; restarted the feeder; should start to see work going out again.
----------------------------------------
[Edit 1 times, last edit by Hans Sveen at Mar 4, 2025 5:36:57 PM]
[Mar 4, 2025 4:12:18 PM]
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 2139
Status: Offline
Re: Project Status (First Post Updated)

Well, it is what it is. Everything else is history.
A nice cup of tea is always appreciated. :-)
[Mar 4, 2025 4:54:16 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 937
Status: Offline
Re: Project Status (First Post Updated)

As per that last update, there is some work going out, but it all seems to be MCM1 retries at the moment -- I got two dozen between 17:30 and 17:45 UTC across three systems, and another six (across two) at 18:20, but my other machines haven't seen any at all! Most of the time a request gets "committed to other platforms" (suggesting a server buffer full of retries for a different O/S) or "no tasks available"...

I hope this is just a feature of the order in which systems are recovered, rather than an indication of further problems :-)

Cheers - Al.
[Mar 4, 2025 7:27:00 PM]
Hans Sveen
Veteran Cruncher
Norge
Joined: Feb 18, 2008
Post Count: 818
Status: Offline
Re: Project Status (First Post Updated)

Hi!
I just got some ARPs, from gen 141 and not resends, so new work is coming!

Hans
[Mar 4, 2025 8:06:35 PM]
MJH333
Senior Cruncher
England
Joined: Apr 3, 2021
Post Count: 265
Status: Offline
Re: Project Status (First Post Updated)

Hi Al,
I started getting new MCM1 work just before 20:00 UTC.
Cheers,
Mark
[Mar 4, 2025 8:13:09 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 937
Status: Offline
Re: Project Status (First Post Updated)

Thanks must go out to "Tech Team" -- I just wish they had some reliable hardware to work with :-)

I also started getting new MCM1 work at about 20:15 UTC. Unlike Hans, however, I've only got one new ARP1 (at about 20:50), though it is a generation 131 task, so I should be grateful! (Cell 34392 -- I also processed this one at generation 112 on 2025-01-28 so that's moved along reasonably well.)

Cheers - Al.
[Mar 4, 2025 11:12:59 PM]
hchc
Veteran Cruncher
USA
Joined: Aug 15, 2006
Post Count: 792
Status: Offline
Re: Project Status (First Post Updated)

The end-of-day stats run for March 4 didn't run, and generations.txt and state.txt are blank screens.
----------------------------------------
  • i5-7500 (Kaby Lake, 4C/4T) @ 3.4 GHz
  • i5-4590 (Haswell, 4C/4T) @ 3.3 GHz
  • i5-3570 (Ivy Bridge, 4C/4T) @ 3.4 GHz

[Mar 5, 2025 5:34:37 AM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 937
Status: Offline
Re: Project Status (First Post Updated)

"The end-of-day stats run for March 4 didn't run, and generations.txt and state.txt are blank screens."
I've picked up today's generations.txt and interpolated the contents of the missing file with a little script I have that looks at the unit movements in and out of each generation; it can deduce what the "units in generation" counts should be by working backwards, then it can deduce the missing completed unit data by working forwards from the last valid file. (Of course, this only works if there's only one day to fill in!)

I then ran my normal daily ARP1 activity script with the constructed file and it found no inconsistencies...
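
For readers curious about the "working backwards" step, here is a minimal sketch in Python. It is not Al's script: the layout of generations.txt is not shown in this thread, so the dictionary layouts, the function name `counts_before` and the sample numbers are all hypothetical. It also assumes a unit only ever advances by one generation at a time.

```python
from collections import Counter

def counts_before(counts_after, moves):
    """Undo one day's moves to recover the per-generation unit counts
    at the start of that day.

    counts_after: {generation: units in that generation at end of day}
    moves: {(g, g + 1): units that advanced from g to g + 1 that day}
    Both layouts are assumptions, not the real generations.txt format.
    """
    before = Counter(counts_after)
    for (src, dst), n in moves.items():
        before[dst] -= n  # these units had not yet arrived in dst
        before[src] += n  # they were still sitting in src
    if any(v < 0 for v in before.values()):
        raise ValueError("moves are inconsistent with the end-of-day counts")
    return {g: v for g, v in sorted(before.items()) if v}

# Made-up numbers, purely for illustration.
end_of_day = {140: 5, 141: 10, 142: 7}
day_moves = {(140, 141): 3, (141, 142): 4}
start_of_day = counts_before(end_of_day, day_moves)
print(start_of_day)
```

Because every move only shifts a unit from one generation to the next, the total number of units is the same before and after, which gives a cheap consistency check on any reconstructed file.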

In case anyone wants the numbers, here's what my activity script reported about activity on the last two days -- hopefully, Mike Gibson won't think I'm trying to muscle in on his reporting territory :-)

2025-03-04:
3 units from 128 to 129
2 units from 129 to 130
1 unit from 130 to 131
2 units from 131 to 132
1 unit from 134 to 135
2 units from 137 to 138
12 units from 138 to 139
59 units from 139 to 140
149 units from 140 to 141
432 units from 141 to 142
----
663 units in all

2025-03-05:
1 unit from 127 to 128
4 units from 129 to 130
3 units from 130 to 131
2 units from 131 to 132
2 units from 132 to 133
2 units from 134 to 135
6 units from 135 to 136
2 units from 136 to 137
5 units from 137 to 138
12 units from 138 to 139
58 units from 139 to 140
193 units from 140 to 141
964 units from 141 to 142
----
1254 units in all
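
Anyone who wants to re-check the two totals above can do so with a few lines of Python. This is just an illustrative sketch: the regular expression and the name `total_units` are made up for this example, and the input is simply the report text as posted.

```python
import re

# Matches lines such as "3 units from 128 to 129" or "1 unit from 130 to 131".
MOVE = re.compile(r"(\d+) units? from (\d+) to (\d+)")

def total_units(report):
    """Sum the unit counts on every 'N unit(s) from A to B' line."""
    return sum(int(m.group(1)) for m in MOVE.finditer(report))

march_4 = """
3 units from 128 to 129
2 units from 129 to 130
1 unit from 130 to 131
2 units from 131 to 132
1 unit from 134 to 135
2 units from 137 to 138
12 units from 138 to 139
59 units from 139 to 140
149 units from 140 to 141
432 units from 141 to 142
"""

march_5 = """
1 unit from 127 to 128
4 units from 129 to 130
3 units from 130 to 131
2 units from 131 to 132
2 units from 132 to 133
2 units from 134 to 135
6 units from 135 to 136
2 units from 136 to 137
5 units from 137 to 138
12 units from 138 to 139
58 units from 139 to 140
193 units from 140 to 141
964 units from 141 to 142
"""

print(total_units(march_4))  # 663
print(total_units(march_5))  # 1254
```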

Cheers - Al.

P.S. If Adri or Mike (or anyone else) sees this and has a genuine copy of the 2025-03-04 generations.txt I'd be happy to get confirmation (or otherwise) of the accuracy of the above :-)
[Mar 5, 2025 10:36:29 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2148
Status: Offline
Re: Project Status (First Post Updated)

"P.S. If Adri or Mike (or anyone else) sees this and has a genuine copy of the 2025-03-04 generations.txt I'd be happy to get confirmation (or otherwise) of the accuracy of the above :-)"

The only genuine copies of yesterday's files that I have, I'm afraid, are empty, Al.

Adri
----------------------------------------
[Edit 1 times, last edit by adriverhoef at Mar 6, 2025 12:03:34 AM]
[Mar 6, 2025 12:02:48 AM]