Total posts in this thread: 157
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12364
Status: Offline
Re: Project Status (First Post Updated)

One of my machines now has spare threads!

Mike
[Mar 30, 2025 1:29:01 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7663
Status: Offline
Re: Project Status (First Post Updated)

Now it has changed to "feeder not running"

Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Mar 30, 2025 3:23:07 PM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 954
Status: Recently Active
Re: Project Status (First Post Updated)

Thanks for all the updates.

Database issues on the weekend are the worst. Hard to know if it is due to a software or hardware issue. They have had issues with both lately.
[Mar 30, 2025 3:35:54 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12364
Status: Offline
Re: Project Status (First Post Updated)

All reported but no downloads.

Mike
[Mar 30, 2025 8:39:09 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12364
Status: Offline
Re: Project Status (First Post Updated)

A few downloads.

Mike
[Mar 30, 2025 9:19:43 PM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 954
Status: Recently Active
Re: Project Status (First Post Updated)

I've gotten some MCM WUs, but I haven't gotten any ARP WUs yet.

Big thanks to the tech team for fixing the issues so quickly on a Sunday morning.
[Mar 30, 2025 9:24:05 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 953
Status: Offline
Re: Project Status (First Post Updated)

Approximate timeline for the latest service break, based on my failed API calls and BOINC logs ...

  • Database server went away between 05:55 and 06:00 UTC on 2025-03-30.
  • Scheduler started reporting "feeder not running" around 08:30 UTC
  • Feeder restarted around 18:45 UTC (corrected from 19:45) -- only retries available
  • New work started to appear about an hour later -- lots of transient HTTP errors at first
The upload servers appeared to be available for the duration, but there would have been a large backlog of tasks waiting to be reported when the scheduler woke up again!
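As a rough cross-check of the timeline above, the elapsed times can be worked out with a short Python sketch. It assumes the feeder restart reads as 18:45 UTC (the daylight-saving-corrected value), takes 06:00 UTC as the latest estimate for the database loss, and places "about an hour later" at 19:45 UTC. The event labels, the `at` helper and the `elapsed_since_start` function are illustrative names only, not anything taken from BOINC or WCG.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
# WCG server time at the date of the outage, per the post: UTC-4 (EDT).
EDT = timezone(timedelta(hours=-4), "EDT")


def at(hhmm):
    """Return a UTC datetime on 2025-03-30 for an 'HH:MM' string."""
    hour, minute = map(int, hhmm.split(":"))
    return datetime(2025, 3, 30, hour, minute, tzinfo=UTC)


# Approximate events from the timeline; all times are estimates.
events = [
    ("database server gone (latest estimate)", at("06:00")),
    ("scheduler reports 'feeder not running'", at("08:30")),
    ("feeder restarted (assumed 18:45 reading)", at("18:45")),
    ("new work appears (about an hour later)", at("19:45")),
]


def elapsed_since_start(timeline):
    """Pair each event label with the time elapsed since the first event."""
    start = timeline[0][1]
    return [(label, when - start) for label, when in timeline]


if __name__ == "__main__":
    for (label, when), (_, delta) in zip(events, elapsed_since_start(events)):
        local = when.astimezone(EDT)
        print(f"{when:%H:%M} UTC / {local:%H:%M} EDT  +{delta}  {label}")
```

Under those assumptions the gap from losing the database to new work appearing comes out at a little under 14 hours, and the database loss falls in the small hours of Sunday morning in EDT.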

The HTTP errors eased after less than 30 minutes -- I suspect that the majority of hosts that needed to re-fetch the [big] MCM1 master file (which is still not a sticky file) had succeeded by then!

At present, WCG is on UTC-4 (EDT), so they were working on this on a Sunday :-) -- many thanks for the effort (this former tech person appreciates what weekend working actually means! [Edit: - I see Unixchick agrees; overlapping posts!])

Cheers - Al.

P.S. This outage was an almost exact match for the one on 2025-03-04; I'm relieved (and perhaps a little surprised) that we haven't seen a repeat until now :-)

[Edit: forgot to correct one of the times for daylight saving!]
----------------------------------------
[Edit 2 times, last edit by alanb1951 at Mar 30, 2025 9:56:49 PM]
[Mar 30, 2025 9:32:16 PM]
TLD
Veteran Cruncher
USA
Joined: Jul 22, 2005
Post Count: 804
Status: Offline
Re: Project Status (First Post Updated)

The tech team has been doing a great job coming in on the weekend lately.
[Mar 30, 2025 9:33:53 PM]
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 2165
Status: Offline
Re: Project Status (First Post Updated)

I made it through the outage, with work still in the cache. But only because I upped my cache before the weekend.
[Mar 30, 2025 11:23:35 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2156
Status: Offline
Re: Project Status (First Post Updated)

Al said:
Approximate timeline for the latest service break, based on my failed API calls and BOINC logs ...

  • Database server went away between 05:55 and 06:00 UTC on 2025-03-30.
  • Scheduler started reporting "feeder not running" around 08:30 UTC
  • Feeder restarted around 18:45 UTC (corrected from 19:45) -- only retries available
  • New work started to appear about an hour later -- lots of transient HTTP errors at first

That's (2025-03-30T19:45:19) exactly the same time that 150 ARP1 workunits were distributed. Coincidentally, I caught three of them, that's 2%. :-D It took one of my devices 7 minutes to download (all of the files from) one of the tasks (with transient HTTP errors; normally it takes 3 minutes without HTTP errors).

Adri
PS Thanks to the team for resolving another outage. :-)
[Mar 31, 2025 12:41:27 AM]