Thread Status: Active
Total posts in this thread: 9
This topic has been viewed 2379 times and has 8 replies
WCGAdmin
World Community Grid Admin
Joined: Jun 9, 2020
Post Count: 168
Status: Offline
Planned Maintenance on Tuesday, November 2 (Completed)

We are replacing two failed disk drives and performing some database maintenance activities.

https://www.worldcommunitygrid.org/about_us/viewNewsArticle.do?articleId=744
----------------------------------------
[Edit 1 times, last edit by caitilarkin at Nov 2, 2021 5:33:20 PM]
[Oct 28, 2021 2:20:04 PM]
F4UCorsair
Cruncher
Joined: Feb 3, 2009
Post Count: 7
Status: Offline
Re: Planned Maintenance on Tuesday, November 2

You're waiting 5 days to replace failed drives? So you're telling us that your servers and/or storage systems don't support hot swapping?
[Oct 28, 2021 6:29:58 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: Planned Maintenance on Tuesday, November 2

The drives are part of a shared-nothing filesystem (IBM GPFS FPO). We have 55 drives in all and only 2 of them have failed. Many more would need to fail before I became concerned about a loss of data.

As for why we cannot perform a hot swap, I have been inquiring about that. The servers support it, but when I opened the ticket to get the failed disks replaced I was told that it wasn't possible. So I am trying to get an answer as to what is going on, but in the meantime we are getting the disks replaced while we do the database change since we would be down during that time anyway.
----------------------------------------
[Edit 1 times, last edit by knreed at Oct 28, 2021 7:59:15 PM]
[Oct 28, 2021 7:58:26 PM]
BobbyB
Veteran Cruncher
Canada
Joined: Apr 25, 2020
Post Count: 602
Status: Offline
Re: Planned Maintenance on Tuesday, November 2

After being down for 6 hours, I imagine that when the system comes up again it will get hammered by all those BOINC clients as they start their uploading and downloading. Will it handle all that?

Guess we should make sure our machines "load up" before the outage.
[Oct 29, 2021 5:00:39 PM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 858
Status: Offline
Re: Planned Maintenance on Tuesday, November 2

Will the forum stay up during the maintenance? I'm assuming not, since it says the website will be down.
[Nov 1, 2021 3:24:22 PM]
caitilarkin
Former World Community Grid Admin
USA
Joined: Nov 4, 2015
Post Count: 331
Status: Offline
Re: Planned Maintenance on Tuesday, November 2

"Will the forum stay up during the maintenance? I'm assuming not, since it says the website will be down."

The forum is generally down for all or part of the maintenance window.
[Nov 1, 2021 3:42:31 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: Planned Maintenance on Tuesday, November 2

This work has been completed and we are catching up on the backlog.

As for BobbyB's question: the peak load once we started up hit our configured limits for about 7-8 minutes (i.e. we hit the limit of 300 concurrent uploads), and during that time we were receiving about 1.1 Gbps of data. The load is now rapidly dropping back to normal levels and users should see no issues uploading or downloading work.*


* The one exception is that we are currently processing all of the returned results, which means that we are going to generate 6 hours' worth of resends in 20-40 minutes. As a result, over the next 30-40 minutes there will be times when users have trouble getting assigned work during a scheduler request, because we might be clogged with work that needs a reliable host on a given platform. This will clear quickly and automatically, and we should be fully back to normal soon.
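
For anyone who wants a feel for the scale of those figures, here is a rough back-of-envelope sketch in Python. It is only an illustration, not how the servers measure anything: it assumes the 1.1 Gbps rate held steady for the whole 7-8 minutes, that all 300 upload slots were busy and shared the bandwidth evenly, and it uses decimal units (1 GB = 8 Gb). The function names are made up for this example.

```python
# Back-of-envelope arithmetic using only the figures quoted in the post.
# Assumptions (not from the post): constant rate, evenly shared slots,
# decimal units with 8 bits per byte.

def data_received_gb(rate_gbps: float, minutes: float) -> float:
    """Gigabytes received at a constant rate over the given number of minutes."""
    return rate_gbps * minutes * 60 / 8


def per_upload_mbps(rate_gbps: float, uploads: int) -> float:
    """Average megabits per second per upload slot if bandwidth is shared evenly."""
    return rate_gbps * 1000 / uploads


def resend_rate_multiplier(backlog_hours: float, drain_minutes: float) -> float:
    """How many times faster than normal resends are generated while catching up."""
    return backlog_hours * 60 / drain_minutes


if __name__ == "__main__":
    low, high = data_received_gb(1.1, 7), data_received_gb(1.1, 8)
    print(f"Data received at peak: roughly {low:.0f} to {high:.0f} GB")
    print(f"Average per upload slot: about {per_upload_mbps(1.1, 300):.1f} Mbps")
    print(
        "Resend generation rate: about "
        f"{resend_rate_multiplier(6, 40):.0f}x to {resend_rate_multiplier(6, 20):.0f}x normal"
    )
```

Under those assumptions the burst works out to a few tens of gigabytes, and the resend backlog is being generated at many times the usual rate, which is consistent with the scheduler being briefly clogged.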
[Nov 2, 2021 5:06:07 PM]
BobbyB
Veteran Cruncher
Canada
Joined: Apr 25, 2020
Post Count: 602
Status: Offline
Re: Planned Maintenance on Tuesday, November 2

Out of curiosity, at what time did the system come back up?
My first upload was at:
2021-11-02 12:49:40 | World Community Grid | Started upload of MCM1_0183919_4068_0_r1718798455_0

I am at UTC-4
[Nov 2, 2021 10:11:30 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: Planned Maintenance on Tuesday, November 2

It came back up about 16:45 UTC. So you uploaded about 4 minutes after it became available again.
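
If you want to double-check the time zone arithmetic yourself, a small Python sketch like the one below does it. It takes the upload timestamp from the client log line quoted above, treats it as UTC-4 as stated, and treats "about 16:45 UTC" as exactly 16:45:00, which is an assumption since the restart time is only approximate.

```python
from datetime import datetime, timedelta, timezone

# Upload time from the client log, interpreted as UTC-4 (as the poster stated).
upload_local = datetime(2021, 11, 2, 12, 49, 40, tzinfo=timezone(timedelta(hours=-4)))

# Convert the local timestamp to UTC.
upload_utc = upload_local.astimezone(timezone.utc)

# Assumed exact restart time; the post only says "about 16:45 UTC".
back_up_utc = datetime(2021, 11, 2, 16, 45, 0, tzinfo=timezone.utc)

# Gap between the system coming back and the first upload.
delay = upload_utc - back_up_utc

print(upload_utc.isoformat())  # 2021-11-02T16:49:40+00:00
print(delay)                   # 0:04:40
```

So the first upload started a little over 4 minutes after the restart, give or take however approximate the 16:45 figure is.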
----------------------------------------
[Edit 1 times, last edit by knreed at Nov 2, 2021 10:19:36 PM]
[Nov 2, 2021 10:19:18 PM]