knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Planned Maintenance - Tuesday, July 7th 15:00 UTC [Completed]

We will be applying the latest updates to our Linux servers tomorrow. These updates require a reboot of the servers, so there will be periodic outages of the website and the BOINC grid while they are applied. The updates should take about 4 hours to complete.
----------------------------------------
[Edit 1 times, last edit by knreed at Jul 9, 2009 3:41:06 PM]
[Jul 6, 2009 4:37:00 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Re: Planned Maintenance - Tuesday, July 7th 15:00 UTC

This update is taking longer than expected and is rougher than expected. We have had some interesting issues with stale NFS mounts locking up entire systems and crashing the web servers. We are working through this now.
[Jul 7, 2009 9:54:29 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Re: Planned Maintenance - Tuesday, July 7th 15:00 UTC

We are now waiting for the servers to complete a fsck on the storage. This will take a bit of time.
[Jul 7, 2009 10:21:52 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Re: Planned Maintenance - Tuesday, July 7th 15:00 UTC

The BOINC grid is now back on-line.
[Jul 8, 2009 1:21:42 AM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Re: Planned Maintenance - Tuesday, July 7th 15:00 UTC

Another update for the members. We are still performing fscks on one of the back-end servers. This server has two 2.5 TB storage partitions and is the main location for working with the research data before and after it has been run on the grid. The fscks that were run yesterday (and late into the night) reported errors on the file systems (incorrect inode counts). This is not unusual for large, heavily used file systems, and we do not expect to have lost any data. However, it does mean that we need to re-run the fscks in correction mode, which will again take a long time.
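
For readers unfamiliar with the two-pass pattern described above (a report-only check, then a re-run in correction mode), here is a minimal Python sketch. It assumes an ext-family file system checked with e2fsck; the thread does not say which file system or which tool options were actually used, and the helper names are made up for illustration.

```python
import subprocess

# Bit in e2fsck's exit status meaning "file system errors left uncorrected".
ERRORS_UNCORRECTED = 4

def check_command(device):
    """Report-only pass: -f forces a full check, -n answers 'no' to every fix."""
    return ["e2fsck", "-f", "-n", device]

def repair_command(device):
    """Correction pass: -y answers 'yes' to every proposed fix."""
    return ["e2fsck", "-f", "-y", device]

def needs_repair(exit_status):
    """True if the report-only pass found errors that it did not fix."""
    return bool(exit_status & ERRORS_UNCORRECTED)

def check_then_repair(device):
    """Run the report-only pass, then the correction pass only if needed.

    The device must be unmounted; the path passed in is the caller's choice.
    """
    status = subprocess.run(check_command(device)).returncode
    if needs_repair(status):
        status = subprocess.run(repair_command(device)).returncode
    return status
```

Each pass walks the whole file system, so on partitions of this size the correction pass costs roughly as much time as the first check did.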

We have designed the system so that this back-end server can be offline for somewhere between 36 and 48 hours. We estimate that we have about 15 hours before we hit this window. At that time we will run out of work to distribute to the members.

We expect to be able to complete the fsck prior to that time and load new work into BOINC.

We will continue to keep the membership up to date about our progress.
[Jul 8, 2009 1:37:50 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Re: Planned Maintenance - Tuesday, July 7th 15:00 UTC

We expect to be able to start loading new work into BOINC via our standard methods within 3-4 hours. However, in case things do not go as expected we are setting up some alternative arrangements.

The Help Conquer Cancer project has a lot of work at their site ready for us to process. We are setting up an alternate route to load this work into BOINC, and we are going to increase the distribution weight of Help Conquer Cancer as a result. This will extend the length of time for which we expect to have work available to send for all projects.

As soon as we are able to load additional work for all projects, we will revert to normal weights and methods.
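
To illustrate why raising one project's weight stretches the remaining work for the others, here is a rough Python sketch of weight-proportional distribution. It is not the actual World Community Grid scheduler; the project names other than Help Conquer Cancer and all of the weight values are hypothetical.

```python
from fractions import Fraction

def distribution_shares(weights, has_work):
    """Return each project's share of sent work, proportional to its weight.

    Projects with nothing left to send are skipped, and the remaining
    weights are renormalized so the shares still sum to 1.
    """
    active = {
        name: weight
        for name, weight in weights.items()
        if has_work.get(name, False) and weight > 0
    }
    total = sum(active.values())
    if total == 0:
        return {}
    return {name: Fraction(weight, total) for name, weight in active.items()}

# Hypothetical integer weights, for illustration only.
normal = {"Help Conquer Cancer": 1, "Project B": 1, "Project C": 2}
boosted = dict(normal, **{"Help Conquer Cancer": 4})
has_work = {"Help Conquer Cancer": True, "Project B": True, "Project C": True}

normal_shares = distribution_shares(normal, has_work)
boosted_shares = distribution_shares(boosted, has_work)
```

With these made-up numbers, the boost raises Help Conquer Cancer from 1/4 to 4/7 of the work sent, so the other projects' remaining work units are handed out more slowly and last longer.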
[Jul 8, 2009 2:50:27 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Re: Planned Maintenance - Tuesday, July 7th 15:00 UTC

The fsck is about 75% done, so we have a bit more waiting to do.
[Jul 8, 2009 7:04:14 PM]
uplinger
Former World Community Grid Tech
Joined: May 23, 2005
Post Count: 3952
Re: Planned Maintenance - Tuesday, July 7th 15:00 UTC

Another update...

The scan of the disks is closer to being done: it has moved from stage 1 to stage 2 of the fsck, which is good news.

The bad news is that we have run out of work for CEP1 on the loader, which means that if you have only CEP1 selected, you will receive a message saying no new work is available.

Thank you for being patient,
-Uplinger
[Jul 8, 2009 10:16:42 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Re: Planned Maintenance - Tuesday, July 7th 15:00 UTC

The fscks have finally finished and we are loading additional work for all research projects. We are setting the weights of the projects back to normal.
[Jul 9, 2009 3:28:35 AM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Re: Planned Maintenance - Tuesday, July 7th 15:00 UTC

This is finally over. Operations are now restored to normal.
[Jul 9, 2009 3:41:28 PM]