Total posts in this thread: 143
nort
Cruncher
Joined: Mar 10, 2005
Post Count: 8
Re: Scheduled Maint. July 18, 14:00 UTC, extended?

IBM has done, and is doing, a remarkable job with WCG, and their technology has always been among the best in the world. Any carping is beneath dignity.

+1
OK, my team is "IBM", so you could say I'm not objective. Anyway, when you see the whole infrastructure put in place, free of charge for all crunchers and built on efficient technology, I think it deserves respect. Thousands of years of disease research have been saved with WCG, with you all, with me. Our kids and their kids will have fewer issues with their diseases, and for me that's enough to have full respect for IBM and for this WCG initiative of a few years ago. Cheers.
[Jul 19, 2017 10:33:10 PM]
pvh513
Senior Cruncher
Joined: Feb 26, 2011
Post Count: 260

Yes, IBM did a great job sponsoring WCG, but let's not lose track of reality here. When WCG was hosting the files in-house, it worked more or less flawlessly for years. Since they moved into the IBM cloud there have been two major meltdowns in short succession, the likes of which I cannot remember. I call that a big step back in reliability... And don't tell me this is a typical IT problem. It is not. We have a big 4 PB file server at my workplace and it has been running completely reliably ever since it arrived.
[Jul 19, 2017 10:38:13 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0

The filesystem is recovered and operational. We have enabled uploads, downloads and scheduler requests. Data is flowing in quickly.

Prior to restoring service, we updated the database to reset results that had been incorrectly processed during the outage (and more importantly we have added some scripts to prevent that start up from happening again). Also - we have extended deadlines to account for the delayed availability.

What about the trickle handler?
[Jul 19, 2017 10:46:27 PM]
MrHasselblad
Cruncher
Joined: Dec 20, 2014
Post Count: 42

Should be RESOLVED for most people now
[Jul 19, 2017 10:49:47 PM]
Nickers
Cruncher
Joined: Jan 3, 2007
Post Count: 10

Well, I have seen completed work units get uploaded. However, I am getting no new ones, and the message indicates no work is available for the three projects I am running. Perhaps all available units were immediately sucked from the queues. I am a bit concerned, as nothing shows up in the results status with a return date of the 19th.
[Jul 19, 2017 11:10:16 PM]
RTS48
Veteran Cruncher
Bolivia
Joined: Aug 2, 2009
Post Count: 1350

Hey - all uploaded, no Invalids but 370 WUs in PV Jail. I suppose you can't have everything.
Well done techs - our unsung heroes
----------------------------------------
Rod Peel
Santa Cruz
Bolivia
South America
[Jul 19, 2017 11:10:44 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 1951

After a long phone call with a client, I just noticed that things have started to move again.

It will probably be a while until things are back to normal, though.

Thanks, Keith et al., for fixing this again,

Ralf
[Jul 19, 2017 11:27:44 PM]
Crosster
Cruncher
France
Joined: Jul 30, 2011
Post Count: 29

Well, I have seen completed work units get uploaded. However, I am getting no new ones, and the message indicates no work is available for the three projects I am running.

Same behaviour for me with 2 projects (MCM and OET)
Edit: Fixed a few minutes later :)
Thanks to all the team!
----------------------------------------
[Edit 1 times, last edit by Crosster at Jul 19, 2017 11:36:46 PM]
[Jul 19, 2017 11:28:13 PM]
SekeRob
Master Cruncher
Joined: Jan 7, 2013
Post Count: 2741

The filesystem is recovered and operational. We have enabled uploads, downloads and scheduler requests. Data is flowing in quickly.

Prior to restoring service, we updated the database to reset results that had been incorrectly processed during the outage (and more importantly we have added some scripts to prevent that start up from happening again). Also - we have extended deadlines to account for the delayed availability.

Got an undocumented state in the API result pull: returned results that display as Pending Verification and Invalid have Outcome=0. Stopgap? The State for the Invalid on the RS pages is 'Other'... I can understand that piece, but PVer having ValidationState=4 I do not. (The API lists 14 with return times well before the outage, but I find only 2 on the RS pages; I suspect they too have an 'Other' state, but there are just too many at this time to easily find them. For the cosmetics part, the RS pages have no filter option for Other.)
[Jul 19, 2017 11:31:18 PM]
TonyEllis
Senior Cruncher
Australia
Joined: Jul 9, 2008
Post Count: 261

Problem might be over for some... but...

896 in PV
no work downloaded on some machines - none available - will start running dry
some machines with 'stuck' downloads (were in the process of downloading when WCG went down...)
WUs completed while WCG was down still sitting here waiting for upload on some machines

Thanks for the effort you guys at WCG have put in....
----------------------------------------
[Edit 1 times, last edit by TonyEllis at Jul 19, 2017 11:33:45 PM]
[Jul 19, 2017 11:32:19 PM]