Thread Status: Active
Total posts in this thread: 9
This topic has been viewed 2399 times and has 8 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
How did the server 'crash'

If I understand correctly, FAAH had an error that made WUs unavailable, and then the server crashed.
Could a community advisor or a tech please help me understand?


Please do not respond to this post
----------------------------------------
[Edit 4 times, last edit by Former Member at Jan 8, 2011 2:14:02 AM]
[Oct 15, 2010 10:58:09 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: How did the server 'crash'

"If I understand correctly, FAAH had an error that made WUs unavailable, and then the server crashed.
Could a community advisor or a tech please help me understand?"

Someone may choose to answer your question, but personally I hope not. The bank does not tell its creditors the whys and wherefores of every system failure. It's on a need-to-know basis and the crunchers are not employees. IMHO
[Oct 16, 2010 5:15:48 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: How did the server 'crash'

echooff,

One had about zero.zeroooooooooooo to do with the other. I think if you read up on the 2 cases, you can see that clearly: FAAH had a code line missing for the AutoGrid portion of AutoDock, while WUs are running fine.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Oct 16, 2010 5:42:48 AM]
martin64
Senior Cruncher
Germany
Joined: May 11, 2009
Post Count: 445
Status: Offline
Re: How did the server 'crash'

"It's on a need-to-know basis and the crunchers are not employees. IMHO"

astrolab,

that's exactly the reason why I think it should be the opposite. If I am an employee, I will have to accept the company's rules, whatsoever. But at WCG we are talking about volunteers, and I think they deserve to be informed about what happens to their valuable computer time. I think we have been extremely well informed by knreed during the outage.

echooff,

as Sekerob pointed out, the 2 cases have nothing to do with each other; it was pure coincidence. Things like that happen, be it humans who make mistakes (quite normal for people doing something, with the biggest mistake being to do nothing...), or systems that fail. You cannot build any technical system that has a failure probability of exactly 0%. Big respect to the WCG techs and researchers for the fact that these things happen so rarely on such a complex system. applause

Regards,
Martin
[Oct 16, 2010 7:36:19 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: How did the server 'crash'

"that's exactly the reason why I think it should be the opposite. If I am an employee, I will have to accept the company's rules, whatsoever"
But see, you don't. Same as a volunteer, you get to decide whether to provide your services based on the information that management DECIDES to provide.
"But at WCG we are talking about volunteers, and I think they deserve to be informed about what happens to their valuable computer time"
Valuable computer time. My investment is trivial compared to the multi-million investment by IBM. Trivial. You may have missed my post about how WCG is not a democracy.
"I think we have been extremely well informed by knreed during the outage"
And I too think we have been informed as much as we need to be.
[Oct 16, 2010 3:36:03 PM]
TimAndHedy
Senior Cruncher
Joined: Jan 27, 2009
Post Count: 267
Status: Offline
Re: How did the server 'crash'

"If I understand correctly, FAAH had an error that made WUs unavailable, and then the server crashed.
Could a community advisor or a tech please help me understand?"


Ignore the Cranky Old Men. It's good to try and understand these things.
[Oct 22, 2010 2:31:45 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: How did the server 'crash'

The Cranky Old Men had an answer from a Grumpy Old Man right at the top.

The FAAH task failures had nothing to do with the file system "incongruity"... a perfect system where the safety mechanism detected corruption, so it went into a "read only" state in self-preservation.

-- For FAAH, see the post by Dr. Perryman.
-- For the file-system failure, see the posts by knreed, who worked tirelessly with an SE (System Engineer) to bring the system back online. As he alluded to, that meant a full fsck and then a full copy over to a different storage area, so the suspect array could be taken out. In that process an undetermined number of pre-feeder tasks were found to be dodgy; these were flushed out by letting them go through and having the clients ditch them on receipt. No work returned by donors was lost.

The Grumpy Old Man read it all in the frequent updates provided to the volunteers. ;-)

-- SekeRob
[Oct 22, 2010 3:14:26 PM]
TimAndHedy
Senior Cruncher
Joined: Jan 27, 2009
Post Count: 267
Status: Offline
Re: How did the server 'crash'

You were not my first thought when I posted that, but you can qualify at times.

Your post was mixed in with the other, more obnoxious posts, so please don't be offended.

I don't like to see people's questions shot down in that manner.

Anyway, I doubt everyone here has administered Linux/Unix systems so what happened may not be obvious to them.
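
For anyone who has not administered such systems, the "read only" self-preservation state described earlier in the thread can be illustrated with a minimal sketch. It assumes a Linux host whose mount table uses the /proc/mounts layout (device, mount point, type, options); the thread does not say which operating system or file system WCG actually runs, and the device names and paths in the sample are made up. Some Linux file systems can be mounted with an option such as errors=remount-ro, so that a detected error flips the mount to read-only (the "ro" option) instead of risking further damage.

```python
def read_only_mounts(mounts_text):
    """Return the mount points whose option list contains 'ro'.

    mounts_text is expected to use the /proc/mounts layout:
    device, mount point, file-system type, comma-separated options, ...
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        # Exact match on 'ro', so 'errors=remount-ro' alone does not count.
        if "ro" in options:
            result.append(mount_point)
    return result


# Hypothetical sample; device names and paths are invented for illustration.
SAMPLE = """\
/dev/sda1 / ext3 rw,errors=remount-ro 0 0
/dev/sdb1 /data ext3 ro,errors=remount-ro 0 0
"""

if __name__ == "__main__":
    print(read_only_mounts(SAMPLE))
```

On a real Linux machine the same function could be fed the contents of /proc/mounts. A mount that unexpectedly shows up as read-only is usually the cue for an administrator to run fsck on it, which matches the recovery steps described in the posts above.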
[Oct 23, 2010 5:45:30 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: How did the server 'crash'

TimAndHedy,

Sometimes we (all inclusive) start writing replies and then we hit the reset button. I think we ALL can do with that approach or, as many do with email, save as draft first and let it sit before hitting the send button... when comments strike nerves.

Happy crunching.
[Oct 23, 2010 6:57:51 AM]