World Community Grid Forums
Thread Status: Active Total posts in this thread: 265
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2174 Status: Offline |
To understand what is going on, I found this:
"You have a high chan[c]e to run into soft lockups if you have no CPU cycles available. When I/O intensive tasks run, most of your CPU cycles are blocked contending to get an ack for the write() call."
http://lists.openstack.org/pipermail/openstack/2015-January/011089.html
I'm hoping that fixing the problem would be as easy as this:
https://unix.stackexchange.com/questions/354368/nmi-watchdog-bug-soft-lockup |
||
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
Funny, I hit on the exact same thread at SE somewhat earlier... I hope it does turn out to be a slap-to-the-forehead fix once it is pointed out.
----------------------------------------[Edit 1 times, last edit by SekeRob* at Jun 26, 2017 12:48:29 AM] |
||
|
littlepeaks
Veteran Cruncher USA Joined: Apr 28, 2007 Post Count: 748 Status: Offline |
Thanks for the timely info Kevin. I was worried that the problems were on my end. My ISP sent me an email a few days ago, telling me to reboot my gateway for a huge speed increase -- thought it might have had to do with that.
|
||
|
bluestang
Senior Cruncher USA Joined: Oct 1, 2010 Post Count: 272 Status: Offline |
Shout out to all men and women hard at work trying to figure this out, especially during the weekend.
---------------------------------------- |
||
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 1960 Status: Offline |
But it was the storage cluster that failed.

I am aware of that. But my statement still stands... There is no "on the fly" if there aren't any servers in the storage cluster running. If they went down hard and the filesystem wasn't unmounted cleanly, it will have to be checked, and then ALL the nodes (servers in the cluster) will need to see the same consistent view of the filesystem, and that is just the storage cluster backend. What the cluster presents to the other servers could be a whole different thing.

That is supposed to be the advantage of clustered storage servers and the associated distributed file systems. You will get degraded performance while all nodes in the cluster get "back on the same page", but there is no waiting for a file system check or the like, as with other/older/non-distributed file systems...

Ralf |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
How were the EOD updates run if the system is down?
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Are you guys running this on ZFS, or something less reliable?
|
||
|
Patrickkellysyduni
Cruncher Joined: Feb 21, 2016 Post Count: 1 Status: Offline |
Does that mean no uploading or downloading of tasks? All tasks are done but are stuck.
|
||
|
bfborden
Cruncher Joined: Sep 9, 2009 Post Count: 1 Status: Offline |
My tasks are still stuck trying to upload results. Is there an ETA?
|
||
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 1960 Status: Offline |
"Are you guys running this on ZFS, or something less reliable?"

I can't find the post/announcement right now, but there was a scheduled downtime a (couple of?) year(s) ago for the very purpose of moving the databases to ZFS...

Ralf |
||
|