| World Community Grid Forums
Thread Status: Active Total posts in this thread: 72
keithhenry
Ace Cruncher Senile old farts of the world ....uh.....uh..... nevermind Joined: Nov 18, 2004 Post Count: 18667 Status: Offline Project Badges:
|
"Sorry, MCM is still disabled. We will get it back up and running as soon as we can. Thank you for your patience, -Uplinger"
Keith, a suggestion: when you do re-enable MCM1, let resends go out first until all of those are sent (if you don't already do this in circumstances like this). I'm seeing a number of WUs with "No Reply" and "Waiting to be sent" results. That should help minimize the number of WUs that you may have to manually intervene on wrt credit issues. |
||
|
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline Project Badges:
|
Hey Keithhenry,
Thanks for the suggestion. The default mechanism is already set to push resends to the top of the priority list, so those results will go out first. They will be sent to reliable hosts only, so it might take a bit to get back to normal work units, since we will be running the project slower to start and there is only a limited number of reliable hosts. This means that some machines will not get new work even though work is flowing again.

I will be cutting the priority in the feeder to 1/10th the size it was when we hit the server outage. This does not mean that the actual results per day will be 1/10th, because the feeder gets refilled every 5 seconds.

Technical example: Say that before we had 100 slots, and within that 5 seconds only 50 on average were claimed. This means the feeder only needed to refill 50. Cutting the slots down to 10 doesn't change the fact that there are 50 requests within that 5-second range, so only 10 of them can be filled per refill. Thus the project will appear to be running at 1/5th the speed.

@S_MDC_PROJECT, as you can see through our forums, we have stopped various projects temporarily over the past 10 years. CEP and FAAH have had technical reasons for being stopped as well. The reason we chose MCM over those others was the size of the result files coming back for MCM. We never like suspending a project for any reason.

Technical example: You have 2 projects running at once. They both average 5 hours of runtime per result (to make the example easy). Project A has result files of 1MB, Project B has result files of 10MB. Both projects return the same number of results per day, and the system is happy and sitting at 75% total storage use. This means that project A is using, say, 10% of the total storage while project B is using 65% of the total storage. Then project B starts having result files of 100MB. Even if you choose project B and it gets 100% of the grid time, you're looking at using 650% of your total storage when it gets to a steady state. But if you have project A run at 100% of the grid time, you're using only 20% of your total storage.

Please note that the numbers are all just examples, and our goal is to have all projects running at the same time. Whether they can all get an equal share is something we will be striving towards. Thanks, -Uplinger |
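For anyone who wants to check the arithmetic in the two technical examples above, here is a minimal sketch in Python. The function names and the simple multiplicative model are assumptions made for illustration; this is not World Community Grid's actual feeder or storage accounting, and the inputs are the example numbers from the post. In this simple model the 650% figure corresponds to project B keeping its original share of grid time; giving it a larger share would push the number higher still.

```python
# Illustrative sketch only: names and model are hypothetical, and the
# numbers are the example values from the post, not real project data.

def results_sent_per_refill(slots: int, requests: int) -> int:
    """Results handed out in one refill interval, capped by the open slots."""
    return min(slots, requests)


def steady_state_storage_pct(base_pct: float,
                             size_factor: float = 1.0,
                             grid_time_factor: float = 1.0) -> float:
    """Scale a project's storage share by result-size and grid-time changes."""
    return base_pct * size_factor * grid_time_factor


# Feeder example: about 50 requests arrive per 5-second refill in both cases.
before = results_sent_per_refill(slots=100, requests=50)
after = results_sent_per_refill(slots=10, requests=50)
feeder_ratio = after / before  # fraction of the earlier throughput

# Storage example: project B's result files grow from 10MB to 100MB (10x),
# while project A doubles its grid time (from half the grid to all of it).
project_b_pct = steady_state_storage_pct(65, size_factor=10)
project_a_pct = steady_state_storage_pct(10, grid_time_factor=2)

print(f"feeder throughput ratio: {feeder_ratio:.2f}")
print(f"project B storage use: {project_b_pct:.0f}%")
print(f"project A storage use: {project_a_pct:.0f}%")
```

The feeder part shows why 1/10th of the slots gives 1/5th of the throughput rather than 1/10th: only half of the original slots were being claimed per refill in the first place.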
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Uplinger,
Thanks a lot for the update. Your assignment of limited resources is sound and your patient & diplomatic posts are much appreciated. Best of luck resolving this resource issue. |
||
|
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline Project Badges:
|
I have re-enabled MCM just now. Again, this will mean that only reliable hosts will be getting resent work for the time being. You may see messages like, 'No work available for Mapping Cancer Markers' or 'Work is available but assigned to different host type'.
Also, we are in discussions with the researchers on how to prevent this from being a problem in the future. They will be taking a look at the batches that were larger than usual to see if they can prevent them from happening in the future. Thank you for your patience, -Uplinger |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Excellent news, Keith. May I once again congratulate you and your colleagues for working so diligently to resolve this with minimum impact on the volunteers. Thank you!
|
||
|
|
cjslman
Master Cruncher Mexico Joined: Nov 23, 2004 Post Count: 2082 Status: Offline Project Badges:
|
Keith... thanks for the info (along with the excellent diplomatic tone). Appreciate your and your team's dedication.
---------------------------------------- .CJSL Gotta keep crunching... there's a world to save !!! |
||
|
|
gb009761
Master Cruncher Scotland Joined: Apr 6, 2005 Post Count: 3010 Status: Offline Project Badges:
|
Excellent job well done - all my 4 cores are busy again
|
||
|
|
Jim1348
Veteran Cruncher USA Joined: Jul 13, 2009 Post Count: 1066 Status: Offline Project Badges:
|
"... they have no right to downgrade the MCM1 just because they cannot handle the response from the participants"
Maybe if you give it some thought you can find a better reason? |
||
|
|
tmedve
Senior Cruncher USA Joined: Nov 16, 2004 Post Count: 191 Status: Offline Project Badges:
|
Thanks for the hard work you all put into getting things running again. My computer is busily crunching.
|
||
|
|
Deluxe_Cabinets_And_Granite
Veteran Cruncher Joined: Oct 27, 2008 Post Count: 939 Status: Offline Project Badges:
|
Thanks for all your hard work over the past few days, Keith!
|
||
|
|
|