Thread Status: Active
Total posts in this thread: 7
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Database Maintenance - No work sent or received on Dec 13 from 16:00 UTC - 20:00 UTC [COMPLETED]

Maybe there was already a 'discuss here' issues thread raised for this somewhere; if so, I apologize for not checking. The good thing about this one http://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,35973 compared to a full feeder/scheduler shutdown or a no-disk-space outage is that the result file upload server has continued to operate so far. That means that once the servers are opened again, the only thing the clients need to do is fetch new work, if needed, and in the same flow clear the 'ready to report' results. Then of course the validators will be doing a little overtime... to hopefully catch up by midnight.

(Counting 10 Mississippi, 11 Mississippi: at 1:15 into the maintenance Kevin wrote in to say we were about 25% complete... that'd make it 5 hours on the abacus... It could be a little later, or he started the maint job half an hour after he wrote he'd started, in which case the job could finish early.)
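For anyone who wants to redo the abacus sum, here is a minimal sketch of that estimate. It assumes progress is linear in time, which may well not hold for a maintenance job; the function name `linear_eta_minutes` is made up for illustration.

```python
def linear_eta_minutes(elapsed_minutes: float, fraction_complete: float) -> float:
    """Project the total job duration, assuming progress is linear in time."""
    if not 0 < fraction_complete <= 1:
        raise ValueError("fraction_complete must be in (0, 1]")
    return elapsed_minutes / fraction_complete


# 1:15 (75 minutes) into the maintenance, at about 25% complete.
total = linear_eta_minutes(75, 0.25)
print(f"projected total: {total / 60:.1f} hours")  # projected total: 5.0 hours
```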

Anyway, if you wish to discuss, put it below here, after these messages...

http://www.youtube.com/watch?v=TIcUs-6yW78

Crunch on!
----------------------------------------
[Edit 1 times, last edit by Former Member at Dec 14, 2013 10:55:23 AM]
[Dec 13, 2013 5:37:03 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: Database Maintenance - No work sent or received on Dec 13 from 16:00 UTC - 20:00 UTC

My estimates weren't quite linear ;) The system is available now.
----------------------------------------
[Edit 1 times, last edit by knreed at Dec 13, 2013 7:18:59 PM]
[Dec 13, 2013 7:18:44 PM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: Database Maintenance - No work sent or received on Dec 13 from 16:00 UTC - 20:00 UTC

The database likes to be cleaned up from time to time:

num_results  yrs_runtime_returned  minutes_since_returned
       4394                  1.31                       1
       4623                  1.42                       2
       5463                  1.76                       3
       5565                  1.73                       4
       5319                  1.64                       5
       6515                  2.01                       6
       6262                  1.79                       7
       6097                  1.95                       8
       6174                  1.97                       9


These are the per-minute results being reported back. Very few devices are getting high-load messages during the recovery, which is due to the database performing much better.
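To put those per-minute figures in perspective, here is a small sketch that totals them. The numbers are copied from the table in this post; the `summarize` helper is just an illustration, not anything the project runs.

```python
# (num_results, yrs_runtime_returned, minutes_since_returned), copied from the post.
ROWS = [
    (4394, 1.31, 1),
    (4623, 1.42, 2),
    (5463, 1.76, 3),
    (5565, 1.73, 4),
    (5319, 1.64, 5),
    (6515, 2.01, 6),
    (6262, 1.79, 7),
    (6097, 1.95, 8),
    (6174, 1.97, 9),
]


def summarize(rows):
    """Return (total results, total years of runtime, mean results per minute)."""
    total_results = sum(r[0] for r in rows)
    total_years = round(sum(r[1] for r in rows), 2)
    return total_results, total_years, total_results / len(rows)


results, years, per_minute = summarize(ROWS)
print(f"{results} results, {years} years of runtime, {per_minute:.0f} results/minute")
```

Over the nine minutes listed, that comes to 50412 results and 15.58 years of runtime reported back.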

For those who are more technically minded, we ran the 'Optimize table' command against the 'Result' and 'Workunit' tables. These are two of the larger tables in the database and are subject to significant inserts and deletes.
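As a rough illustration of what such a command looks like, the sketch below builds the statement text. It assumes a MySQL-style database, where `OPTIMIZE TABLE` is a real statement; the post does not name the database engine, and the helper `build_optimize_statement` is hypothetical. It only builds a string and does not connect to anything.

```python
def build_optimize_statement(tables):
    """Build a MySQL-style OPTIMIZE TABLE statement for the given table names."""
    if not tables:
        raise ValueError("at least one table name is required")
    # Backtick-quote each identifier, doubling any embedded backtick.
    quoted = ", ".join("`" + name.replace("`", "``") + "`" for name in tables)
    return f"OPTIMIZE TABLE {quoted};"


# Table names as written in the post; the real schema may spell them differently.
statement = build_optimize_statement(["Result", "Workunit"])
print(statement)  # OPTIMIZE TABLE `Result`, `Workunit`;
```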
----------------------------------------
[Edit 2 times, last edit by knreed at Dec 13, 2013 9:10:04 PM]
[Dec 13, 2013 7:27:49 PM]
Eric_Kaiser
Veteran Cruncher
Germany (Hessen)
Joined: May 7, 2013
Post Count: 1047
Status: Offline
Re: Database Maintenance - No work sent or received on Dec 13 from 16:00 UTC - 20:00 UTC

Good work. Thanks.
----------------------------------------

[Dec 13, 2013 8:03:00 PM]
cjslman
Master Cruncher
Mexico
Joined: Nov 23, 2004
Post Count: 2082
Status: Offline
Re: Database Maintenance - No work sent or received on Dec 13 from 16:00 UTC - 20:00 UTC

Great Job... it's good to see the results so soon.

CJSL

Crunching for a better world...
----------------------------------------
I follow the Gimli philosophy: "Keep breathing. That's the key. Breathe."
Join The Cahuamos Team


[Dec 13, 2013 11:58:01 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Database Maintenance - No work sent or received on Dec 13 from 16:00 UTC - 20:00 UTC [Completed]

Most impressively, going by the stats, the afternoon numbers with the 3.5-hour outage matched the morning numbers... 266 v 265. Yet I (moi) had a boatload of wingmen coming in during the early hours of Saturday morning. One cause could be client bounces [the deferral counter after multiple retries in vain during the outage], another the BETA 7.28 pushed ahead by testers. I've still got 2 running, making progress, into their 27th hour and at about 90%. Those are holding up 2 other jobs that are halfway through [both have a pre-assigned wingman]. They'll get their turn today :D

Kevin, when will there be a controlled back-off time sent, so that clients making their first connect during an outage are told to stay away until at least a certain hour [together with the maint message "maintenance underway, your client will reconnect when expected to complete in nnn minutes"]? E.g. up that 182 seconds to n hours, and decrement that value as the optimization progresses. Maybe it's not worth the effort for a 4-hour programmed outage [I think you wrote some years ago about the exponential effect the longer it takes], but I'm asking in case it is of interest.
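To make the suggestion concrete, here is a minimal sketch of how such a back-off could be computed on the server side. Everything in it is an assumption for illustration: the function name `maintenance_backoff_seconds`, the idea that the server knows an expected end time, and the example timestamps. Only the 182-second default comes from the post above.

```python
from datetime import datetime, timezone

DEFAULT_BACKOFF_SECONDS = 182  # the retry delay mentioned in the post


def maintenance_backoff_seconds(now, expected_end, default=DEFAULT_BACKOFF_SECONDS):
    """Delay to hand a client that connects during a planned outage.

    The delay shrinks as the expected end approaches and never drops
    below the normal default back-off.
    """
    remaining = (expected_end - now).total_seconds()
    return max(default, int(remaining))


# Hypothetical outage expected to end at 20:00 UTC on Dec 13, 2013.
end = datetime(2013, 12, 13, 20, 0, tzinfo=timezone.utc)
early = datetime(2013, 12, 13, 16, 30, tzinfo=timezone.utc)
late = datetime(2013, 12, 13, 19, 59, tzinfo=timezone.utc)

print(maintenance_backoff_seconds(early, end))  # 12600 (3.5 hours)
print(maintenance_backoff_seconds(late, end))   # 182 (falls back to the default)
```

A real implementation would probably also want some random spread in the delay so that all clients don't come back in the same minute, but that is beyond this sketch.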

Crunch On.
[Dec 14, 2013 10:54:55 AM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: Database Maintenance - No work sent or received on Dec 13 from 16:00 UTC - 20:00 UTC [Completed]

It would be nice to add, but we haven't had the opportunity to prioritize the task. At the moment the list is getting longer, so it will be some time before this can be implemented.
[Dec 17, 2013 5:57:44 PM]