Thread Status: Active
Total posts in this thread: 19
This topic has been viewed 3478 times and has 18 replies.
Steve W
Advanced Cruncher
Joined: Dec 9, 2005
Post Count: 110
Status: Offline
Re: Planned System Maintenance - 6th & 12th July

Reminder that there is system maintenance this weekend, on the 12th, which could last up to 20 hours.

Details here.
[Jul 11, 2014 10:24:07 AM]
branjo
Master Cruncher
Slovakia
Joined: Jun 29, 2012
Post Count: 1892
Status: Offline
Re: Planned System Maintenance - 6th & 12th July

Thanks for the reminder.

Cheers
----------------------------------------

Crunching@Home since January 13 2000. Shrubbing@Home since January 5 2006

[Jul 11, 2014 11:44:45 AM]
cjslman
Master Cruncher
Mexico
Joined: Nov 23, 2004
Post Count: 2082
Status: Offline
Re: Planned System Maintenance - 6th & 12th July

Thanks Steve for the reminder / heads up.

CJSL

Crunching for a better world...
----------------------------------------
I follow the Gimli philosophy: "Keep breathing. That's the key. Breathe."
Join The Cahuamos Team


[Jul 11, 2014 12:02:46 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Planned System Maintenance - 6th & 12th July

Principally running all projects with zero share so as not to get any buffered work, just enough to occupy the active cores. For the outage I upped the share to 0.1 and increased the buffer to 1 day. What this demonstrated, without touching any buttons, is that the limit is 30 work units per call, not 3 as someone was complaining, and that with a back-off interval of 2:01 minutes between calls a total of 90 was received. The censored log extract below shows that initially 674k seconds were requested and 152k seconds were sent. Then 523k seconds were requested and 148251 seconds were sent, for another 30 tasks, and finally 375k seconds were requested and honored with 342k seconds, for another 30 tasks. With this being mcm only, how the estimated runtimes managed to swing that much within 4 minutes is one of the wonders of boinc.

2751017 Default

10215 World Community Grid 7/11/2014 3:32:24 PM Sending scheduler request: To fetch work.
10216 World Community Grid 7/11/2014 3:32:24 PM Requesting new tasks for CPU
10217 World Community Grid 7/11/2014 3:32:24 PM [sched_op] CPU work request: 674267.70 seconds; 0.00 devices
10218 World Community Grid 7/11/2014 3:32:29 PM Scheduler request completed: got 30 new tasks
10219 World Community Grid 7/11/2014 3:32:29 PM [sched_op] Server version 701
10220 World Community Grid 7/11/2014 3:32:29 PM Project requested delay of 121 seconds
10221 World Community Grid 7/11/2014 3:32:29 PM [sched_op] estimated total CPU task duration: 152015 seconds
10285 World Community Grid 7/11/2014 3:34:34 PM [sched_op] Starting scheduler request
10286 World Community Grid 7/11/2014 3:34:34 PM Sending scheduler request: To fetch work.
10287 World Community Grid 7/11/2014 3:34:34 PM Requesting new tasks for CPU
10288 World Community Grid 7/11/2014 3:34:34 PM [sched_op] CPU work request: 523022.55 seconds; 0.00 devices
10289 World Community Grid 7/11/2014 3:34:39 PM Scheduler request completed: got 30 new tasks
10290 World Community Grid 7/11/2014 3:34:39 PM [sched_op] Server version 701
10291 World Community Grid 7/11/2014 3:34:39 PM Project requested delay of 121 seconds
10292 World Community Grid 7/11/2014 3:34:39 PM [sched_op] estimated total CPU task duration: 148251 seconds
10293 World Community Grid 7/11/2014 3:34:39 PM [sched_op] Deferring communication for 00:02:01
10294 World Community Grid 7/11/2014 3:34:39 PM [sched_op] Reason: requested by project
10357 World Community Grid 7/11/2014 3:36:44 PM [sched_op] Starting scheduler request
10358 World Community Grid 7/11/2014 3:36:44 PM Sending scheduler request: To fetch work.
10359 World Community Grid 7/11/2014 3:36:44 PM Requesting new tasks for CPU
10360 World Community Grid 7/11/2014 3:36:44 PM [sched_op] CPU work request: 375404.12 seconds; 0.00 devices
10361 World Community Grid 7/11/2014 3:36:48 PM Scheduler request completed: got 30 new tasks
10362 World Community Grid 7/11/2014 3:36:48 PM [sched_op] Server version 701
10363 World Community Grid 7/11/2014 3:36:48 PM Project requested delay of 121 seconds
10364 World Community Grid 7/11/2014 3:36:48 PM [sched_op] estimated total CPU task duration: 341573 seconds
10365 World Community Grid 7/11/2014 3:36:48 PM [sched_op] Deferring communication for 00:02:01
10366 World Community Grid 7/11/2014 3:36:48 PM [sched_op] Reason: requested by project
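
To put numbers on that swing, here is a minimal Python sketch that pairs each work request in the extract with what the scheduler sent back. It assumes only the message wording visible in the log above; the function name `summarize` and the trimmed `LOG` text are made up for illustration and are not part of any BOINC tool.

```python
import re

# Message text copied from the extract above (timestamps and sequence numbers trimmed).
LOG = """\
[sched_op] CPU work request: 674267.70 seconds; 0.00 devices
Scheduler request completed: got 30 new tasks
[sched_op] estimated total CPU task duration: 152015 seconds
[sched_op] CPU work request: 523022.55 seconds; 0.00 devices
Scheduler request completed: got 30 new tasks
[sched_op] estimated total CPU task duration: 148251 seconds
[sched_op] CPU work request: 375404.12 seconds; 0.00 devices
Scheduler request completed: got 30 new tasks
[sched_op] estimated total CPU task duration: 341573 seconds
"""

REQ = re.compile(r"CPU work request: ([\d.]+) seconds")
GOT = re.compile(r"got (\d+) new tasks")
EST = re.compile(r"estimated total CPU task duration: ([\d.]+) seconds")


def summarize(log_text):
    """Return one dict per scheduler call: seconds requested, tasks received, seconds sent."""
    calls = []
    for line in log_text.splitlines():
        if m := REQ.search(line):
            # A work request line opens a new scheduler call.
            calls.append({"requested": float(m.group(1))})
        elif (m := GOT.search(line)) and calls:
            calls[-1]["tasks"] = int(m.group(1))
        elif (m := EST.search(line)) and calls:
            calls[-1]["sent"] = float(m.group(1))
    return calls


calls = summarize(LOG)
for c in calls:
    per_task = c["sent"] / c["tasks"]
    print(f"requested {c['requested']:.0f} s, got {c['tasks']} tasks, "
          f"{c['sent']:.0f} s estimated ({per_task:.0f} s per task)")
```

Dividing the estimated duration by the 30 tasks of each call gives roughly 5067, 4942 and 11386 seconds per task, so the third batch was estimated at more than double the first two.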

After some agent rumbling (version 7.3.18), I now have 1:12 days per thread, but given the short batches in between, that is probably not too much to bridge the outage. The agent has 24 hours to figure it out and either increase or decrease before the 'project is offline for maintenance'.

Will set the buffer back to 0.00 as soon as the project goes offline. If that then is not enough until wcg returns, other backup projects will start being asked. Of course, if the 90 complete before the return, no more work will be requested from wcg anyhow, since the number of outstanding uploads will disable work requests until the uploads succeed or the number of open uploads gets below computing threads times factor 2.
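
The upload rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this post (work requests blocked until open uploads drop below threads times 2); the function name `may_request_work` and its parameters are hypothetical and not taken from the BOINC client source.

```python
def may_request_work(pending_uploads: int, threads: int, factor: int = 2) -> bool:
    """Return True when open uploads are below threads * factor.

    Assumption: the threshold is strict ("gets below"), as worded in the post.
    """
    return pending_uploads < threads * factor


# Example with a hypothetical 8-thread host: the limit is 8 * 2 = 16 open uploads.
print(may_request_work(90, 8))  # all 90 results waiting to upload
print(may_request_work(15, 8))  # backlog has drained below the limit
```

With all 90 results stuck in the upload queue the check fails, and it only passes again once fewer than 16 uploads remain on that example host.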

my 2 bolivars
[Jul 11, 2014 2:04:08 PM]
uplinger
Former World Community Grid Tech
Joined: May 23, 2005
Post Count: 3952
Status: Offline
Re: Planned System Maintenance - 6th & 12th July

A little bit of an update. The change will cause various parts of the website to go down. The main part which everyone is concerned about is the downloading and uploading of new work. This will be down the longest. The website is probably going to be down for the time it takes to reboot a server, so probably 15 minutes during this entire maintenance window.

Basically if you don't want your computers to run dry, I would suggest bumping up your caches to 1 day.
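
For anyone who prefers a local override file to the website preferences, here is a rough Python sketch that builds a 1-day cache setting. It assumes the element names `global_preferences`, `work_buf_min_days` and `work_buf_additional_days` as used in BOINC's `global_prefs_override.xml`; please check them against the client documentation for your version. The helper name `build_override` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_override(min_days: float, extra_days: float = 0.0) -> str:
    """Build the text of a preferences override with the given work buffer sizes.

    Assumption: element names follow BOINC's global_prefs_override.xml.
    """
    root = ET.Element("global_preferences")
    ET.SubElement(root, "work_buf_min_days").text = f"{min_days:.2f}"
    ET.SubElement(root, "work_buf_additional_days").text = f"{extra_days:.2f}"
    return ET.tostring(root, encoding="unicode")


# A 1-day minimum buffer with no additional days on top.
override_xml = build_override(1.0)
print(override_xml)
```

The sketch only produces the XML text; where the file lives and how the client is told to re-read it depends on the installation.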

I will try to keep everyone up to date on how things are progressing as this is a major outage, but in the long run will make our system run smoother.

Thanks,
-Uplinger
[Jul 11, 2014 3:11:10 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Planned System Maintenance - 6th & 12th July

Thanks for the info/advice, Keith.
[Jul 11, 2014 3:23:48 PM]
branjo
Master Cruncher
Slovakia
Joined: Jun 29, 2012
Post Count: 1892
Status: Offline
Re: Planned System Maintenance - 6th & 12th July

During the latest outage I got this type of error (no runtime, just came in and went out):

Result Name: MCM1_0005658_0023_0--


<core_client_version>7.0.65</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
<file_name>MCM1_0005658_0023_MCM1_0005658_0023.txt</file_name>
<error_code>-224</error_code>
<error_message>permanent HTTP error</error_message>
</file_xfer_error>

</message>
]]>

----------------------------------------

Crunching@Home since January 13 2000. Shrubbing@Home since January 5 2006

----------------------------------------
[Edit 1 times, last edit by branjo at Jul 11, 2014 7:44:50 PM]
[Jul 11, 2014 7:43:16 PM]
uplinger
Former World Community Grid Tech
Joined: May 23, 2005
Post Count: 3952
Status: Offline
Re: Planned System Maintenance - 6th & 12th July

Branjo, that sounds like you got unlucky and tried to download the file when the http server was offline. There was about a 15 minute window when the http servers were completely offline. I would doubt it was a file that was downloaded partially because those txt files are relatively small if my memory serves me correctly.

We will be gracefully shutting down the http servers this time for the change; the main website should stay up.

FYI, the change window will start in a little over 1 hour. Fill your buffers.

Thanks,
-Uplinger
[Jul 11, 2014 11:48:40 PM]
branjo
Master Cruncher
Slovakia
Joined: Jun 29, 2012
Post Count: 1892
Status: Offline
Re: Planned System Maintenance - 6th & 12th July

Thanks uplinger - I had enough WUs in the queue, so it was not a problem for me.

Cheers
----------------------------------------

Crunching@Home since January 13 2000. Shrubbing@Home since January 5 2006

[Jul 12, 2014 7:15:27 PM]