Total posts in this thread: 352
RCC_Survivor
Veteran Cruncher
USA
Joined: Apr 28, 2007
Post Count: 1337
Status: Offline
Re: Server Errors.


So what do we crunchers do in the meantime: limit our sync-to-WCG to once every 24 hours? That would seem to be a direct way to regulate the traffic and thereby help ease congestion. Or do we crunchers have to play it by ear?


andzgrid,
This is a good suggestion, and I would be glad to set my network access in Preferences to a time period when server load was at a minimum.

SekeRob or knreed,
Do you have any server load info that would help determine the hours when server load is heaviest and lightest?

I do not mind running a 2-day queue with network access restricted to a few hours a day. I have lost a lot of WUs since last month and will do whatever it takes to reduce the losses.

Am I wrong or did this problem start after they did some recent software upgrades?
----------------------------------------
Be kinder than necessary, for everyone you meet is fighting some battle.

Please join the team The survivors
Bilateral Renal, Melanoma, and Squamous Cell cancers
[Jul 27, 2012 11:18:17 PM]
BSD
Senior Cruncher
Joined: Apr 27, 2011
Post Count: 224
Status: Offline
Re: Server Errors.

My devices are playing in someone else's yard for a while. Hope the techs get it all sorted out. They are a busy bunch. Cheers
[Jul 28, 2012 12:04:22 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Server Errors.

SekeRob or knreed,
Do you have any server load info that would help determine the hours when server load is heaviest and lightest?

I do not mind running a 2-day queue with network access restricted to a few hours a day. I have lost a lot of WUs since last month and will do whatever it takes to reduce the losses.

Am I wrong or did this problem start after they did some recent software upgrades?

Not likely to happen, but in the US of A / GB, where I estimate most of the Update / Retry Now hammerers to be, just follow the position of the sun and the moon and you have a pretty good idea of when the bulk of the button operators are not watching, so you can sneak in those uploads. Yesterday had 305, the day before 455, and the days before that 1, 10, 14, so I think things in autonomous mode are doing pretty well with the [fuzzy logic] scheme knreed has devised to respond to momentary overloads. Sure enough, there are random back-offs when things are too busy, but so far they all seem to recover after the deferral countdowns have run down one, two, or three times. Keep a fair cache of 1.0 days and you are never dry for a moment.

In all this, lost zero tasks.
[Jul 28, 2012 7:10:35 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Server Errors.

... and would be glad to set my network access in Preferences to a time period when server load was at a minimum.
Going forward, we first need to take the above manual step, if only as preparation for the next step: WCG-server to WCG-clientMachine M2M (machine-to-machine) communications. The data in a serverStatus (webpage) would facilitate the transition.
[Jul 28, 2012 4:16:43 PM]
RCC_Survivor
Veteran Cruncher
USA
Joined: Apr 28, 2007
Post Count: 1337
Status: Offline
Re: Server Errors.

SekeRob or knreed,
Do you have any server load info that would help determine the hours when server load is heaviest and lightest?

I do not mind running a 2-day queue with network access restricted to a few hours a day. I have lost a lot of WUs since last month and will do whatever it takes to reduce the losses.

Am I wrong or did this problem start after they did some recent software upgrades?

Not likely to happen, but in the US of A / GB, where I estimate most of the Update / Retry Now hammerers to be, just follow the position of the sun and the moon and you have a pretty good idea of when the bulk of the button operators are not watching, so you can sneak in those uploads. Yesterday had 305, the day before 455, and the days before that 1, 10, 14, so I think things in autonomous mode are doing pretty well with the [fuzzy logic] scheme knreed has devised to respond to momentary overloads. Sure enough, there are random back-offs when things are too busy, but so far they all seem to recover after the deferral countdowns have run down one, two, or three times. Keep a fair cache of 1.0 days and you are never dry for a moment.

In all this, lost zero tasks.


(Edit - removed reference because of SekeRob's comment "not a good idea to mention".)

There are problems after sundown.
Based on a lack of info I will use SWAG and experiment until I get it right.
Shouldn't have to do this.

I am really surprised that the servers do not have a load balance/control system and the techs have to improvise.

Again, I think this problem started after the server/filesystem upgrades/updates in June.
It is difficult to troubleshoot intermittent problems when multiple changes were made in a short period.
Been there, done that.
When we had a problem, it was discussed on the evening news and in the next day's newspaper, so we met or exceeded our 99.99% uptime target because our paychecks were on the line.
If there was an outage we were required to write letters to upper management explaining why it happened and what we would do to prevent it in the future.
So I understand the difficulty and pressure involved in resolving service problems.
----------------------------------------
[Edit 1 times, last edit by RCC_Survivor at Jul 29, 2012 7:14:52 PM]
[Jul 28, 2012 7:33:07 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Server Errors.

Not a good idea to mention that, and it has become largely superfluous, particularly for those who do scheduled networking and crunching, and for a few more reasons [client 7.0.xx] ;), but *no*, this is separate from the upload saturation issue... the 1st part of the upload/reporting cycle. The RtR is part 2.

It needs no repeating that something in the upgrade path on the server side kicked this off, and IBM is now working intensely with the likes of Red Hat and hosting support to get to the root of the issue.
[Jul 28, 2012 7:50:04 PM]
RCC_Survivor
Veteran Cruncher
USA
Joined: Apr 28, 2007
Post Count: 1337
Status: Offline
Re: Server Errors.

Changed the queue to 2.0 days and limited network access to 00:00-06:00 EDT (UTC-4), and experienced problems at 02:57 and 03:43 on separate PCs.
There may not be a time period that is free from the problem.
I feel I am having a "Popeil moment" and will "set it and forget it" as there are bigger fish to fry.
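
For anyone who wants to check their own timestamps against a restricted window like this, here is a minimal sketch. It assumes a fixed UTC-4 offset for EDT, and the function name `in_network_window` is made up for illustration; it is not a BOINC setting or API.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: EDT treated as a fixed UTC-4 offset, as stated in the post.
EDT = timezone(timedelta(hours=-4), "EDT")


def in_network_window(moment, start=time(0, 0), end=time(6, 0), tz=EDT):
    """Return True if the timezone-aware `moment` falls inside [start, end) local time."""
    local = moment.astimezone(tz).time()
    if start <= end:
        return start <= local < end
    # Window wraps past midnight, e.g. 22:00-02:00.
    return local >= start or local < end


# 06:57 UTC is 02:57 EDT, one of the failure times reported above.
failure = datetime(2012, 7, 29, 6, 57, tzinfo=timezone.utc)
print(in_network_window(failure))
```

Both reported failure times (02:57 and 03:43 EDT) fall inside the 00:00-06:00 window, which fits the observation that no time period seems free of the problem.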
[Jul 29, 2012 7:33:19 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Server Errors.

Hello RCC_Survivor,

https://secure.worldcommunitygrid.org/forums/...d,33316_offset,180#385932 and https://secure.worldcommunitygrid.org/forums/wcg/viewthread_thread,33481 imply that there is no schedule. They isolate the file system on an as-required basis.

Lawrence
[Jul 29, 2012 10:53:12 PM]
Bearcat
Master Cruncher
USA
Joined: Jan 6, 2007
Post Count: 2803
Status: Offline
Re: Server Errors.

Think it's best to let BOINC do its thing and let the servers tell our computers when to check back. Seems to work itself out sooner or later.
----------------------------------------
Crunching for humanity since 2007!

[Jul 30, 2012 2:42:29 AM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504
Status: Offline
Re: Server Errors.

The occurrence of this issue does not appear to correlate with load. As a result, we have not been able to predict when it will occur. We have been doing a lot of data collection this weekend that will allow for further analysis to hopefully find some clues to what is going on.

We have also figured out how to quickly detect that the issue is occurring in some of our backend processes, such as the applications that we use to load new work into BOINC for distribution. We are using this to cause processes that are not volunteer-facing to 'back off' so that the system recovers quickly. We hope that this will significantly reduce the times when you are not able to upload/download work. We put this in place about 2 hours ago and we are watching to see what happens.
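
For illustration only, the general shape of such a back-off might look like the sketch below. This is not WCG's actual code, which is not described in this thread. The names `backoff_delay`, `run_backend_jobs`, and `issue_detected`, as well as the exponential delay with jitter and the 5 s / 300 s values, are all assumptions chosen for the example.

```python
import random
import time
from typing import Callable, Iterable, List


def backoff_delay(attempt: int, base: float = 5.0, cap: float = 300.0) -> float:
    """Exponential delay in seconds for a 0-based attempt number, capped at `cap`."""
    return min(cap, base * (2 ** attempt))


def run_backend_jobs(
    jobs: Iterable[Callable[[], None]],
    issue_detected: Callable[[], bool],  # hypothetical health probe
    sleep: Callable[[float], None] = time.sleep,
    jitter: Callable[[], float] = random.random,
) -> List[float]:
    """Run each job, pausing with growing delays while the probe reports trouble.

    Returns the list of delays that were slept, which is handy for logging.
    """
    delays = []
    for job in jobs:
        attempt = 0
        while issue_detected():
            # Scale the delay to 50-100% of its nominal value so that
            # several backend processes do not all wake up together.
            delay = backoff_delay(attempt) * (0.5 + 0.5 * jitter())
            sleep(delay)
            delays.append(delay)
            attempt += 1
        job()
    return delays
```

The idea is that only non-volunteer-facing work yields when the probe fires, leaving capacity for uploads and downloads until the system recovers.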
[Jul 30, 2012 3:04:47 AM]