World Community Grid - View Thread

World Community Grid Forums

Category: Support

Forum: BOINC Agent Support

Thread: Is the server down?


Thread Status: Active
Total posts in this thread: 17


This topic has been viewed 1689 times and has 16 replies

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Is the server down?

Hi,

Since yesterday I keep getting errors saying there is no response from the server at all, no headers or anything. It started halfway through uploading some results yesterday. Is anyone else getting no response from the server like this?

[Feb 2, 2010 9:50:21 AM]

toss
Senior Cruncher
New Zealand
Joined: Jan 3, 2007
Post Count: 220
Status: Offline


Re: Is the server down?

https://secure.worldcommunitygrid.org/forums/wcg/viewthread_thread,28397

[Feb 2, 2010 9:56:59 AM]

Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline


Re: Is the server down?

Yesterday, could have been just before midnight depending on where you are on the planet. See the various other threads, What Happened, Server Status and the Outage message by knreed... the whole WCG net operation was off-line for over 6 hours. If you have files to upload in the Transfers window, select one and hit the Retry Now button.

The nightly statistics update ran perfectly while off-line, but lots of job reports did not make it in, so they will be included today.

----------------------------------------

WCG

Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!

[Feb 2, 2010 9:58:35 AM]

jmcgaw
Advanced Cruncher
US
Joined: Feb 2, 2007
Post Count: 54
Status: Offline


Re: Is the server down?

Yeah, I got bitten by the same thing. :'(

Even worse, my most productive machine "BEAST" is in its little room in the basement and only gets looked at occasionally using remote desktop. When I looked it was saying that all the tasks had been done, none were running, and that communications were deferred for 6+ hours. The latter seems to be a truly dumb thing to write into a program. Maybe defer for 10 minutes or for some short random period but six hours is entirely too much. BEAST can chew through 8-10 work units in that time. My two other fast machines were at least not stuck with 6 hours -- just 3 which is at least better.

I did jack up my queue of spare work units to allow a bit of work to get done in the future if things go wonky again. 1 day now. Maybe it should be longer than that though.

Maybe some mechanism needs to be added to have the clients notify us by email if things like this go wrong. Should be doable -- other programs manage it.

[Feb 3, 2010 1:17:58 AM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: Is the server down?

I have 6 machines that are down until I can get to them. It's a shame to have 10 cores down for no discernible or preventable reason.

[Feb 3, 2010 5:19:49 AM]

Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline


Re: Is the server down?

What client version? The deferrals are incremental, so if there was a curve-ball, 6 hours is possible when contacting the server proved impossible.

The price to the grid was dear. Tuesday morning came in at 115 years instead of an estimated ~155 years, on top of an estimated shortfall of 25 years for Monday (the failure started on Monday). The afternoon session came back to 142, another 12-13 short, so in all 78+ years are gone, irredeemably. So with better logic in the upcoming client, and bits like the zero-weight backup project function to prevent a client going truly idle, it's up to the techs and the developers to see what else needs tweaking. Certainly if projects have hundreds of thousands of devices active and coming back up, you'd not want them all to connect within the hour of coming online again... that would cause another crash for sure.

edit: many of the deferred contacts reported in on Wednesday morning, turning in an extra 22 CPU years over last week's same-day morning session: 178 now versus 156 last. Maybe there's another little kick this afternoon from those that had been running on cache.
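For anyone who wants to check the sums, here is a small Python tally of the figures quoted in this post. The variable names are made up for illustration, all inputs are the estimates given above (in CPU years), and the "12-13" afternoon shortfall is taken at its upper value of 13, which is an assumption.

```python
# Figures quoted in the post, in CPU years. All of them are estimates.
tuesday_morning_expected = 155
tuesday_morning_actual = 115
monday_shortfall = 25
tuesday_afternoon_shortfall = 13  # quoted as "12-13"; upper value assumed here

# Shortfall for the Tuesday morning session alone.
tuesday_morning_shortfall = tuesday_morning_expected - tuesday_morning_actual

# Total lost across Monday, Tuesday morning and Tuesday afternoon.
total_shortfall = (monday_shortfall
                   + tuesday_morning_shortfall
                   + tuesday_afternoon_shortfall)

# Extra work reported on Wednesday morning versus the same session last week.
wednesday_gain = 178 - 156

print(tuesday_morning_shortfall)  # 40
print(total_shortfall)            # 78
print(wednesday_gain)             # 22
```

With the lower afternoon value of 12 the total would be 77, which is roughly where the "78+" figure comes from.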

----------------------------------------
[Edit 1 times, last edit by Sekerob at Feb 3, 2010 12:47:44 PM]

[Feb 3, 2010 6:52:31 AM]

Decrypt74
Cruncher
France
Joined: May 20, 2009
Post Count: 36
Status: Offline


Re: Is the server down?

Most of mine were deferred by more than 17h :(

Misconfiguration on my side?

[Feb 3, 2010 9:12:42 AM]

Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline


Re: Is the server down?

Since my client is attached to the internet 24/7, I've set the connect interval to 0.00 days. When I looked in about 8 hours into the crash, the deferral counter was at 31 minutes, but that's with the 6.10.29 alpha client. As WCG is preparing and testing 6.10, I've asked them to look into the logic to see if it needs further tweaking. It may be different, improved over the present 6.2 client that is still the house standard download. I can't find readily digestible documentation that explains the deferral states, how far they increment and so on. In the past I've seen it go over 24 hours if a project continued to be down / unreachable.
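To give a feel for what an "incremental" deferral can look like, here is a minimal Python sketch of generic exponential backoff with random jitter. It is not taken from the BOINC client and does not claim to match its real deferral states. The function name `next_deferral` and the parameters `base_s` and `cap_s`, along with their default values, are assumptions chosen purely for illustration.

```python
import random


def next_deferral(failures, base_s=60.0, cap_s=24 * 3600.0, rng=random):
    """Return a hypothetical wait in seconds after `failures` failed contacts.

    Generic exponential backoff with full jitter. This is an illustration,
    NOT the BOINC client's actual logic.
    """
    if failures < 1:
        return 0.0
    # Double the ceiling on every consecutive failure, but never exceed the cap.
    ceiling = min(cap_s, base_s * 2 ** (failures - 1))
    # Pick a random point below the ceiling so clients do not all retry at once.
    return rng.uniform(0.0, ceiling)


if __name__ == "__main__":
    for n in range(1, 12):
        print(n, round(next_deferral(n) / 60.0, 1), "minutes")
```

The random spread is the part that matters for a big grid: if every device retried on the same fixed schedule, they would all hit the servers together the moment service returned. The cap keeps a long outage from pushing the wait out indefinitely.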

[Feb 3, 2010 9:26:48 AM]

Decrypt74
Cruncher
France
Joined: May 20, 2009
Post Count: 36
Status: Offline


Re: Is the server down?

My clients are also attached 24/7, and I've set the connect interval to 0 and the additional work buffer to 0.05.

My Windows clients range from 6.6.36 to 6.10.18, and I run 6.4.5 on Linux.

[Feb 3, 2010 10:17:20 AM]

Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline


Re: Is the server down?

If you wish to be sure of bridging downtimes, set it back to the default 0.3 days additional buffer (local prefs), or cache on the web device profile. I think many run with 0.5 to 1.00 days and more, judging by the wingmen validations. 1.00 is also the longest period that the client will try to send the Ready to Report jobs, whereas the result files are always uploaded ASAP if there is a ready connection.
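As a rough check of the buffer values mentioned in this thread, the Python sketch below converts a buffer setting in days to hours and compares it with an outage length. The helper names `buffer_hours` and `bridges_outage` are hypothetical, and the comparison assumes the buffered work really lasts as long as the setting suggests, which depends on the client's run-time estimates and on how long contact stays deferred after the servers return.

```python
HOURS_PER_DAY = 24.0


def buffer_hours(days):
    """Convert a work-buffer setting in days to hours."""
    return days * HOURS_PER_DAY


def bridges_outage(buffer_days, outage_hours):
    """True if, in the ideal case, the buffered work outlasts the outage."""
    return buffer_hours(buffer_days) >= outage_hours


if __name__ == "__main__":
    # The thread reports an outage of "over 6 hours"; 6.0 is a lower bound.
    outage_hours = 6.0
    for days in (0.05, 0.3, 1.0):
        print(days, buffer_hours(days), bridges_outage(days, outage_hours))
```

On these assumptions a 0.05 day buffer covers a little over an hour, so it runs dry well before a 6 hour outage ends, while 0.3 days covers about 7 hours and 1.0 day covers 24.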

[Feb 3, 2010 12:09:12 PM]
