| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 44
|
|
| Author |
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
It would appear CEP2 WUs were affected by the outage, as none of my CEP2 WUs were able to upload, nor was I able to receive additional CEP2 WUs.

Though these result files go by exception to Harvard directly, the scheduler that manages this is still inside the WCG daemon, as are task downloads, i.e. nothing at all went to completion for probably about half a day.
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline Project Badges:
|
> But Rick did not mention CEP2 in his post so far.

No CEP2, just c4cw Beta and DDDDDDDT2

> Every time I have watched a retry BOINC was starting to transfer a few files as I described, and then it backed off the whole list without trying for every file in the list.

At least after a while, each file in my queues had its own retry time, and was retried individually. The upload progress indications advanced at rates that corresponded to real data transfers happening. I could not correlate BOINC upload activity with my modem/router's activity LEDs due to other Net traffic.

BOINC 6.2.19, AFAIK the same as official WCG BOINC 6.2.28 but without the WCG logos etc.

(OT: I tried a later version of BOINC once and hated it. They had removed the display of the no. of seconds of work being fetched - something which gives me a useful feel for work cache behaviour. Dumbed down, like Windows ME).

[Edit]: Just noted Ingleside's post re project-wide backoff times in later versions of BOINC. I guess that's the reason for the difference between JmBoullier's and my observations.
----------------------------------------
[Edit 1 times, last edit by Rickjb at Aug 16, 2010 3:38:20 PM] |
||
|
|
JmBoullier
Former Community Advisor Normandy - France Joined: Jan 26, 2007 Post Count: 3716 Status: Offline Project Badges:
|
> v6.10.xx also has a new project-wide backoff, so with 3 transfer errors in a row to the same project (**), the whole project will get a random backoff, and this again will be between 1 minute and 4 hours. This is so that fast, multi-core computers, which can have many file transfers in case of problems, won't be continuously trying a different transfer as the individual file backoffs time out. While a project has a project-wide backoff, any new uploads won't be tried immediately, but will wait for the project-wide backoff to count down.

OK, this explains why I was not seeing the same things as Rickjb: all my devices are 6.10.xx currently and Rick's ones are

Which is not a problem per se, but supporting members with so many different versions around, with almost as many variable behaviors, begins to be very challenging...

Anyway, that's what keeps life funny.

Edit: Rickjb has provided info on his versions while I was writing my post.
----------------------------------------
[Edit 1 times, last edit by JmBoullier at Aug 16, 2010 3:15:21 PM] |
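For readers who want to see how the two levels of backoff described in the quote interact, here is a minimal sketch in Python. It is not BOINC's actual code: the class name `ProjectTransfers`, the uniform random distribution, the choice to give each file the same 1 minute to 4 hour range, and the reset of the error counter after a project-wide backoff are all assumptions made for illustration. Only the "3 errors in a row" trigger and the 1 minute to 4 hour range come from the quote.

```python
import random

MIN_BACKOFF_S = 60                 # 1 minute, as stated in the quote
MAX_BACKOFF_S = 4 * 60 * 60        # 4 hours, as stated in the quote
ERRORS_BEFORE_PROJECT_BACKOFF = 3  # "3 transfer errors in a row"


class ProjectTransfers:
    """Hypothetical tracker for one project's file transfers (illustration only)."""

    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self.consecutive_errors = 0
        self.project_retry_at = 0.0  # 0.0 means no project-wide backoff
        self.file_retry_at = {}      # file name -> earliest retry time

    def _random_backoff(self):
        # Assumption: uniform draw; the real distribution is not given in the thread.
        return self.rng.uniform(MIN_BACKOFF_S, MAX_BACKOFF_S)

    def record_failure(self, name, now):
        # Every failed file gets its own retry time.
        self.file_retry_at[name] = now + self._random_backoff()
        self.consecutive_errors += 1
        # After enough failures in a row, the whole project backs off.
        if self.consecutive_errors >= ERRORS_BEFORE_PROJECT_BACKOFF:
            self.project_retry_at = now + self._random_backoff()
            self.consecutive_errors = 0  # assumption: counter restarts

    def record_success(self, name):
        # A successful transfer clears both the file and the project backoff.
        self.file_retry_at.pop(name, None)
        self.consecutive_errors = 0
        self.project_retry_at = 0.0

    def can_try(self, name, now):
        # A project-wide backoff blocks every file, including new ones.
        if now < self.project_retry_at:
            return False
        return now >= self.file_retry_at.get(name, 0.0)
```

With only the per-file dictionary (the older behaviour Rickjb describes), a machine with many queued files would keep attempting a different file as each timer expires; the `project_retry_at` check is what stops that in this sketch.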
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
> (OT: I tried a later version of BOINC once and hated it. They had removed the display of the no of seconds of work being fetched - something which gives me a useful feel for work cache behaviour. Dumbed down, like Windows ME).

Set some log flags and it will be back. The default log was just 'MEnimalized'

828 World Community Grid 14-08-2010 15:10:16 [sched_op_debug] Starting scheduler request
829 World Community Grid 14-08-2010 15:10:16 Sending scheduler request: To fetch work.
830 World Community Grid 14-08-2010 15:10:16 Requesting new tasks
831 World Community Grid 14-08-2010 15:10:16 [sched_op_debug] CPU work request: 62467.89 seconds; 0.00 CPUs
832 World Community Grid 14-08-2010 15:10:19 Scheduler request completed: got 1 new tasks
833 World Community Grid 14-08-2010 15:10:19 [sched_op_debug] Server version 601
834 World Community Grid 14-08-2010 15:10:19 Project requested delay of 11 seconds
835 World Community Grid 14-08-2010 15:10:19 [sched_op_debug] estimated total CPU job duration: 14356 seconds
WCG
----------------------------------------
Please help to make the Forums an enjoyable experience for All!
[Edit 1 times, last edit by Sekerob at Aug 16, 2010 3:18:09 PM] |
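If you want to track the "seconds of work being fetched" over time rather than eyeball the log, a small script can pull the figure out of the `[sched_op_debug]` lines. This is only a sketch: the function name `work_request_seconds` is made up here, and the pattern is based solely on the line format in the excerpt quoted in the post above, so other client versions may word the line differently.

```python
import re

# Pattern taken from the log excerpt in the post above; other BOINC
# versions may format this line differently (assumption).
WORK_REQUEST_RE = re.compile(
    r"\[sched_op_debug\] CPU work request: ([0-9.]+) seconds"
)


def work_request_seconds(log_lines):
    """Return every CPU work request size (in seconds) found in log_lines."""
    found = []
    for line in log_lines:
        match = WORK_REQUEST_RE.search(line)
        if match:
            found.append(float(match.group(1)))
    return found


sample = [
    "830 World Community Grid 14-08-2010 15:10:16 Requesting new tasks",
    "831 World Community Grid 14-08-2010 15:10:16 [sched_op_debug] "
    "CPU work request: 62467.89 seconds; 0.00 CPUs",
]
print(work_request_seconds(sample))  # prints [62467.89]
```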
||
|
|
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline Project Badges:
|
I repeat my suggestions:
1). During future WCG outages like this one, I think it would be good if the techs could completely disable incoming file transfers. Works for all versions of BOINC, with all settings.
2). If the forum is likely to go offline, don't post status reports in there, as we won't be able to read them. If feasible, plug in a different "server" that displays a simple HTML status report page, and put updates there. Maybe you can just plug an ethernet cable feed into knreed's suitably-configured netbook. Works for all kinds of outages, provided his battery has plenty of kick.

WCG member making suggestions ===> [image] |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Let's see: I've got a crisis, work my butt off on Sunday, have this flash moment whilst 100 other thoughts race through my gray mass to find the cause of the collapse, and think: Let's put this message on the forums, the most commonly known medium, so as soon as it comes up, it will be there... and then there was a message on Facebook and on the Berkeley forums, News on Project Outages.

Most all crashes are unique at WCG and, being IBM, they document and note and put a postmortem report in place, which surely will include your suggestions.

And now for the relaxation: http://www.cartoonstock.com/directory/s/stand_in_line.asp There's one for everyone ;-)
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
JmBoullier
Former Community Advisor Normandy - France Joined: Jan 26, 2007 Post Count: 3716 Status: Offline Project Badges:
|
> 2). If the forum is likely to go offline, don't post status reports in there, as we won't be able to read them.

??? Why this comment? Anybody can check that the forum was available when knreed put his first post in the Known Issues.

07:xx UTC The whole site falls down
14:17 UTC knreed's first post in the Known Issues (KI)
14:21 UTC knreed confirms in his KI thread that the website and the forum are OK
14:23 UTC Jean Pierre. opens this thread
14:45 UTC nasher answers JP
14:52 UTC onward Several other posts after nasher's
15:08 UTC knreed announces in his KI thread that uploads are back
16:12 UTC knreed announces in his KI thread that everything is working again

It seems obvious to me that knreed did not post in the forum while it was likely to fall down, but only once he was pretty sure it was working again. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
> If the forum is likely to go offline, don't post status reports in there, as we won't be able to read them

How can you post to the forum if it is going down? knreed is way smarter than you give him credit for. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello WCG
...
Good job WCG!

> "16:12 UTC knreed announces in his KI thread that everything is working again" - JmBoullier, Community Advisor [Aug 17, 2010 2:53:22 PM] post

Good day ; |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Good job WCG! Nice |
||
|
|
|