| World Community Grid Forums
Thread Status: Active Total posts in this thread: 8
David Autumns
Ace Cruncher UK Joined: Nov 16, 2004 Post Count: 11062 Status: Offline Project Badges:
|
I had a dodgy broadband connection overnight, and BOINC doesn't seem to have recovered gracefully from it.

Some uploads have been stuck at 0% for 5 hours and are not progressing to completion now that I'm back on air. I'm running 5.8.8 clients. Should it recover, and by when?

Just thought you might like to know of this scenario.

Dave

It shouldn't need intervention to get back under way.

[Edit 1 times, last edit by David Autumns at Mar 3, 2007 11:08:22 AM] |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Well Dave, 5.8.8 is a dodgy development build... get 5.8.15. Suspend network activity in the agent completely, exit, upgrade, then re-activate the network; that should put you back on the road.

Get 5.8.16 and it also records the checkpoints in the message log, if you set that in cc_config.xml!
WCG
Please help to make the Forums an enjoyable experience for All! |
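For anyone wanting to try the checkpoint logging mentioned above, here is a minimal sketch of what the cc_config.xml might contain. It assumes the setting meant is BOINC's `checkpoint_debug` log flag inside a `<log_flags>` section; the post does not name the flag, so check the client documentation for your version. The helper `build_cc_config` is a hypothetical name used only for this illustration.

```python
import xml.etree.ElementTree as ET


def build_cc_config(log_flags):
    """Return cc_config.xml text with the given log flags set to 1 or 0.

    Hypothetical helper for illustration only. The flag name passed in
    below (checkpoint_debug) is an assumption about which setting the
    post refers to.
    """
    root = ET.Element("cc_config")
    flags = ET.SubElement(root, "log_flags")
    for name, enabled in log_flags.items():
        ET.SubElement(flags, name).text = "1" if enabled else "0"
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(root, encoding="unicode")


config_xml = build_cc_config({"checkpoint_debug": True})
print(config_xml)
```

The printed text would be saved as cc_config.xml in the BOINC data directory, and the client restarted (or asked to re-read its config file) to pick it up.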
||
|
|
David Autumns
Ace Cruncher UK Joined: Nov 16, 2004 Post Count: 11062 Status: Offline Project Badges:
|
I'll give that a try.

It's just all those potential 5.8.8 boxes on hold that I'm concerned about. If you have some server tucked away in a dark corner somewhere, it should recover and not be taken out of action by a dodgy Internet connection.

I have one box that is not running: after 8 1/2 hours it is still trying to send its one and only completed work unit, and it isn't picking up any more. I'm prepared to leave it as is if some good could come from its current condition. No amount of "Update" button pressing is going to snap it out of it.

As you can see, my Internet connection is restored. Is there another timer that will expire and kick it back into life? |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
It's a small world. I run a low-weighted project that had built up quite a bit of LTD (long-term debt), so it resumed running and finished at about 02:00 AM. Then it apparently could not find the server, went into retry for a while and gave up, logged a 'starting upload' and started clocking ad infinitum. After 6 hours it was just sitting there adding time to the counter. Update button or retry, nothing budged. I suspended network activity and re-activated it, and away it went... this is 5.8.5 (still). The lesson indeed is that remote or unattended boxes could end up not getting any work after a failed contact.

BUT, now the good news for the multi-project crunchers: it just started on the next job for WCG at 02:00 AM, finished it, sent it and pulled new work while this was ongoing. So it seems to be limited to the file that walks into the failed-transmission loop, for a single project only. I read last week in the CVS that from somewhere near 5.8.11 the number of simultaneous transfers for upload and download was increased, so that there would be less chance of one waiting on the other... Message: upgrade.
|
||
|
|
David Autumns
Ace Cruncher UK Joined: Nov 16, 2004 Post Count: 11062 Status: Offline Project Badges:
|
No, that's still not the case. I run all the projects, and 5.8.8 stayed stuck for over 13 hours in the end. Then, as no one seemed to be interested, I abandoned the WU so I could start doing something useful again.

There's a hole you can get yourself into that doesn't have a timer to clear up the mess afterwards. You can have 2 simultaneous uploads on 5.8.8, but that's no use if both of them are stuck in limbo. I'd like the IBM techs to take a look if possible; we could have a great number of stalled crunching boxes out there with owners who are unaware.

Dave |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Send a contact_us to the WCG technicians.

Added: 152 CPU years on Monday, the best day since Feb. 15, which does not suggest that many have remained on 5.8.8.
[Edit 2 times, last edit by Sekerob at Mar 6, 2007 8:28:04 AM] |
||
|
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline Project Badges:
|
There is a known issue in the BOINC 5.8 client where the upload transmission can get stuck. They reduced the problem with the 5.8.15 build, but the alpha testers are still recreating the problem with the (alpha) 5.8.16 build. This is one of the last bugs that they are trying to resolve with the 5.8 client overall. It is also because of this type of bug that we are very slow to adopt the latest version of the BOINC client, as we want to minimize the impact of these types of problems.
I believe that the timeout is 14 days (same as the download timeout). |
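To make the "timer" question in this thread concrete, here is a rough sketch of how a stalled-transfer check could work alongside a long give-up timeout. This is not BOINC's actual code. The `Transfer` class, the `next_action` function and the 10-minute stall window are all assumptions made up for illustration; only the 14-day figure comes from the post above.

```python
from dataclasses import dataclass

# Give-up timeout mentioned in the thread (14 days), in seconds.
FOURTEEN_DAYS = 14 * 24 * 3600


@dataclass
class Transfer:
    """Hypothetical record of one upload; not a BOINC data structure."""
    name: str
    started_at: float
    last_progress_at: float
    bytes_sent: int = 0

    def record_progress(self, bytes_sent, now):
        # Only count it as progress if more bytes have actually gone out.
        if bytes_sent > self.bytes_sent:
            self.bytes_sent = bytes_sent
            self.last_progress_at = now


def next_action(transfer, now, stall_window=600.0, give_up_after=FOURTEEN_DAYS):
    """Decide what to do with a transfer at time `now` (seconds).

    stall_window is an assumed value: how long a transfer may sit
    without progress before it is restarted.
    """
    if now - transfer.started_at >= give_up_after:
        return "abandon"
    if now - transfer.last_progress_at >= stall_window:
        return "restart"
    return "wait"


upload = Transfer("result_upload", started_at=0.0, last_progress_at=0.0)
print(next_action(upload, now=300.0))        # 5 minutes in, still within the stall window
print(next_action(upload, now=5 * 3600.0))   # 5 hours stuck at 0%
print(next_action(upload, now=float(FOURTEEN_DAYS)))
```

With only the 14-day timeout, an upload like the one described in this thread would sit idle until it expired; a shorter stall window such as the assumed one here would restart it without anyone having to toggle the network by hand.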
||
|
|
David Autumns
Ace Cruncher UK Joined: Nov 16, 2004 Post Count: 11062 Status: Offline Project Badges:
|
Thanks knreed.

It's reassuring to know that it does get kicked up the air eventually.

Cheers, and roll on 5.9.

Dave |
||
|
|
|