| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 21
|
|
|
|
Dark Angel
Veteran Cruncher Australia Joined: Nov 11, 2005 Post Count: 728 Status: Offline Project Badges:
|
Happening again. ARP units building up in cache that won't upload.
Does the ARP team know we're NOT volunteering to be their remote file servers? This is a distributed COMPUTING project, NOT a distributed STORAGE project.
----------------------------------------
Currently being moderated under false pretences |
||
|
|
NixChix
Veteran Cruncher United States Joined: Apr 29, 2007 Post Count: 1187 Status: Offline Project Badges:
|
It sure appears that when ARP is fired back up, the Krembil house of cards collapses. Why is that?
Cheers |
||
|
|
BobbyB
Veteran Cruncher Canada Joined: Apr 25, 2020 Post Count: 638 Status: Offline Project Badges:
|
Is it possible that when the uploads hang, like now, that no WUs can download, or is it just a coincidence?
I can get a few going by retrying over and over. Does the ARP problem affect all other WCG projects, because I have a hang on a machine which never does ARP (disabled in profile)?
[Edit 1 times, last edit by BobbyB at May 8, 2023 2:49:52 PM] |
||
|
|
PMH_UK
Veteran Cruncher UK Joined: Apr 26, 2007 Post Count: 786 Status: Offline Project Badges:
|
"Is it possible that when the uploads hang, like now, that no WUs can download, or is it just a coincidence?"
Yes, BOINC issues messages and refuses downloads until uploads reduce.
"I can get a few going by retrying over and over. Does the ARP problem affect all other WCG projects, because I have a hang on a machine which never does ARP (disabled in profile)?"
Yes, any upload or download can get stuck when server overloaded. ARP most likely to overload as many large files.
Paul.
|
||
|
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 2173 Status: Offline Project Badges:
|
"'Is it possible that when the uploads hang, like now, that no WUs can download, or is it just a coincidence?' Yes, BOINC issues messages and refuses downloads until uploads reduce. 'I can get a few going by retrying over and over. Does the ARP problem affect all other WCG projects, because I have a hang on a machine which never does ARP (disabled in profile)?' Yes, any upload or download can get stuck when server overloaded. ARP most likely to overload as many large files. Paul."
The last round of upload errors was especially frustrating for multiple reasons. A couple of ARP1 WUs, which had been crunching up to an hour or two before the deadline on some slow machines, ended up "too late" or "no reply" because they were stuck for another 2 days in the upload. And as I mentioned in another thread, a lot of the upload errors happen after the actual file is transferred to 100%; then, after a couple of seconds, it gets those 107 bytes error messages and you have to retry over and over and over again, wasting precious bandwidth on both sides...
Ralf |
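For illustration only: one common way to keep many clients from hammering an overloaded server with immediate retries is exponential backoff with random jitter. The BOINC client schedules its own transfer retries; the sketch below is not its code, and the function name `backoff_delays` and the `base` and `cap` values are hypothetical.

```python
import random


def backoff_delays(attempts, base=60.0, cap=3600.0, rng=None):
    """Return a list of wait times (seconds), one before each retry.

    Uses "full jitter": each wait is drawn uniformly between 0 and
    min(cap, base * 2**attempt), so clients that failed together do
    not all retry at the same moment.
    The base and cap defaults are illustrative assumptions.
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * 2 ** attempt)
        delays.append(rng.uniform(0.0, ceiling))
    return delays


if __name__ == "__main__":
    for i, wait in enumerate(backoff_delays(6), start=1):
        print(f"retry {i}: wait up to this long before trying again: {wait:.0f} s")
```

The upper bound doubles after every failed attempt until it reaches the cap, so repeated failures spread retries out instead of concentrating them.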
||
|
|
shauge
Cruncher Joined: Dec 10, 2005 Post Count: 19 Status: Offline Project Badges:
|
Sorry for being hard on the team
[Edit 2 times, last edit by shauge at May 8, 2023 10:37:27 PM] |
||
|
|
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2346 Status: Offline Project Badges:
|
"It should be possible for them to calculate how much load an ARP task will put on the network and not release more ARP tasks than the network can handle. Instead they seem to release them in batches and during the batch the whole project is in a state of chaos. This is the way they prefer to run business..."
I understand it makes you feel good writing 'This is the way they prefer to run business...' I don't think so. They are doing their best, finding out what to do and what not, by trial and error, just like we all do. It's a small team and they want the best for all of us, including themselves, but it just takes time, so all we need is patience. Throwing words of hate, impatience and other negative responses at the WCG Team doesn't do any good. Now try to put yourself in the shoes of the WCG Team. Would you feel good then, as part of the WCG Team, if you got the response 'This is the way they prefer to run business...'?
Adri |
||
|
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 1293 Status: Offline Project Badges:
|
My last upload went quickly. Is it getting better for everyone else?
|
||
|
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 2173 Status: Offline Project Badges:
|
"My last upload went quickly. Is it getting better for everyone else?"
It seems to have eased a bit, but it certainly isn't solved. I still had a few WUs stuck on the upload, with that 100%+107 bytes issue... Also, slightly more SCC WUs seem to have made it during the day...
Ralf |
||
|
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1317 Status: Recently Active Project Badges:
|
"My last upload went quickly. Is it getting better for everyone else?"
Sort of... I'm seeing a few MCM1 and SCC1 returns failing at the first attempt then going up smoothly on the next attempt; that's definitely an improvement over lots of retries and upload rates below 20KB/second :-) but...
I suspect that what has happened is that as we are now three days into the current batch of ARP1 work there aren't as many new tasks going out at once (I haven't received one for over half a day and am about to run out[1]...) so there aren't as many results coming back from folks who can turn tasks around in well under a day... Fewer (or shorter) upload requests means better upload behaviour!
If there's another tranche of ARP1 or a big batch of OPNG sent out it'll probably start to bottleneck again :-( -- downloads first, then uploads as the quicker systems reply.
These cyclic file transfer issues are likely to continue for as long as SHARCNET's request to WCG to reduce the number of connections remains in place and/or the servers don't have the capacity to handle the volume of requests.
Cheers - Al.
[1] I use small queues because I want to be able to return results for ARP1 promptly -- unfortunately, at present that's actually counter-productive if I can't get the work in the first place :-) |
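The reasoning above (a burst of returns outruns the server, then things recover once fewer results come back) can be sketched with a toy backlog model. All numbers here are made up for illustration; nothing is known from this thread about the real request rates or server capacity, and `simulate_backlog` is a hypothetical helper.

```python
def simulate_backlog(arrivals, capacity):
    """Track how many uploads are still waiting at the end of each step.

    arrivals: number of new upload requests arriving in each time step.
    capacity: number of uploads the server can complete per time step.
    """
    backlog = 0
    history = []
    for new_requests in arrivals:
        # Whatever the server cannot finish this step carries over.
        backlog = max(0, backlog + new_requests - capacity)
        history.append(backlog)
    return history


if __name__ == "__main__":
    # Hypothetical numbers: a burst of results, then a quieter period.
    arrivals = [150, 150, 150, 60, 60, 60, 60]
    print(simulate_backlog(arrivals, capacity=100))
    # prints [50, 100, 150, 110, 70, 30, 0]
```

While arrivals exceed capacity the queue of stuck uploads keeps growing; once arrivals drop below capacity it drains, which matches the cyclic pattern described in the post.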
||
|
|
|