| World Community Grid Forums
Thread Status: Active Total posts in this thread: 54
Jake1402
Senior Cruncher USA Joined: Dec 30, 2005 Post Count: 181 Status: Offline Project Badges:
|
I'm just happy to see 5 machines running (42 CPU cores), and GPU WUs too... 79 WUs so far across 5 computers... lots of download issues
----------------------------------------
Join the Chicago-IL-USA team!
2 AMD FX 8320/AMD R9 270X/Win 10 2 AMD FX 8320/AMD RX 560/Linux Mint 20.3 (both computers DOA) Intel Pentium G240/Win 10 |
||
|
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 2173 Status: Recently Active Project Badges:
|
"Well, there's HTTP, and there's HTTPS. HTTPS has much greater overheads in establishing each separate connection, which probably limits the number of concurrent connects to any one given server."
Well, yes and no. Yes, theoretically, there is a handshake overhead with HTTPS compared to HTTP, but for the last 15 years or so this should not have been much of a practical issue, unless there is a really boneheaded "https-keep-alive" setting in the BOINC client, which would mean that connections are not reused but re-established for each of these small files.
But as that would be a BOINC (client) issue rather than a server-side issue, I would expect the problem to show up much more often in other projects as well, not only with WCG. And there are a lot of BOINC projects out there that work with certainly worse resources than I am sure Krembil has right now.
It would also not explain why the transfer of those <1K files stops at that "magic" 107-byte mark. Even counting the encryption overhead in the Ethernet packet payload, a single packet would still allow for up to 1,400 bytes of data. Such transfers/symptoms cannot be explained by HTTPS vs. HTTP. |
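The connection-reuse point above can be made concrete with a rough round-trip count. This is only a sketch, not a measurement of the BOINC client or the WCG servers. It assumes one round trip for the TCP handshake, two more for a full TLS 1.2 handshake (one for TLS 1.3, none for plain HTTP), and one round trip per file because each file fits in a single packet. The function name and parameters are hypothetical.

```python
def round_trips(num_files, tls_handshake_rtts=2, reuse_connection=True):
    """Estimate network round trips needed to fetch num_files small files.

    Illustrative assumptions, not BOINC behaviour:
      - 1 round trip for the TCP handshake per connection
      - tls_handshake_rtts round trips for the TLS handshake per connection
        (2 for a full TLS 1.2 handshake, 1 for TLS 1.3, 0 for plain HTTP)
      - 1 round trip per file, since each file fits in one packet
    """
    if num_files < 0:
        raise ValueError("num_files must be non-negative")
    if num_files == 0:
        return 0
    setup_per_connection = 1 + tls_handshake_rtts
    # With reuse, the setup cost is paid once; without it, once per file.
    connections = 1 if reuse_connection else num_files
    return connections * setup_per_connection + num_files


if __name__ == "__main__":
    for reuse in (True, False):
        print(f"reuse={reuse}: {round_trips(5, reuse_connection=reuse)} round trips")
```

Under these assumptions, five small files cost 8 round trips with a reused HTTPS connection and 20 without. That is real overhead, but it only slows a transfer down; it does not account for a transfer stalling at a fixed byte count.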
||
|
|
Richard Haselgrove
Senior Cruncher United Kingdom Joined: Feb 19, 2021 Post Count: 360 Status: Offline Project Badges:
|
I once encountered a problem at another project - GPUGrid. An application glitch meant that every task was failing in seconds. Instead of contacting the servers every few hours, the BOINC clients were back every 30 seconds, asking for more. GPUGrid only has one server, which handles everything - website, scheduler, uploads and downloads.
I tried to reach the website, to pass a message to the admins about the problem, but my browser couldn't establish a connection with HTTPS. I flipped back to HTTP, and got through at the first attempt. |
||
|
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12594 Status: Offline Project Badges:
|
If only they would spread the availability throughout the day, there would probably not be the transient errors.
Mike |
||
|
|
Richard Haselgrove
Senior Cruncher United Kingdom Joined: Feb 19, 2021 Post Count: 360 Status: Offline Project Badges:
|
That's better. Just seen 8 OPNG tasks (40 files) download in 21 seconds, no retries.
Good work. |
||
|
|
jonathandl
Advanced Cruncher Joined: Nov 12, 2007 Post Count: 106 Status: Offline Project Badges:
|
In reply to @narf57:
"Also getting lots of new WU, but all of them have transient HTTP errors, and require multiple retries to download. At least I now have 22 WU in the queue."
I just got 2, but I also get lots of transient errors. At first I thought it might have to do with my firewall (which does full TLS/HTTPS inspection, but has a rule specifying certificate-only inspection instead of full inspection for worldcommunitygrid.org and www.worldcommunitygrid.org) or, more likely, my antivirus software (whose vendor retired the version I bought; I upgraded to the newer version around the time this happened). But if other people are having the exact same issue, I will hold off on troubleshooting my own installation (for example, by switching worldcommunitygrid.org to "no inspection", or asking you guys whether I need to exempt any subdomains of worldcommunitygrid.org from firewall inspection).
[Edit 2 times, last edit by jonathandl at Aug 25, 2022 4:14:13 AM] |
||
|
|
Richard Haselgrove
Senior Cruncher United Kingdom Joined: Feb 19, 2021 Post Count: 360 Status: Offline Project Badges:
|
It looks like there was a big batch of OPNG tasks released during Canadian office hours yesterday - and the transient download errors came back with a vengeance. I managed to clear them with multiple transfer retries, but there was another block still struggling when I got up this morning.
They cleared quickly, and there has been no new OPNG work issued since. So it's a volume thing: the download server farm can't handle these big work releases.
The administration team also need to understand the relationship between your servers and our clients. When our clients encounter problems with more than 3 files in succession, they sense a problem and go into extended backoff - quickly reaching several hours. And the OPNG tasks require five files to be downloaded...
So this problem is counter-productive to the research. The more tasks you issue, the slower the downloads, the longer before we can start processing them, and the longer before the results can be returned.
File uploads, fortunately, don't seem to have the same problem. |
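To illustrate how a run of failed transfers can turn into hours of waiting, here is a minimal sketch of exponential backoff. It is not BOINC's actual algorithm: the one-minute base delay, the doubling per extra failure, the four-hour cap and the function name are all assumptions chosen for illustration. Only the threshold of 3 consecutive failures comes from the description above.

```python
def backoff_seconds(consecutive_failures, threshold=3, base=60, cap=4 * 3600):
    """Return a hypothetical wait time before the next transfer attempt.

    Assumed policy (illustrative only, not taken from the BOINC source):
      - up to `threshold` consecutive failures: retry immediately
      - beyond that: start at `base` seconds and double per extra failure
      - never wait longer than `cap` seconds
    """
    if consecutive_failures <= threshold:
        return 0
    extra_failures = consecutive_failures - threshold
    delay = base * 2 ** (extra_failures - 1)
    return min(cap, delay)


if __name__ == "__main__":
    for failures in (3, 4, 6, 10, 12):
        print(f"{failures} failures in a row: wait {backoff_seconds(failures)} s")
```

With these assumed numbers, the 4th failure in a row triggers a one-minute wait, the 10th a wait of 3840 seconds (just over an hour), and the 12th hits the four-hour cap. Since each OPNG task needs five files, a burst of failed downloads can reach that range after only a couple of tasks.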
||
|
|
Cyclops
Senior Cruncher Joined: Jun 13, 2022 Post Count: 295 Status: Offline |
"In reply to @narf57: 'Also getting lots of new WU, but all of them have transient HTTP errors, and require multiple retries to download. At least I now have 22 WU in the queue.' I just got 2, but I also get lots of transient errors. At first I thought maybe it has to do with my firewall (which does full TLS/HTTPS inspection, but has a rule specifying only certificate-inspection instead of full inspection for worldcommunitygrid.org and www.worldcommunitygrid.org). Or, more likely, my antivirus software (whose vendor retired the version I bought, and I just upgraded to the newer version around the time this happened). But, if other people are having the exact same issue, then I will hold off on troubleshooting my own installation (for example, by switching WorldCommunityGrid.org to 'no inspection' or asking you guys whether I need to exempt any subdomains of worldcommunitygrid.org from firewall inspection.)"
This is not an issue on your end. Some of our upload/download servers are affected by the networking issue, so file transfers may be throttled when server resources are not available to handle them. We will update everyone in the forum when this issue is fixed. |
||
|
|
Jim Slade
Veteran Cruncher Joined: Apr 27, 2007 Post Count: 669 Status: Offline Project Badges:
|
I cannot get data on my new computer. It says no work available to process.
|
||
|
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 2173 Status: Recently Active Project Badges:
|
"I cannot get data on my new computer. It says no work available to process."
Welcome to the club. It might be that they are actually working on fixing the download problem of the last two months...
Ralf |
||
|
|
|