World Community Grid Forums
Thread Status: Active | Total posts in this thread: 35
knreed
Former World Community Grid Tech | Joined: Nov 8, 2004 | Post Count: 4504
Quote:
I remember suggesting that tasks should be allowed to continue uploading from where they left off, if interrupted. Don't know if that was/will be implemented, even if just for some big CEP2 files. Anyone? I only have 100Kb to 130Kb uploads on this system atm or I would test it.

Quote:
Resuming of transfers has been part of BOINC for a very long time, but WCG doesn't allow resuming of downloads. For uploads, on the other hand, I don't remember ever having any problems with WCG, and a quick test with an HPF2 upload reveals upload resuming works as it should (even though the file wasn't very large). As for CEP2, I've not got any CEP2 at the moment, but I don't remember any problems the last time I tried upload resuming here...

It is only for some files that we cannot support resuming downloads. Specifically, it is files that are compressed before we place them on our filesystem and decompressed by the BOINC client as they are downloaded. These are the files that end in .gzb. All other files should be capable of resuming downloads.

There are three cases of files on the servers with regard to compression during transfer:

1) Files whose size will not, or will only minimally, be reduced by compression - for example, the jp2 files for Help Conquer Cancer. These are not compressed during transfer and can be resumed if interrupted.

2) Files that we gzip when we place them onto the download filesystem. These are decompressed by BOINC as they are downloaded; we do this because we have to save space on our storage system. They cannot be resumed because BOINC decompresses while it downloads and measures the amount transferred by the decompressed size. When the client sends a RANGE request to resume the download, it therefore reports an incorrect amount of data as already transferred, so we have to ignore the field and restart the download.

3) Files that are compressed 'on the fly'. These are placed uncompressed on our download filesystem and, when requested by the BOINC client, are compressed by Apache and decompressed by BOINC. For these files, if the RANGE header is detected in the request (i.e. an interrupted transfer is being resumed), the transfer is resumed - but without compression. This is because the RANGE request refers to the uncompressed bytes transferred; if Apache compressed the file on the fly, it would apply the RANGE header to the compressed stream and therefore resume the transfer at the wrong part of the file.

If BOINC were to download the file first (resuming transfers if needed) and only then decompress it, cases 2 and 3 could be handled the same as case 1. However, BOINC performs its gzip decompression through a libcurl feature, which decompresses the data before it is made available to the calling application. The developers would therefore have to rework the client to download first and decompress afterwards.

We looked at how many files are resumed, and it is something less than 1/10th of 1%. The files are transferred correctly - even if not completely optimally. Thus the situation remains as it is.

[Edit 1 times, last edit by knreed at Nov 27, 2011 4:11:24 AM]
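To make case 2 concrete, here is a minimal sketch in Python of a resume attempt; the URL, file name, and byte count are hypothetical, and this is not WCG's or BOINC's actual client code:

```python
# Minimal sketch of a resume attempt via an HTTP Range header
# (hypothetical URL and names, not actual BOINC client code).
import urllib.request

URL = "http://example.org/download/result.gzb"  # hypothetical .gzb file

def resume_download(url, bytes_already_received):
    """Ask the server to continue a transfer from a byte offset."""
    req = urllib.request.Request(url)
    req.add_header("Range", "bytes=%d-" % bytes_already_received)
    with urllib.request.urlopen(req) as resp:
        # 206 Partial Content: the server honoured the range and resumed.
        # 200 OK: the server ignored it and restarted from the beginning.
        return resp.status

# The failure in case 2: the client counts *decompressed* bytes, but the
# server stores the *compressed* .gzb file, so the offset below points at
# the wrong place in the stream the server would serve - the download
# must restart from zero.
print(resume_download(URL, 5_000_000))
```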
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0
Thanks for your post, on a Sunday no less.
Summation:
- A download/upload resume function, with less than 0.1% of transfers in production being interrupted, is statistically insignificant.
- From tests, it is suggested that the CEP2 _4 result upload does support resuming (the one result file estimated to account for 3/4 of total WCG transfer volume to and from volunteer clients).
- If different compression algorithms were used, it would be up to the client to handle them. Implementing that would risk that anyone who has not upgraded to such a client could no longer download and execute these tasks, and upgrading the whole volunteer population would be monumental. Conversely, those who opt in for CEP2 may have no issue upgrading their client if that were needed. That ball would be in cleanenergy's court to ponder; they are the direct recipients of the results.
--//--
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0
Not only big size but an overload of WUs

I know you need evidence, but I haven't got time, for or because of the pain, so please trust me. I crunch 24/7 on a PC plus a laptop. The WUs sent to my PC are all good and well and I can meet the deadlines - but my laptop is assigned WUs for more hours than there are in a day. I have to weed the workload regularly, and it seems a waste. I just weeded - and my workload seems set to run amok again by November 29 sometime:

WU list page 1
WU list page 2

I should know, but don't, so please tell me: the deadlines - are they GMT/UTC? Is it me (it habitually is), or is something wrong apart from me?
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0
I probably posted in the wrong Forum in the wrong thread
Sorry, I dare not make new threads either, so please move this to where it belongs.
Ingleside
Veteran Cruncher | Norway | Joined: Nov 19, 2005 | Post Count: 974
Quote:
It is only for some files that we cannot support resuming downloads. Specifically, it is files that are compressed before we place them on our filesystem and decompressed by the BOINC client as they are downloaded. These are the files that end in .gzb. All other files should be capable of resuming downloads.

The 66 MB qcaux*.zip file should be resumable, but it's not:

27.11.2011 15:02:49 | | Resuming network activity

The transfer reached 100% done, but "bytes transferred" just kept growing and had reached 70 MB transferred of the 66 MB file, so I suspended network activity.

27.11.2011 15:03:43 | | Resuming network activity

And we're back to zero again, showing resuming didn't work. As a comparison, a working download resume of a *.zip file from Rosetta@home:

27.11.2011 15:37:32 | | Resuming network activity

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
[Edit 1 times, last edit by Ingleside at Nov 27, 2011 2:53:35 PM]
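For anyone who wants to repeat this kind of test without watching the BOINC transfer tab, a small sketch in Python that checks whether a server honours a Range request; the URL is a placeholder, so substitute a real download link:

```python
# Probe whether a download URL supports resuming (placeholder URL).
import urllib.request

URL = "http://example.org/download/qcaux.zip"

req = urllib.request.Request(URL, headers={"Range": "bytes=1000-"})
with urllib.request.urlopen(req) as resp:
    if resp.status == 206:
        print("206 Partial Content - resuming is supported")
        print("Content-Range:", resp.headers.get("Content-Range"))
    else:
        print(resp.status, "- transfer restarted from the beginning")
```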
knreed
Former World Community Grid Tech | Joined: Nov 8, 2004 | Post Count: 4504
Darn - leave it to me to forget about the new architectural layer when writing my post.

The new servers that we put in the IBM SmartCloud Enterprise act as reverse proxy caches, both to take advantage of less expensive, pay-as-you-go pricing for bandwidth and to increase our capacity. They don't support resuming transfers, due to the way Apache handles cached content. I'll have to look at putting in detection of the Range request so that they handle it properly and what I wrote above becomes true. Thank you for pointing out that flaw in my statement.

[Edit 2 times, last edit by knreed at Nov 28, 2011 3:14:26 PM]
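The detection could work roughly like the following - a sketch only, written in Python rather than the actual Apache/proxy configuration, with a simplified parser that handles only the open-ended 'bytes=N-' range form:

```python
# Sketch of serve-time Range detection (illustrative, not WCG's proxy code).
import gzip
import os

def serve_file(path, range_header=None):
    """Return (status, headers, body) for a download request."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        if range_header:
            # Resumed transfer: honour the byte offset against the stored
            # (uncompressed) file and skip on-the-fly compression, so the
            # client's offset stays meaningful.
            start = int(range_header.split("=", 1)[1].rstrip("-"))
            f.seek(start)
            headers = {"Content-Range": "bytes %d-%d/%d" % (start, size - 1, size)}
            return 206, headers, f.read()
        # Fresh transfer: compress on the fly to save bandwidth.
        return 200, {"Content-Encoding": "gzip"}, gzip.compress(f.read())
```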
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0
Reminds me, and this being the CEP2 forum: many moons ago there was mention of 2 duplicate qcaux files being downloaded for -technical- reasons, which were to be consolidated into 1. I haven't seen the science apps downloaded for a long time, so I can't remember whether it was on Windows or Linux, or whether the change was applied. Would help to lessen the -pay as you go- :D
--//--
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0
@little mermaid: reduce your queue to the default of 0.25 days; maybe that will solve the problem.
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0
I'm sorry, but I'm detaching from this project. My upload speed is about 17 KB/s and BOINC Manager 6.12.34 (BM) doesn't do a good job uploading: it postpones most of the jobs while I'm downloading at full speed (35 KB/s), accumulating a huge amount of work units per day, and I need to stop all downloads until it has finished uploading, about 4-5 hours. I never used to have to do this, and since my download is limited too, it takes all my time. I'll finish the last WUs and upload them.

I think the problem is BM, not the 31 MB result files; if it could upload constantly it would be OK. But besides postponing, BM sometimes doesn't take full advantage of the upload speed: I need to suspend network activity for a few seconds and then resume, and only then does it reach full upload speed. I think this wasn't very important with smaller results, but now it counts, and more than ever since this project is gonna increase its result sizes eventually.
----------------------------------------
Oh, another thing: this project miscalculates the CPU time or something is not working right, because most WUs of this project say ~6 hours per WU, but often they finish as early as 50% and start uploading. Right now I don't have an overclock on anything, so it's not heat or hardware desynchronization: my CPU cores are at 51ºC, NB at 58ºC, everything else under 46ºC; ambient temperature is 32ºC, feels like 34ºC, humidity 50%. I have no air conditioning where my PC is, and it is 17:05 GMT-4 right now, so I think it is not my PC at all. Sorry it has to be this way.

[Edit 2 times, last edit by Former Member at Dec 27, 2011 12:45:48 AM]
Dark Angel
Veteran Cruncher | Australia | Joined: Nov 11, 2005 | Post Count: 728
This is probably the most demanding project ever run on WCG. Not everyone has a system that meets the requirements to run it smoothly, so don't feel bad if you need to pull out.
----------------------------------------
Currently being moderated under false pretences