| World Community Grid Forums
Thread Status: Active Total posts in this thread: 6
Kai Gerstenberger
Cruncher Joined: Apr 6, 2020 Post Count: 2 Status: Offline Project Badges:
|
Hello,
I am not sure whether anything can be done about it, but I observe a very annoying behavior. Sometimes I set WCG to not accept any new work units, because I want to shift the focus temporarily to something else. Usually that lasts at least a few days, so all WCG work units in progress are completed and uploaded, and eventually none are left. So far, this is of course expected.
But when I allow WCG to fetch new work units again, it re-downloads huge files that I already had before, especially the 102 MB file mcm1.dataset-sarc1.txt. This file seems to be always the same, and having to download it again is very time consuming. At least for my setup and location, connectivity to the server is of much greater concern than storage capacity. With a bad connection paired with the BOINC "retry delays", this file costs me a day or two. Meanwhile, no cancer markers are mapped.
Is there any way to change this behavior and keep this specific file? Either by changing the way I temporarily stop crunching WCG, or by avoiding the deletion of the file, or maybe by keeping a local copy that I put back every time the work is to continue.
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Are you sure the file is sent as txt? Maybe it gets transmitted as a zip file, which is much smaller.
But that's BOINC: it clears the 'fixed' files if there are no tasks of the related app left. You might try changing the file's read attribute, but while you think it's a static file, it might only be static for the present target. I read they're preparing to switch to sarcoma: https://www.worldcommunitygrid.org/forums/wcg...ead,42752_offset,0#638381 . Strangely, the sarc1 in the file name suggests sarcoma is already being crunched, but I'm not up to speed with MCM1. That said, I think WCG does have the ability to label files for keeps or not. Then of course someone will complain they've got all these gunk files loitering in the data directory.
||
|
|
Kai Gerstenberger
Cruncher Joined: Apr 6, 2020 Post Count: 2 Status: Offline Project Badges:
|
"Are you sure the file is sent as txt? Maybe it gets transmitted as a zip file, which is much smaller."
It always requires 102 MB for the plain .txt file. I tried it manually: as a .zip, it compresses to about one third of the original size. If no other mechanism can be used, that alone would drastically reduce the problem.
"That said, I think WCG does have the ability to label files for keeps or not. Then of course someone will complain they've got all these gunk files loitering in the data directory."
I understand that not everybody would be happy with keeping (temporarily) unused files. There is no obvious way to detect whether someone will need them later. Then again, following your logic, this should also happen when I select a variety of WCG projects and, by chance, no MCM task is currently in the pipeline. More specifically: let's say at some point there is a mix of projects, including MCM. Now mcm1.dataset-sarc1.txt gets loaded and is available. At some later point, due to whatever scheduler decisions, only MIP, ARP and other non-MCM projects are cached. Does BOINC really "detect" that mcm1.dataset-sarc1.txt is currently not needed and delete it? I certainly did not observe that.
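For anyone who wants to repeat the "about one third" check on their own copy, here is a minimal sketch. It assumes Python is available and uses zlib (DEFLATE, the method .zip normally uses), so the result only approximates a real .zip archive. The function name `compression_ratio` and the path in the usage comment are hypothetical.

```python
import zlib


def compression_ratio(path, chunk_size=1 << 20):
    """Return compressed size divided by original size for the file at `path`.

    The file is streamed in chunks so that a ~100 MB dataset does not need
    to fit in memory at once. An empty file yields 1.0.
    """
    compressor = zlib.compressobj(level=6)
    original = 0
    compressed = 0
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            original += len(chunk)
            compressed += len(compressor.compress(chunk))
    # flush() emits whatever the compressor is still buffering
    compressed += len(compressor.flush())
    return compressed / original if original else 1.0


# Usage (hypothetical path; the BOINC data directory differs per system):
# print(compression_ratio("mcm1.dataset-sarc1.txt"))
```

A value near 0.33 would match the manual .zip test described above. It says nothing about whether the server actually compresses the transfer.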
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Don't think the cleanup is instant, rather periodic.
On file compression, the definitive answer is here: https://www.worldcommunitygrid.org/help/viewS...chString=file+compression
"Please note that the data is compressed during transfer and is decompressed after it has been downloaded. As a result it will occupy more space on disk then the numbers shown below."
[Edit 1 times, last edit by Former Member at Aug 23, 2020 1:01:11 PM]
||
|
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7844 Status: Offline Project Badges:
|
Given that you want to switch to something else for only a few days, I would suspend any MCM units in the queue, switch for a couple of days, and then switch back. This would avoid the additional download of the file in question. There is a seven-day window for the completion of the MCM units, so this should not keep your machines from completing the tasks in a timely fashion, as long as the queue is not too big. Hope this helps.
Cheers
Sgt. Joe
*Minnesota Crunchers* |
||
|
|
Bryn Mawr
Senior Cruncher Joined: Dec 26, 2018 Post Count: 384 Status: Offline Project Badges:
|
"Given that you want to switch to something else for only a few days, I would suspend any MCM units in the queue, switch for a couple of days, and then switch back. This would avoid the additional download of the file in question. There is a seven-day window for the completion of the MCM units, so this should not keep your machines from completing the tasks in a timely fashion, as long as the queue is not too big. Hope this helps. Cheers"
Great suggestion; it solves the user's problem without creating a problem for other users. One possible tweak: suspend only a single MCM unit and allow the others to complete. That still keeps the data file, but with less chance of going over deadline on restart.
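The single-unit variant can also be scripted. What follows is only a rough sketch under several assumptions that nobody in this thread has confirmed: that the BOINC client is controlled through `boinccmd`, that `boinccmd --get_tasks` prints one `name:` line per task, that MCM task names begin with `MCM1_`, and that the project URL is the one stored in `WCG_URL`. The helper names are made up for illustration.

```python
# Assumption: the project URL as the local client lists it; check with
# `boinccmd --get_project_status` before relying on it.
WCG_URL = "http://www.worldcommunitygrid.org/"


def parse_task_names(get_tasks_output):
    """Collect task names from text in the assumed `--get_tasks` layout.

    Only lines that start with "name:" after stripping count, so lines
    such as "WU name: ..." are ignored.
    """
    names = []
    for line in get_tasks_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("name:"):
            names.append(stripped.split(":", 1)[1].strip())
    return names


def pick_mcm_task(names, prefix="MCM1_"):
    """Return the first task whose name starts with the assumed MCM prefix."""
    for name in names:
        if name.startswith(prefix):
            return name
    return None


def suspend_command(task_name, project_url=WCG_URL):
    """Build the boinccmd argument list that suspends one task."""
    return ["boinccmd", "--task", project_url, task_name, "suspend"]


# Usage, on a machine with a running BOINC client:
#   import subprocess
#   out = subprocess.run(["boinccmd", "--get_tasks"],
#                        capture_output=True, text=True).stdout
#   task = pick_mcm_task(parse_task_names(out))
#   if task:
#       subprocess.run(suspend_command(task), check=True)
```

Resuming later is the same command with `resume` in place of `suspend`. While one MCM task stays suspended, the client still has a task for that app, which is the condition the posts above rely on for keeping the dataset file.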