World Community Grid Forums
Category: Completed Research Forum: Nutritious Rice for the World Thread: Slowing down Rice project |
Thread Status: Active Total posts in this thread: 43
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Just out of curiosity, how much disk space are we talking about? This is totally UNofficial, and just a little number crunching based on the information on the website, but I believe they will need about 4 terabytes of storage space, total. That is 4 actual terabytes (1024^4 bytes each). I got this because they report that it takes 10GB to hold the predicted structures (results) from 100 proteins, and they are testing approximately 40,000 proteins, which yields 4000GB, or about 3.9TB. Rounding makes it easier. 4TB is a hard thing to come by on short notice. In the first three and a half months on the Grid, we crunched about 6800 proteins, which means they need about 200 more gigabytes each month. I think it's smart to throttle down the project if they need time to come up with the space. |
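For anyone who wants to check the arithmetic above, here is a minimal sketch. The figures are the unofficial ones quoted in the post (10GB per 100 proteins, roughly 40,000 proteins, about 6800 proteins in 3.5 months); the function and constant names are made up for illustration.

```python
# Unofficial figures taken from the post above; names are illustrative only.
GB_PER_100_PROTEINS = 10      # reported storage for results of 100 proteins
TOTAL_PROTEINS = 40_000       # approximate number of proteins being tested
PROTEINS_DONE = 6_800         # proteins crunched so far
MONTHS_ELAPSED = 3.5          # time on the Grid so far


def storage_estimate():
    """Return (total GB, total TB, GB needed per month) for the project."""
    total_gb = GB_PER_100_PROTEINS * TOTAL_PROTEINS / 100
    total_tb = total_gb / 1024  # binary terabytes, as in the post
    gb_so_far = GB_PER_100_PROTEINS * PROTEINS_DONE / 100
    gb_per_month = gb_so_far / MONTHS_ELAPSED
    return total_gb, total_tb, gb_per_month


if __name__ == "__main__":
    total_gb, total_tb, gb_per_month = storage_estimate()
    print(f"total: {total_gb:.0f} GB (~{total_tb:.1f} TB)")
    print(f"growth: ~{gb_per_month:.0f} GB per month")
```

This gives 4000GB total, a bit over 3.9TB, and a growth rate a little under 200GB per month, which matches the rounded numbers in the post.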
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
This would be a complicated thing to do, but I know it is done today. It would be a cool feature/option for times like these if users could "volunteer" disk space on their machines for temporary storage (Distributed Storage). This, of course, would be for completed work that is not needed right away, as you'd have to make sure the computer is online in order to retrieve the data. It might just be another way the WCG clients/users could help out the research teams. Again, this is complicated and would probably never be made available... just a cool idea in my head.
|
||
|
kskjold
Senior Cruncher Norway Joined: May 20, 2008 Post Count: 469 Status: Offline Project Badges: |
If it were possible, I would let WCG use between 5 and 10 GB on my computers for storage. They could retrieve it when the BOINC client downloads more work units, or something like that...
I think it's a good idea, and if it becomes possible in the future, I think there are many crunchers who would also lend WCG some space on their disks. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
We discussed the advantages and drawbacks of distributed storage not very long ago.
It would be nice if it worked, but it's a lot harder than it sounds. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Yeah, there is a service out there right now that does this. I can't remember the name, but I looked into it. You can either buy storage and upload your data, which the service then downloads to other member computers, replicating it across many machines for redundancy (encrypted, of course). Or you can "loan" say 50GB of space, and in return use 50GB of space from the service. If you are "loaning" space, you are pretty much using the service as a RAID for redundancy.
I think it would be something interesting to look into as WCG gets more projects, since those researchers are already on very tight budgets and might not be able to afford the space right away (e.g., NRW). I personally would loan about 100GB or so. I have a 500GB drive lying around that I could even bring online too. |
||
|
Zigfried
Senior Cruncher Brazil Joined: Dec 12, 2005 Post Count: 368 Status: Offline Project Badges: |
It is really complicated to donate space on our HDs to store results. Imagine if each one of us donated 10 or 50GB and then got a virus. We would lose so much information every day. Storing critical information is really important, so we need to back up everything, and do it frequently. We need a better estimate of how much space will be necessary for a specific project, and to ensure that we have it or at least have plans to increase the space during the process.
----------------------------------------[Edit 1 times, last edit by Zigfried at Dec 31, 2008 2:43:38 PM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
If a computer goes offline for whatever reason, that is not an issue with distributed storage. The data is replicated multiple times across many computers in many different configurations. So if my computer crashes or just isn't online at the time they are trying to transfer data, no biggie; that same data would be available on many other computers as well.
I'm not saying it is easy or even efficient (as far as storage size goes, say you store 100GB... that may in the end actually take up 400 or 500GB after you factor in redundancy)... but it is effective and redundant. As far as Distributed Storage over the internet goes... it is in its infancy... but growing. |
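To put rough numbers on the trade-off described above, here is a small sketch. It assumes plain full-copy replication and that each volunteer machine is online independently with the same probability; both are simplifying assumptions, not how any particular service works, and the function names are made up for illustration.

```python
def raw_storage_gb(logical_gb, replicas):
    """Total disk space consumed when every byte is stored `replicas` times."""
    return logical_gb * replicas


def availability(p_online, replicas):
    """Chance that at least one copy is reachable.

    Assumes each machine is online independently with probability p_online
    (a hypothetical figure, not something measured on the Grid).
    """
    return 1 - (1 - p_online) ** replicas


if __name__ == "__main__":
    for replicas in (1, 4, 5):
        used = raw_storage_gb(100, replicas)
        avail = availability(0.5, replicas)
        print(f"{replicas} copies: {used} GB used, {avail:.1%} chance a copy is online")
```

With four or five copies, 100GB of results takes up 400 or 500GB of donated disk, which is the overhead mentioned in the post; in exchange, the chance that no copy is reachable drops quickly even if each machine is online only half the time.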
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"We need a better estimation in order to know how much space will be necessary for a specific project and ensure that we have it or at least have plans to increase the space during the process."
I agree, but it can be shocking for a researcher who is unfamiliar with the power of the Grid when results start pouring in. I've seen interviews with the lead researcher of the project, and they were running this on something like 400 computers before WCG came along. That's a huge difference in processing power and results. And it's not like the project has ground to a halt; we are still crunching it, just not as fast as before. That's how a grid works, ideally. Projects can be turned up or down based on the needs of each project. NRW needs a break, so we give it to them. When they find the space, we can turn them right back up to where they were earlier. The same is true for any other project. |
||
|
GIBA
Ace Cruncher Joined: Apr 25, 2005 Post Count: 5374 Status: Offline |
I am surprised that the scientific team of this project was not prepared to store more results than planned, especially since the quantity of returned results has probably been higher than expected from the start and grows more each day...
I can imagine some questions about the availability of the storage technologies these scientists need in order to keep the current volume of data plus the new data sent by WCG each day, at a higher speed than expected. It does not appear to be an unfeasible challenge for these scientists to solve in a short time (that is my guess), and I expect this team to take action ASAP to get things running smoothly. Probably after this, WCG will take some precautions in new projects and advise scientists about the real possibility that their results will arrive earlier than expected, so that they are prepared to store a lot of data early...
Cheers ! GIB@
Join BRASIL - BRAZIL@GRID team and be very happy ! http://www.worldcommunitygrid.org/team/viewTeamInfo.do?teamId=DF99KT5DN1 |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Some of us do have a couple of TBs of storage; I've got 5 on the server as we speak and would be happy to store stuff. I bet COMCAST would have a fit with the bandwidth it would use, LOL!
|
||
|
|