| World Community Grid Forums
Thread Status: Active Total posts in this thread: 29
Cyclops
Senior Cruncher Joined: Jun 13, 2022 Post Count: 295 Status: Offline |
Hi everyone,
As described earlier, WCG transitioned from IBM cloud infrastructure to our own physical servers, hosted at the University of Waterloo and supported by the Sharcnet HPC facility. The “migration” therefore required rebuilding the WCG system on different hardware. Unfortunately, the performance and capacity of our system are lower than those of the IBM cloud setup. Extensive benchmarking was done to confirm that the system is sufficient and that the hard drive storage would perform at least adequately for the time being, but we know it is not sufficient going forward, so we continue to search for partners and resources to upgrade our servers and storage system. Many of the failures, errors and challenges we encountered during the transition required continuous tweaking of the system to ensure it does not choke as the volume of workunits or the number of volunteers increases.
We are extremely excited to announce that Sharcnet has helped us obtain new storage with sufficient SSD capacity and speed for WCG. The new storage should substantially improve database and scheduler performance and improve the overall throughput of the workunit management system and database servers. Once it is operational, we will optimize our system configuration and test it before putting it into production. We will keep you updated on the timeline for this upgrade.
In the meantime, we would like to thank our most valuable “alpha tester” volunteers: without you we would not be able to finalize the system and start producing research results for the current projects. We recognize that some projects have been given more workunits to crunch than others, and we are working to equalize the distribution. The ARP project is starting again, with more workunits available soon, and HSTB is going to restart in the coming weeks.
If you have any questions, please leave them in this forum thread.
Thank you for your support, patience and understanding.
WCG team at Krembil Research Institute
[Edit 1 times, last edit by Cyclops at Nov 10, 2022 9:36:35 PM] |
||
|
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 2173 Status: Offline Project Badges:
|
Well, well, well....
For one, I think it would be more appropriate to tone down that pep talk a bit. After all the failures over the better part of the year, and after months of work with very little effective gain, this just doesn't sound sincere anymore. Communication is still something that needs to improve, and that doesn't really depend on any new storage or benchmarking. And while showing the number of WUs and volunteer-provided CPU years is nice, at this point, after several frustrating months, I would much rather see something indicating that the number of download errors and retries is actually going down. So far, it seems that curve would run parallel to the ones you have provided, with the same upward trend. It would also be kind of interesting to know who those "most valuable alpha testers" would be...
Ralf |
||
|
|
Greg_BE
Advanced Cruncher Joined: May 9, 2016 Post Count: 124 Status: Offline Project Badges:
|
So what is the difference between SHARCNET HPC and SHARCNET (solo)? Is SHARCNET going to host you or give you hardware?
You know what you are trying to say, but to us it's just a lot of generalities. What I want to know is how you are going to solve this http: transient error issue. Will SHARCNET eliminate that, or is that something that is happening with your current limited physical system? Are you running on a physical system there at Krembil, or are you working off of some hybrid of SHARCNET and your physical system? |
||
|
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 1294 Status: Offline Project Badges:
|
Thank you so much for taking on WCG, and not letting it die. I'm so excited you found a solution. I look forward to crunching even more WUs!
I thank any alpha testers who helped work out the bugs, so the rest of us can have a better system. Thanks for the update! |
||
|
|
hchc
Veteran Cruncher USA Joined: Aug 15, 2006 Post Count: 865 Status: Offline Project Badges:
|
It's telling that none of the dedicated WCG sysadmins, network admins, database admins, etc. are using the forums to communicate directly with us.
Question: How fast is WCG's Internet connection? Because I'm getting 300 Kb/sec download speed when downloading the ~100 MB MCM sarcoma data file, which takes several minutes. I ask because if the throughput is this slow with a minimal number of volunteers hitting WCG, it means that WCG is already maxed out as far as Internet connection speed.
[Edit 1 times, last edit by hchc at Nov 11, 2022 12:55:09 AM] |
||
|
|
hchc
Veteran Cruncher USA Joined: Aug 15, 2006 Post Count: 865 Status: Offline Project Badges:
|
It's a little confusing that the "Total" (blue) curve is the one at the very bottom.
|
||
|
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 2173 Status: Offline Project Badges:
|
Question: How fast is WCG's Internet connection? Because I'm getting 300 Kb/sec download speed when downloading the ~100 MB MCM sarcoma data file, which takes several minutes.
The last time I watched this file downloading, it came in for me at 1.9 MB/sec, taking a little bit over a minute (and I am on a business 500/500 Mbit/sec connection)...
Ralf |
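As a rough sanity check on the two speeds reported in this thread, the ideal transfer time for the ~100 MB file can be worked out directly. The sketch below is only an illustration: it assumes "300 Kb/sec" means kilobytes per second (consistent with "several minutes"), uses decimal units, and ignores protocol overhead and retries. The function name `transfer_seconds` is made up for this example.

```python
def transfer_seconds(size_mb: float, rate_kb_per_s: float) -> float:
    """Ideal transfer time in seconds, using decimal units (1 MB = 1000 KB)."""
    if rate_kb_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_mb * 1000.0 / rate_kb_per_s


# Assumption: both posters are quoting bytes per second, not bits per second.
slow = transfer_seconds(100, 300)    # hchc: ~100 MB at 300 KB/s
fast = transfer_seconds(100, 1900)   # Ralf: ~100 MB at 1.9 MB/s

print(f"{slow / 60:.1f} min at 300 KB/s")
print(f"{fast:.0f} s at 1.9 MB/s")
```

Under those assumptions the slower download takes roughly five and a half minutes and the faster one a little under a minute, so the two reports differ by about a factor of six.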
||
|
|
Paul Schlaffer
Senior Cruncher USA Joined: Jun 12, 2005 Post Count: 278 Status: Offline Project Badges:
|
Thank you for the update. It's good to see the project is getting some new equipment that many here suspected was needed.
If funding is needed to acquire equipment or additional bandwidth, that need/goal should be clearly articulated in the Donate section of the website, and announced here in News. Then those who want to help towards that end can do so. However, transparency is key, especially for a grid project like this one, which involves a large number of people.
“Where an excess of power prevails, property of no sort is duly respected. No man is safe in his opinions, his person, his faculties, or his possessions.” – James Madison (1792)
----------------------------------------[Edit 1 times, last edit by Paul Schlaffer at Nov 11, 2022 4:00:05 AM] |
||
|
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7846 Status: Offline Project Badges:
|
Thank you for the update.
Unfortunately, performance and capacity of our system is lower compared to IBM cloud setup. While extensive benchmarking was done to confirm it is sufficient and that the hard drive storage system would perform at least adequately for the time being, we know it is not sufficient going forward and thus we continue searching for partners and resources for upgrading our servers and the storage system.
As with many things, "sufficient" does not cut it. Benchmarking gives a synthetic value which sometimes does not correspond to real-world conditions. As an analogy, I will give the example of a restaurant. If you look at the place at 9:00 AM, you would think the resources available are sufficient. But if you look at 12:00 PM, those same resources are no longer sufficient because the lunchtime crowd has arrived. What looks like it will work on paper does not always work in practice.
I am reminded of an old design engineer I knew who worked for an industrial supply manufacturer. They would rate their products to work at 100% for specified conditions, but they would engineer them at 250% for those same conditions because they were proud of the reliability of their products. In other words, perhaps the benchmarking should have been aimed at sufficient plus 100%. Hopefully this will happen as the staff there gain experience.
I wish you good luck in finding the partners and resources for the upgrades to your hardware.
Cheers
Edit: spelling
Sgt. Joe
----------------------------------------*Minnesota Crunchers* [Edit 1 times, last edit by Sgt.Joe at Nov 11, 2022 1:41:22 PM] |
||
|
|
thunder7
Senior Cruncher Netherlands Joined: Mar 6, 2013 Post Count: 238 Status: Offline Project Badges:
|
Also kind of interesting to know who those "most valuable alpha testers" would be... Ralf
I still can't shake the feeling WE all are the alpha testers. |
||
|
|
|