| World Community Grid Forums
Thread Status: Active Total posts in this thread: 352
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1317 Status: Offline Project Badges:
|
The Compute Canada status page is reporting that Nibi has a power outage (flagged up 3 days ago, still open...). It appears they are still having some fairly severe "snagging" issues :-(
Cheers - Al. |
||
|
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 1294 Status: Offline Project Badges:
|
Thanks for the info Al. I missed looking at that page.
The old system and the new system have to share electricity, and that is why the old system was reduced to 25%, but this looks like a further electrical issue. It gives me something to keep my eye on for signs of progress though. |
||
|
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 1294 Status: Offline Project Badges:
|
Here is the new update:

July 28, 2025

MAM1 Beta 7.05 is in the final stages of alpha testing, debugging checkpointing issues under the BOINC client, upon release for linux will require GLIBC > 2.29:

We forked the ensmallen library, "a high-quality C++ library for non-linear numerical optimization", and with cereal implemented serializable versions of Simulated Annealing (SA) and Particle Swarm Optimization (PSO) algorithms from the ensmallen library to supplement the existing random methods in MCM1 and MAM1. The new multi-stage MAM1 build compiles and statically links to LibTorch, OpenCV, dlib, and Apache Arrow for Parquet support, providing a state-of-the-art platform for general scientific inquiry and launching MAM1 and future projects.

In addition to SVM, we now support RandomForest, AdaBoost, and Neural Networks (MLP/CNN) as classifiers to train and evaluate promising gene signatures identified by heuristic search of candidate gene signatures via SA and PSO.

We have been working towards a CUDA-enabled LibTorch build for the application as well, and only linker errors remain before we can test setting torch::DeviceType to torch::kCUDA in place of torch::kCPU and see if it might be that easy.
----------------------------------------
[Edit 1 times, last edit by Unixchick at Jul 28, 2025 11:13:00 PM] |
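The update mentions "serializable" Simulated Annealing so that a run can be checkpointed under the BOINC client. As a rough illustration of that idea only, here is a minimal Python sketch; it is not the project's code (which is C++ using ensmallen and cereal). The toy cost function, the `AnnealState` fields, and the `save`/`load` helpers are all hypothetical names made up for this example. The point it tries to show is that the random number generator state has to be saved along with the search state, otherwise a resumed run will not match an uninterrupted one.

```python
import json
import math
import random
from dataclasses import dataclass


@dataclass
class AnnealState:
    """Everything needed to resume a run (hypothetical layout for illustration)."""
    x: float
    best_x: float
    temperature: float
    step: int
    rng: random.Random


def cost(x):
    # Toy objective with its minimum at x = 3; a stand-in for a real score.
    return (x - 3.0) ** 2


def anneal(state, total_steps, cooling=0.995):
    """Advance the annealer in place until state.step reaches total_steps."""
    while state.step < total_steps:
        candidate = state.x + state.rng.uniform(-1.0, 1.0)
        delta = cost(candidate) - cost(state.x)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or state.rng.random() < math.exp(-delta / state.temperature):
            state.x = candidate
            if cost(state.x) < cost(state.best_x):
                state.best_x = state.x
        state.temperature *= cooling
        state.step += 1
    return state


def save(state):
    """Serialize the full state, including the RNG, to a JSON string."""
    version, internal, gauss_next = state.rng.getstate()
    return json.dumps({
        "x": state.x,
        "best_x": state.best_x,
        "temperature": state.temperature,
        "step": state.step,
        "rng": [version, list(internal), gauss_next],
    })


def load(text):
    """Rebuild a state from save() output; setstate() needs tuples, not lists."""
    data = json.loads(text)
    version, internal, gauss_next = data.pop("rng")
    rng = random.Random()
    rng.setstate((version, tuple(internal), gauss_next))
    return AnnealState(rng=rng, **data)


def fresh_state():
    return AnnealState(x=-10.0, best_x=-10.0, temperature=5.0, step=0,
                       rng=random.Random(42))


# Uninterrupted run of 2000 steps.
straight = anneal(fresh_state(), 2000)

# Interrupted run: stop at step 700, checkpoint, restore, then continue.
checkpoint = save(anneal(fresh_state(), 700))
resumed = anneal(load(checkpoint), 2000)

print("runs match:", resumed.best_x == straight.best_x)
```

If the RNG state were left out of the checkpoint, the resumed run would still work but would wander off on a different path from step 700 onwards, which makes results hard to validate between wingmen.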
||
|
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1317 Status: Offline Project Badges:
|
When I saw the new update, the first thing I noticed as a Linux user was that library version requirement, as there have recently been posts at CPDN about some RedHat-based Linux versions in common use having an older main C library than that needed by recently rebuilt software there!

So I have just sifted through my recent wingmen data to see how many cases of pre-2.3x library versions I could see... There are quite a few Debian, Ubuntu and Linux Mint users still on relatively old kernels (and hence older libraries), but there are also some Red Hat Enterprise Linux 8.10 users. That said, they contributed less than 2% of the wingmen results I looked at.

I think that [nearly] all users ought to be able to get up to an acceptable library version; RHEL 9, recent Fedora releases and the latest CentOS seem to be o.k., and anything recent that is loosely based on Debian or Ubuntu should be o.k. too. So it looks as if that won't be an issue for the majority of Linux users. I have no idea what will happen to FreeBSD users...

That said, it'll be interesting to watch for the error reports when the Betas restart :-)

Cheers - Al. |
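For anyone who wants to check a host against the "GLIBC > 2.29" wording in the update, here is a minimal Python sketch. It reads the requirement literally (strictly newer than 2.29), which is an assumption; the function names are made up for this example. Versions are compared as integer tuples, because treating "2.3" and "2.29" as decimals or strings would give the wrong order.

```python
import platform

# Threshold taken from the update text, read literally as "strictly newer than 2.29".
REQUIRED = (2, 29)


def parse_version(text):
    """Turn a string such as '2.31' into (2, 31); extra components are ignored."""
    return tuple(int(part) for part in text.split(".")[:2])


def meets_requirement(version_text, required=REQUIRED):
    """True when the given glibc version is strictly newer than the threshold."""
    return parse_version(version_text) > required


# platform.libc_ver() returns ('glibc', '<version>') on glibc systems and
# empty strings where it cannot tell (for example on non-Linux hosts).
lib, version = platform.libc_ver()
if lib == "glibc" and version:
    verdict = "new enough" if meets_requirement(version) else "too old"
    print(f"glibc {version}: {verdict}")
else:
    print("This host does not appear to use glibc, so the check does not apply.")
```

On most Linux systems `ldd --version` reports the same number from the command line.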
||
|
|
bfmorse
Senior Cruncher US Joined: Jul 26, 2009 Post Count: 442 Status: Offline Project Badges:
|
I wish we had a time frame for releasing this round of beta testing.
But I AM grateful for the status report. Thank You! |
||
|
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 1294 Status: Offline Project Badges:
|
We are getting a good flow of MCM and ARP. Good times.
The Nibi power issue is resolved.
----------------------------------------
[Edit 1 times, last edit by Unixchick at Jul 30, 2025 3:34:53 PM] |
||
|
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1317 Status: Offline Project Badges:
|
I see they had another outage at Nibi on the 29th, now resolved. However, I find the following a bit worrying:

"We believe the cause of this was an unexpected interaction between high-availability features (quorum of hypervisers) with more than one hardware failure. We think we can prevent similar things from happening in the future."

Still in the snagging period, it seems!

Cheers - Al.

[Edit] I forgot to add that the power outage item was amended to say it cleared on the day it happened...
----------------------------------------
[Edit 1 times, last edit by alanb1951 at Jul 30, 2025 7:29:04 PM] |
||
|
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 2173 Status: Offline Project Badges:
|
"We are getting a good flow of MCM and ARP. Good times."
Not sure what the power outage issue was, but I can confirm that all my hosts have their queues full; I even got a full dozen ARP1 WUs.
"The Nibi power issue is resolved."
Let's see if this holds up for a week or longer, for a change...
Ralf |
||
|
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1317 Status: Offline Project Badges:
|
And we have BETA test action at last...
I've posted a new thread in the Beta Test forum in the vain hope that users won't generate a multiplicity of threads with lots of repeated questions or information in them -- let's wait and see.

I can only comment about Linux; I suspect that's where they've started off but if anyone sights Windows examples...

I've not had time to have a really good look at what's happened so far, but a quick glance shows run times ranging from about 50 minutes to just over 5 hours. I suspect this batch is by way of a tuning exercise as much as anything else!

I might post something in the Beta thread after I've had a chance to look at the parameters (which, fortunately, are shown as debug information in the stderr file); the new program uses a completely different control file so my existing catch-scripts (familiar with MCM1 and the first MAM1 beta) didn't pick up any configuration information. More programming coming up (sigh...)

Cheers - Al. |
||
|
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 1294 Status: Offline Project Badges:
|
We have 3 projects flowing today.
Please post on Al's thread if you are one of the lucky ones to get a MAM Beta. With such a small number of WUs going out in a burst it is hard to tell whether it is flowing or not, but we are in the testing phase and WUs went out within the last 24 hours, so I'll leave the status green.

ARP is flowing. We are nearing the end of the 149 group, but it is harder to tell as they no longer come in a strict order. I'm guessing they split the WUs into batches for different machines to deliver to us. The question is, will they start group 150 right away or wait until Monday?

MCM are flowing well.

On a personal note, I'm now running 4 ARPs at a time and getting ever closer to my 2 year badge. |
||
|
|
|