| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 583
|
|
| Author |
|
|
bfmorse
Senior Cruncher US Joined: Jul 26, 2009 Post Count: 444 Status: Offline Project Badges:
|
The previously reported problem with PNG downloads has been successfully resolved.
Here is a single line selected from my "FINISHED DOWNLOAD" Event Log:
10/20/2025 1:52:37 AM | World Community Grid | Finished download of arp1_02_v01.png (86144 bytes)
THANKS TO ALL INVOLVED! |
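For anyone who wants to check their own Event Log for the same confirmation, here is a minimal sketch of how such a line could be parsed. It assumes the format of the single line quoted in this thread (timestamp, project name, and message separated by `|`, with a US-style 12-hour timestamp); other clients or locales may log differently. The function name `parse_finished_download` is purely illustrative and not part of BOINC.

```python
import re
from datetime import datetime

# Pattern inferred from the one log line quoted in this thread; this is an
# assumption, and other BOINC clients or locales may format lines differently.
DOWNLOAD_RE = re.compile(r"Finished download of (?P<name>\S+) \((?P<size>\d+) bytes\)")


def parse_finished_download(line):
    """Return (timestamp, project, filename, size_bytes), or None if the line does not match."""
    # Split into at most three fields: timestamp | project | message.
    parts = [p.strip() for p in line.split("|", 2)]
    if len(parts) != 3:
        return None
    stamp, project, message = parts
    match = DOWNLOAD_RE.search(message)
    if match is None:
        return None
    # Assumed timestamp layout: month/day/year with a 12-hour clock and AM/PM.
    when = datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")
    return when, project, match.group("name"), int(match.group("size"))


line = "10/20/2025 1:52:37 AM | World Community Grid | Finished download of arp1_02_v01.png (86144 bytes)"
result = parse_finished_download(line)
print(result)
```

Lines that are not "Finished download" messages simply return `None`, so the function can be run over a whole saved log to list only the completed downloads.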
||
|
|
Grumpy Swede
Master Cruncher Svíþjóð Joined: Apr 10, 2020 Post Count: 2508 Status: Recently Active Project Badges:
|
Previously reported problem with PNG downloads has been successfully resolved.
I can confirm that. However, they should at the same time also have fixed the issue with unnecessary downloads of PNG files for projects that finished a long time ago, such as mip1, hst1, and scc1. Then we wouldn't have to download 21 files (956 934 bytes) for no reason whatsoever.
Here is a selected single line from my "FINISHED DOWNLOAD" Event Log. 10/20/2025 1:52:37 AM | World Community Grid | Finished download of arp1_02_v01.png (86144 bytes) THANKS TO ALL INVOLVED!
[Edit 2 times, last edit by Grumpy Swede at Oct 20, 2025 8:06:13 AM] |
||
|
|
hchc
Veteran Cruncher USA Joined: Aug 15, 2006 Post Count: 865 Status: Offline Project Badges:
|
New big update here: https://www.cs.toronto.edu/~juris/jlab/wcg.html Operational Status tab.

Does running a BOINC project really need to be this complicated? I'm astonished with how much coding effort they are putting into this given their lack of funding and headcount. They don't seem to be in much hurry to get work completed. It makes you wonder how important this all is in the end ...

Still waiting for an explanation as to how a weekend cutover became a couple months redevelopment project.

I think they combined projects into the datacenter migration. So they used it as an opportunity to move away from ancient IBM systems like MessageQueue and DB2 and WebSphere, and onto modern, open systems like Docker, Kubernetes, etc. They're basically going full modern cloud architecture, which should ideally enable them to scale up or scale out as demands grow and spin up new instances or something to handle busy workloads. I'm butchering the explanation. I think they talked about it in older updates.

[indent]September 12, 2025
Configuration of Websphere and IBM MQ is taking longer than expected. We are moving all provisioning, build, and deploy stages for all repos from Ansible and Gitlab CI to Dockerfiles and docker compose files, which is a step that precedes running these containers as StatefulSets on Kubernetes. So far, we have functional containers for IBM MQ, Websphere, DB2, MariaDB, and all BOINC endpoints up and running, and what we are still struggling through is configuration. This approach will benefit site reliability and scalability in an obvious way on Kubernetes, and will improve our development and QA lifecycles drastically. It was also necessary to preserve a maximum compatibility with the CentOS 7 virtual machines that the legacy stack was previously running on, a requirement for the redirected restore of the DB2 data for example, https://www.ibm.com/docs/en/db2/11.5.x?topic=...ming-redirected-operation.

So why are we not up, and when will we be up?
We are debugging the entrypoint scripts for Websphere and IBM MQ containers. Website cannot be brought up until Websphere is up and configured correctly, receiving messages from all MQ sidecars across the stack, sending emails, etc. Each of the databases, the webserver, and the scheduler have to run MQ, and we are still adapting some of the previous mqsc and other runtime configuration for the MQ service to work with this new setup where each important container that requires one gets an MQ sidecar container that uses the Ubuntu 24.04 host VM network.

September 9, 2025
We are finalizing IBM MQ <-> DB2 <-> BOINC db <-> website axis, which will allow us to bring up the website. If all goes to plan now - we should have the website up tonight. Once that is solved - we will go through the BOINC stack to ensure nothing catastrophic will happen once we let traffic through to the scheduler, upload/download servers. Then we can finally start letting the BOINC daemons manipulate state in the BOINC db.

September 8, 2025
Over the weekend we were able to restore the DB2 databases for the website and forums. It was a redirected restore that first required a fully containerized instance of DB2 running the same OS as we were in Graham cloud, and we ran into issues attempting the restore of the final backups. Both databases are now successfully restored, and we have moved on to containerizing Websphere and IBM MQ. We were able to restore the BOINC database. As part of our work on MAM1 we developed an integration testing environment and containerized the BOINC database. We also did this for the BOINC server components (scheduler, upload and download servers with file_upload_handler, transitioner/validators/assimilators etc.). Once we get IBM MQ and Websphere up, we will be able to bring the entire system online shortly afterwards. [/indent]

I'm more excited about the new architecture updates than the datacenter migration, honestly.
Edit: They may still be on the legacy IBM and CentOS systems, but at least they're containerized with Dockerfiles and defined in Docker Compose.
[Edit 2 times, last edit by hchc at Oct 20, 2025 8:40:07 PM] |
||
|
|
Hans Sveen
Veteran Cruncher Norge Joined: Feb 18, 2008 Post Count: 989 Status: Offline Project Badges:
|
After a well-deserved lunch today in Canada, a lot of MCM test units (range 9999900+) have started flowing 😁😁😁
Thanks to the team after "some" struggling!! PS: All WUs so far are running and uploading without any issues 😍 [Edit 2 times, last edit by Hans Sveen at Oct 21, 2025 1:02:36 PM] |
||
|
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1319 Status: Offline Project Badges:
|
Interesting flood of the short-running tasks mentioned in that last Operational Status report...
I received my first handful just before 12:25 UTC and it topped my systems up to their respective profile-specified quotas within 15 minutes! The flow to and from client systems seems to be working o.k. -- there will be lots of connection requests because of how quickly these tasks run, but I've not seen an HTTP error on a download [yet] and uploads seem to be going through with almost no HTTP errors. Returned tasks are showing up as Pending Validation once reported! However, I don't think they've engaged their new-look validator(s) yet, so the list of WUs with 2 results Pending Validation increases in size! I didn't take the reference to "deployed" in that status report to mean "deployed and activated" so now we wait for the next stage... Cheers - Al. [Edit 2 times, last edit by alanb1951 at Oct 21, 2025 1:13:25 PM] |
||
|
|
Grumpy Swede
Master Cruncher Svíþjóð Joined: Apr 10, 2020 Post Count: 2508 Status: Recently Active Project Badges:
|
I'm getting a bunch of the short test tasks now. Seems to be a good flow of them.
|
||
|
|
Hans Sveen
Veteran Cruncher Norge Joined: Feb 18, 2008 Post Count: 989 Status: Offline Project Badges:
|
A new Status update just released:
October 21, 2025 Finally stress testing rather than correctness testing. Sent a batch of 100,000 workunits (fast running, not full size in case something crashed). Thank you for your patience and continued support. Hans S. |
||
|
|
Freewill
Advanced Cruncher United States Joined: Mar 28, 2006 Post Count: 50 Status: Offline Project Badges:
|
Sounds like the stress test worked from the user side. I was able to download and report without noticeable delay.
|
||
|
|
wildhagen
Veteran Cruncher The Netherlands Joined: Jun 5, 2009 Post Count: 1004 Status: Offline Project Badges:
|
Yes, here too. Quite a lot of them, with a runtime between 3 and 4 minutes (normally, 1:35-1:40 per workunit), all are Pending Validation now.
|
||
|
|
WPrion
Cruncher Joined: Apr 20, 2013 Post Count: 25 Status: Offline Project Badges:
|
I'm getting and completing Mapping Cancer Markers! Thanks for your persistence in getting us back to work.
|
||
|
|
|