Thread Status: Active
Total posts in this thread: 583
Posts: 583   Pages: 59   [ Previous Page | 12 13 14 15 16 17 18 19 20 21 | Next Page ]
This topic has been viewed 44141 times and has 582 replies
bfmorse
Senior Cruncher
US
Joined: Jul 26, 2009
Post Count: 444
Status: Offline
Re: Project Status (First Post Updated)

Previously reported problem with PNG downloads has been successfully resolved.
Here is a selected single line from my "FINISHED DOWNLOAD" Event Log.

10/20/2025 1:52:37 AM | World Community Grid | Finished download of arp1_02_v01.png (86144 bytes)

THANKS TO ALL INVOLVED!
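
For anyone who wants to check their own Event Log for the same thing, here is a minimal sketch that picks the file name and size out of a "Finished download" line. It assumes the `timestamp | project | message` layout and the US-style date shown in the line above; the function name and the regular expression are illustrative only and are not part of BOINC.

```python
import re
from datetime import datetime

# Assumed layout, based only on the single line quoted above:
#   <timestamp> | <project> | Finished download of <file> (<size> bytes)
LINE_RE = re.compile(
    r"^(?P<ts>[^|]+?)\s*\|\s*(?P<project>[^|]+?)\s*\|\s*"
    r"Finished download of (?P<name>\S+) \((?P<size>\d+) bytes\)\s*$"
)


def parse_finished_download(line):
    """Return a dict for a 'Finished download' line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        # US-style 12-hour timestamp, as in the sample; other locales may differ.
        "when": datetime.strptime(m.group("ts"), "%m/%d/%Y %I:%M:%S %p"),
        "project": m.group("project"),
        "file": m.group("name"),
        "bytes": int(m.group("size")),
    }


sample = (
    "10/20/2025 1:52:37 AM | World Community Grid | "
    "Finished download of arp1_02_v01.png (86144 bytes)"
)
info = parse_finished_download(sample)
print(info["file"], info["bytes"])
```

Lines that do not match the assumed layout return `None`, so the function can be run over a whole log export and the misses simply skipped.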
[Oct 20, 2025 6:06:27 AM]
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 2508
Status: Recently Active
Re: Project Status (First Post Updated)

Previously reported problem with PNG downloads has been successfully resolved.
Here is a selected single line from my "FINISHED DOWNLOAD" Event Log.

10/20/2025 1:52:37 AM | World Community Grid | Finished download of arp1_02_v01.png (86144 bytes)

THANKS TO ALL INVOLVED!
I can confirm that. However, they should at the same time also have fixed the issue with unnecessary downloads of PNG files for projects that finished a long time ago, such as mip1, hst1, and scc1. Then we wouldn't have to download 21 files (956 934 bytes) for no reason whatsoever.
----------------------------------------
[Edit 2 times, last edit by Grumpy Swede at Oct 20, 2025 8:06:13 AM]
[Oct 20, 2025 8:02:51 AM]
hchc
Veteran Cruncher
USA
Joined: Aug 15, 2006
Post Count: 865
Status: Offline
Re: Project Status (First Post Updated)

New big update here: https://www.cs.toronto.edu/~juris/jlab/wcg.html
Operational Status tab.

Does running a BOINC project really need to be this complicated?

I'm astonished with how much coding effort they are putting into this given their lack of funding and headcount.

They don't seem to be in much hurry to get work completed.
It makes you wonder how important this all is in the end ...

Still waiting for an explanation as to how a weekend cutover became a couple months redevelopment project.

I think they combined projects into the datacenter migration. So they used it as an opportunity to move away from ancient IBM systems like MessageQueue and DB2 and WebSphere, and onto modern, open systems like Docker, Kubernetes, etc. They're basically going full modern cloud architecture which should ideally enable them to scale up or scale out as demands grow and spin up new instances or something to handle busy workloads.

I'm butchering the explanation. I think they talked about it in older updates.



[indent]September 12, 2025

Configuration of Websphere and IBM MQ is taking longer than expected. We are moving all provisioning, build, and deploy stages for all repos from Ansible and Gitlab CI to Dockerfiles and docker compose files, which is a step that precedes running these containers as StatefulSets on Kubernetes. So far, we have functional containers for IBM MQ, Websphere, DB2, MariaDB, and all BOINC endpoints up and running, and what we are still struggling through is configuration.
This approach will benefit site reliability and scalability in an obvious way on Kubernetes, and will improve our development and QA lifecycles drastically. It was also necessary to preserve a maximum compatibility with the CentOS 7 virtual machines that the legacy stack was previously running on, a requirement for the redirected restore of the DB2 data for example, https://www.ibm.com/docs/en/db2/11.5.x?topic=...ming-redirected-operation.
So why are we not up, and when will we be up? We are debugging the entrypoint scripts for Websphere and IBM MQ containers. Website cannot be brought up until Websphere is up and configured correctly, receiving messages from all MQ sidecars across the stack, sending emails, etc. Each of the databases, the webserver, and the scheduler have to run MQ, and we are still adapting some of the previous mqsc and other runtime configuration for the MQ service to work with this new setup where each important container that requires one gets an MQ sidecar container that uses the Ubuntu 24.04 host VM network.

September 9, 2025

We are finalizing IBM MQ <-> DB2 <-> BOINC db <-> website axis, which will allow us to bring up the website. If all goes to plan now - we should have the website up tonight.
Once that is solved - we will go through the BOINC stack to ensure nothing catastrophic will happen once we let traffic through to the scheduler and the upload/download servers. Then we can finally start letting the BOINC daemons manipulate state in the BOINC db.

September 8, 2025

Over the weekend we were able to restore the DB2 databases for the website and forums.
It was a redirected restore that first required a fully containerized instance of DB2 running the same OS as we were in Graham cloud, and we ran into issues attempting the restore of the final backups. Both databases are now successfully restored, and we have moved on to containerizing Websphere and IBM MQ.
We were able to restore the BOINC database.
As part of our work on MAM1 we developed an integration testing environment and containerized the BOINC database.
We also did this for the BOINC server components (scheduler, upload and download servers with file_upload_handler, transitioner/validators/assimilators etc.).
Once we get IBM MQ and Websphere up, we will be able to bring the entire system online shortly afterwards.
[/indent]

I'm more excited about the new architecture updates than the datacenter migration, honestly.

Edit: They may still be on the legacy IBM and CentOS systems, but at least they're containerized with Dockerfiles and defined in Docker Compose.
----------------------------------------
  • i5-7500 (Kaby Lake, 4C/4T) @ 3.4 GHz
  • i5-4590 (Haswell, 4C/4T) @ 3.3 GHz
  • i5-3570 (Ivy Bridge, 4C/4T) @ 3.4 GHz

----------------------------------------
[Edit 2 times, last edit by hchc at Oct 20, 2025 8:40:07 PM]
[Oct 20, 2025 8:34:30 PM]
Hans Sveen
Veteran Cruncher
Norge
Joined: Feb 18, 2008
Post Count: 989
Status: Offline
Re: Project Status (First Post Updated)

After a well deserved lunch today in Canada, a lot of MCM test units (range 9999900+) have started flowing 😁😁😁

Thanks to the team after "some" struggling!!


PS

All WUs so far are running and uploading without any issues 😍
----------------------------------------
[Edit 2 times, last edit by Hans Sveen at Oct 21, 2025 1:02:36 PM]
[Oct 21, 2025 12:41:21 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 1319
Status: Offline
Re: Project Status (First Post Updated)

Interesting flood of the short-running tasks mentioned in that last Operational Status report...

I received my first handful just before 12:25 UTC and it topped my systems up to their respective profile-specified quotas within 15 minutes!

The flow to and from client systems seems to be working o.k. -- there will be lots of connection requests because of how quickly these tasks run, but I've not seen an HTTP error on a download [yet] and uploads seem to be going through with almost no HTTP errors.

Returned tasks are showing up as Pending Validation once reported! However, I don't think they've engaged their new-look validator(s) yet, so the list of WUs with 2 results Pending Validation increases in size!

I didn't take the reference to "deployed" in that status report to mean "deployed and activated" so now we wait for the next stage...

Cheers - Al.
----------------------------------------
[Edit 2 times, last edit by alanb1951 at Oct 21, 2025 1:13:25 PM]
[Oct 21, 2025 1:07:40 PM]
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 2508
Status: Recently Active
Re: Project Status (First Post Updated)

I'm getting a bunch of the short test tasks now. Seems to be a good flow of them.
[Oct 21, 2025 1:13:22 PM]
Hans Sveen
Veteran Cruncher
Norge
Joined: Feb 18, 2008
Post Count: 989
Status: Offline
Re: Project Status (First Post Updated)

A new Status update just released:

October 21, 2025
Finally stress testing rather than correctness testing.
Sent a batch of 100,000 workunits (fast running, not full size, in case something crashed).
Thank you for your patience and continued support.

Hans S.
[Oct 21, 2025 1:38:09 PM]
Freewill
Advanced Cruncher
United States
Joined: Mar 28, 2006
Post Count: 50
Status: Offline
Re: Project Status (First Post Updated)

Sounds like the stress test worked from the user side. I was able to download and report without noticeable delay.
[Oct 21, 2025 1:40:30 PM]
wildhagen
Veteran Cruncher
The Netherlands
Joined: Jun 5, 2009
Post Count: 1004
Status: Offline
Re: Project Status (First Post Updated)

Yes, here too. Quite a lot of them, with a runtime between 3 and 4 minutes (normally, 1:35-1:40 per workunit), all are Pending Validation now.
[Oct 21, 2025 1:41:27 PM]
WPrion
Cruncher
Joined: Apr 20, 2013
Post Count: 25
Status: Offline
Re: Project Status (First Post Updated)

I'm getting and completing Mapping Cancer Markers! Thanks for your persistence in getting us back to work.
----------------------------------------

[Oct 21, 2025 1:45:15 PM]