World Community Grid Forums
Thread Status: Active | Total posts in this thread: 159
MJH333
Senior Cruncher | England | Joined: Apr 3, 2021 | Post Count: 266 | Status: Offline
Savas,
Many thanks for this update and for all the efforts you and the team are putting in to get things working better. P.S. Nice fax machine, by the way!
Greg_BE
Advanced Cruncher | Joined: May 9, 2016 | Post Count: 82 | Status: Offline
Rosetta has gone over to AI now for 99% of the work, so they barely feed the BOINC server.

SIDock is one of the last BOINC-based health projects.

DENIS@Home is a brand-new project. It just had its first run before the summer break and is analyzing the results, so there is nothing new at the moment; it only has 9,000+ users so far.

Outside BOINC there is a project called Folding@home, or FAH for short. Among other things, they take a lot of the research from the Baker lab (Rosetta) and develop it further. That project is both CPU and GPU based.
TLD
Veteran Cruncher | USA | Joined: Jul 22, 2005 | Post Count: 804 | Status: Offline
Thanks for the update.
TonyEllis
Senior Cruncher | Australia | Joined: Jul 9, 2008 | Post Count: 261 | Status: Offline
Thanks Savas for the update and for the considerable effort being made to improve the flow of ARP tasks. Let's hope they meet with some success.
One thing, however, was a very big surprise: namely the continued use of CentOS 7, which went out of support on June 30, 2024. Why would a data centre be running an unsupported OS? Surely a migration of both the OS and all applications running under it should have taken place many months ago, well before end of life.
Run Time Stats https://grassmere-productions.no-ip.biz/
TPCBF
Master Cruncher | USA | Joined: Jan 2, 2011 | Post Count: 1950 | Status: Offline
Thanks, savas, that is a useful update from the technical side, though I expect that some people won't understand some of the issues you mentioned with the load-balancing proxy and the connection timeouts. This was, in general, already noted when ARP1 and OPNG collided a bit over two years ago.
Things got a bit better on Monday, after being really abysmal as the weekend progressed. I noticed the changed timeout settings, which also made an old problem reoccur, in which files would keep uploading well beyond the actual file limit. That was an issue that had already happened at least once before Krembil took over...

As for the capping of ARP1 transfers, are you capping on actual bandwidth/transfer rate (bits/sec), or do you cap on the number of concurrent transfer connections? Going by the info we got the last time around, it seems the latter would be the more effective option for the system conditions overall...

Thanks again, Ralf
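The distinction between the two capping strategies can be shown with a toy sketch. Nothing below reflects WCG's actual proxy configuration; the semaphore limit and timings are invented for illustration. The point is that a connection-count cap bounds how many transfer sockets the server must hold open at once, which a pure rate cap does not:

```python
import asyncio

MAX_CONCURRENT = 2  # invented cap on simultaneous transfers

async def transfer(sem, name, log):
    # Each transfer must first acquire a slot; the rest wait in a queue
    # instead of all hitting the file server at the same time.
    async with sem:
        log.append(("start", name))
        await asyncio.sleep(0.01)  # simulate moving the file
        log.append(("done", name))

async def run_connection_cap(n_transfers):
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    log = []
    await asyncio.gather(*(transfer(sem, f"wu{i}", log) for i in range(n_transfers)))
    return log

def max_in_flight(log):
    # Replay the event log and track how many transfers ran at once.
    running = peak = 0
    for event, _ in log:
        running += 1 if event == "start" else -1
        peak = max(peak, running)
    return peak

log = asyncio.run(run_connection_cap(5))
```

A bandwidth cap (e.g. a token bucket) would instead admit all five transfers at once and meter each one's throughput, which bounds bits/sec but still leaves the server juggling every open connection; for a server drowning in open transfer sockets, the connection cap is the more direct knob.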
Dayle Diamond
Senior Cruncher | Joined: Jan 31, 2013 | Post Count: 452 | Status: Offline
Checked the website. That's the level of communication I'd been hoping to see for some time. It's validating to see acknowledgement of what we had been saying for years: the problem wasn't bandwidth, it was bugs!
I'm getting resends in the ~130-140 iteration range; the project ends at 180 iterations. Some of the donors with faster computers are able to return three a day per thread. Since we're planning to titrate the supply of work units, what would you think of accelerating both tails of the bell curve? If something's lagging far behind, send it out first; if something's ahead, send it out second; leave the majority of workunits at the lowest priority. Obviously a work unit lagging behind delays the whole project, but wrapping one up early can decrease the maximum theoretical server load while WCG works on accommodating the rest.
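The scheduling idea above amounts to a simple priority sort. This sketch is purely hypothetical: the function name, the ±20-iteration thresholds, and the workunit ids are all invented, and this is not WCG's actual scheduler — it just shows stragglers dispatched first, near-finished units second, and the bulk of the bell curve last:

```python
# Each workunit's current iteration count; the project finishes a
# workunit at 180 iterations (per the post above).
def dispatch_order(iterations):
    """Sort workunit ids: farthest-behind first, nearly-done next,
    the middle of the pack last."""
    mean = sum(iterations.values()) / len(iterations)

    def priority(wu):
        it = iterations[wu]
        if it < mean - 20:        # lagging: delays the whole project
            return (0, it)        # most-behind first
        if it > mean + 20:        # ahead: wrap up early, shrink peak load
            return (1, -it)       # closest-to-done first
        return (2, it)            # bulk of the bell curve: lowest priority

    return sorted(iterations, key=priority)

# Invented example: mean is 136.6, so "b" lags, "c" is ahead.
queue = dispatch_order({"a": 135, "b": 100, "c": 170, "d": 138, "e": 140})
```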
Rouxenator
Cruncher | South Africa | Joined: Nov 12, 2007 | Post Count: 5 | Status: Offline
Looks like my instances are doing nothing.
https://www.boincstats.com/stats/15/host/list/0/0/424620/1

Sometimes I log on to them and play a game of Retry Transfer, where you just keep clicking Retry until all transfers are done. But it's not a very exciting game.
TPCBF
Master Cruncher | USA | Joined: Jan 2, 2011 | Post Count: 1950 | Status: Offline
It seems that at least some of the work savas mentioned has started to pay off: downloads of new WUs have been going blindingly fast (in WCG terms) since at least yesterday evening (Pacific Time), with even whole ARP1 resend WUs coming in in less than 5 minutes.

Uploads, however, still keep getting stuck, though it takes only a few retries to finally get them out of the door. Let's hope that this continues to be the case...

Ralf
Greg_BE
Advanced Cruncher | Joined: May 9, 2016 | Post Count: 82 | Status: Offline
Nothing has changed on my system.
Aborted the ARP uploads after 3 days of trying to get them sent, and unchecked that project. 5 files for 1 task? No wonder the server can't handle the load. That's just nuts. Zip them or combine them, and break them apart again on the local system there at Krembil.

MCM - two stuck with transient errors and a 5-hour wait time, so they are pretty fresh. They uploaded a supposed 100% and then went into error, so I guess they are not confirmed as uploaded.

This kind of stupid stuff makes me seriously consider quitting WCG for a time. MCM is personal to me: I am a rare-tumor (benign, thankfully) survivor. The existing stains could not identify the type, so now once a year I get a blood draw and a CT scan, and part of the lab work is exactly this project's area of expertise - they do a cancer-markers test. So if anything, MCM should be a priority, in my opinion, to get the correct fix (hardware, software, both, whatever).
TPCBF
Master Cruncher | USA | Joined: Jan 2, 2011 | Post Count: 1950 | Status: Offline
> Nothing has changed on my system.

Maybe it's you?

> Aborted ARP uploads after 3 days of trying to get them sent. Unchecked that project. 5 files for 1 task? No wonder the server can't handle the load. That's just nuts. Zip them or combine them and break them down on the local system there at Krembil.
>
> MCM - two stuck with transient errors. 5 hour wait time. So they are pretty fresh. They uploaded a supposed 100% and then went into error, so I guess they are not confirmed as uploaded.

In all seriousness, the system behaves just as I described for the roughly 20 hosts I have direct control of (my own and those of the office I am working at). Uploads hang occasionally (1 out of 2 or 3), but a few retries gets them going: usually only once or twice today for MCM1 WUs, maybe a couple of times more for any ARP1 WU. Not a single download of either ARP1 or MCM1 has been stuck for me today, on any of the aforementioned hosts.

So far it looks as if I am back to processing 800-900 MCM1 WUs in a calendar day, a bit less than usual as there are still about a dozen ARP1 WUs (resends, for all I can see) crunching, probably until some time tonight...

Ralf