World Community Grid Forums
Thread Status: Active | Total posts in this thread: 822
uplinger
Former World Community Grid Tech | Joined: May 23, 2005 | Post Count: 3952 | Status: Offline
2021-04-14 00:14:04.1624 [WU#618613464 OPNG_0002322_00114] handle_wu(): No canonical result yet
2021-04-14 00:14:04.2062 [CRITICAL] [RESULT#.....] Runs Invalid: All energy valuations were positive.
2021-04-14 00:14:04.2065 [CRITICAL] [RESULT#1627981583 OPNG_0002322_00114_2] checkGPUXml returned false which means it failed.

This indicates that the answers became unsuitable and invalid from a science perspective. This is actually a really good question/problem: it is a case we did not encounter during beta testing, and it will probably lead to a change in the validation and in how these results are handled. I have the result files saved from the 3 that were returned and will examine them in greater detail tomorrow (first task of the day). Thank you for bringing this to my attention!
-Uplinger
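To make the [CRITICAL] message above concrete: docking energy scores are normally negative for a favourable binding pose, so a result in which every reported energy is positive is scientifically unusable. Below is a minimal Python sketch of that kind of sanity check; the function name and the energy lists are illustrative assumptions, not the actual checkGPUXml implementation, which parses the returned result XML.

# Illustrative only: reject a result whose reported docking energies are all
# positive, since such a result carries no usable science.

def energies_look_valid(energies):
    """Return True if at least one docking energy is negative (i.e. usable)."""
    if not energies:               # no poses returned at all -> invalid
        return False
    return any(e < 0.0 for e in energies)

# Example: the failure mode described in the log above
print(energies_look_valid([12.4, 3.1, 0.7]))   # False -> "Runs Invalid"
print(energies_look_valid([-7.2, -5.9, 1.3]))  # True  -> passes this check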
Grumpy Swede
Master Cruncher | Svíþjóð | Joined: Apr 10, 2020 | Post Count: 2567 | Status: Offline
Thanks Uplinger. As you can see from the database, I'm sure, there are plenty of those all-invalid WUs and/or WUs server-aborted due to too many invalids.
Good luck tomorrow in finding the reason, and why it happened to so many of the WUs released during the "mistake". Goodnight (good morning for me).
[Edit 1 times, last edit by Grumpy Swede at Apr 14, 2021 4:23:15 AM]
uplinger
Former World Community Grid Tech | Joined: May 23, 2005 | Post Count: 3952 | Status: Offline
It appears that there could have been a grouping of targets/ligands (jobs) that were not viable and thus not good drug candidates. Such a result fails validation, but could still be a valid scientific result. I say this seems like a grouping because I can see them in the database; previously it was 1 in a batch at random, which could statistically just be bad luck, but this seems like a different problem and will require more review. You can see the groupings by batch below. E and R are basically the same; R stands for rerun, which means the workunit was marked as an error on at least one attempt by a group of members.
cnt  batch         status
  1  OPNG_0000007  E
  1  OPNG_0000021  E
  1  OPNG_0000028  E
  1  OPNG_0000097  E
  1  OPNG_0000129  E
  1  OPNG_0000222  E
  1  OPNG_0000279  R
  1  OPNG_0000412  R
  1  OPNG_0000538  R
  1  OPNG_0000556  R
  1  OPNG_0000610  E
  1  OPNG_0000610  R
  1  OPNG_0000643  R
  2  OPNG_0000740  R
  1  OPNG_0000747  R
  1  OPNG_0000765  R
  1  OPNG_0000973  R
  4  OPNG_0001054  R
  2  OPNG_0001069  R
  1  OPNG_0001074  R
  1  OPNG_0001252  E
  1  OPNG_0001299  R
  1  OPNG_0001347  E
  1  OPNG_0001350  R
  1  OPNG_0001416  R
  1  OPNG_0001458  R
  1  OPNG_0001459  E
  1  OPNG_0001459  R
  1  OPNG_0001468  R
  1  OPNG_0001487  E
  1  OPNG_0001493  R
  1  OPNG_0001706  E
  2  OPNG_0001836  E
  1  OPNG_0002064  E
  4  OPNG_0002227  E
  5  OPNG_0002248  E
  9  OPNG_0002264  E
  1  OPNG_0002279  E
  1  OPNG_0002302  E
138  OPNG_0002322  E
  8  OPNG_0002326  E
119  OPNG_0002331  E
 22  OPNG_0002341  E
 26  OPNG_0002347  E
 44  OPNG_0002348  E
  1  OPNG_0002349  E
 43  OPNG_0002370  E
 19  OPNG_0002374  E
 50  OPNG_0002388  E
 88  OPNG_0002410  E
  6  OPNG_0002414  E
 55  OPNG_0002424  E
  6  OPNG_0002430  E
 10  OPNG_0002437  E
 10  OPNG_0002445  E
  8  OPNG_0002449  E
  9  OPNG_0002461  E
 14  OPNG_0002468  E
  2  OPNG_0002473  E
 12  OPNG_0002477  E
  8  OPNG_0002481  E
  5  OPNG_0002506  E
  5  OPNG_0002507  E
  2  OPNG_0002533  E
Thanks,
-Uplinger
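For anyone curious how a tally like the one above is typically produced, here is a small, purely illustrative Python sketch that groups a list of (batch, status) rows and prints per-batch counts. The row data is made up for the example; the real numbers come from the project's result database.

# Illustrative only: tally result rows by (batch, status), where status is
# 'E' (errored out) or 'R' (rerun after at least one errored attempt).
from collections import Counter

rows = [
    ("OPNG_0002322", "E"), ("OPNG_0002322", "E"),
    ("OPNG_0000610", "E"), ("OPNG_0000610", "R"),
    ("OPNG_0001054", "R"),
]

counts = Counter(rows)  # keyed by (batch, status)
for (batch, status), cnt in sorted(counts.items()):
    print(f"{cnt:>3}  {batch}  {status}")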
Grumpy Swede
Master Cruncher | Svíþjóð | Joined: Apr 10, 2020 | Post Count: 2567 | Status: Offline
Well, it seems as if you know what you're going to do tomorrow (as if you didn't have enough to do already).
So, good luck, and hopefully it's an easy fix for that problem.
maeax
Advanced Cruncher | Joined: May 2, 2007 | Post Count: 144 | Status: Offline
This task https://www.worldcommunitygrid.org/ms/device/...s.do?workunitId=619146340 was finished without a wingman.
----------------------------------------
AMD Ryzen Threadripper PRO 3995WX 64-Cores / AMD Radeon (TM) Pro W6600. OS Win11pro
biini
Senior Cruncher | Finland | Joined: Jan 25, 2007 | Post Count: 334 | Status: Offline
Hi,
I have 4 GPUs, but only one of them (an RTX 2070) has been receiving work units. The older Intel and NVIDIA GPUs have not received a single GPU WU in a week. Is this expected? The RTX is receiving ~30-40 a day, and all the machines have the same settings. The older GPUs are, for example, an Intel HD Graphics 520 (laptop) and an NVIDIA Quadro 2000M (laptop). Are these simply too old for the task?
sam6861
Advanced Cruncher | Joined: Mar 31, 2020 | Post Count: 107 | Status: Offline
My Ryzen 2700X computer contains both an NVIDIA GT 1030 and an AMD RX 5500 XT, and I have found that roughly 90% of OPNG tasks are sent to the NVIDIA GPU, leaving my AMD GPU mostly idle.
Probably this happens:
1. The client requests work for multiple GPUs in a single request.
2. The server checks for NVIDIA GPU work first and sends a few tasks.
3. The server checks for AMD/Intel GPU work next and has none left; the server ran out of available GPU tasks.
4. The request completes. The client only got a few NVIDIA GPU tasks and, most of the time, no AMD GPU work.
There are a few invalid/bad work units; I have one bad one:
https://www.worldcommunitygrid.org/ms/device/...s.do?workunitId=619082847
This work unit, OPNG_0002254_00139, went invalid, invalid, invalid, server aborted, and my computer's copy was also server aborted.
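A toy Python model of the behaviour guessed at in the list above may make the effect clearer: if one shared pool of GPU tasks is handed out vendor by vendor within a single request, the vendor checked first can drain the pool before the second vendor is considered. The function, parameters, and quota below are assumptions for illustration only, not the actual WCG/BOINC scheduler logic.

# Toy model: a shared pool of GPU tasks filled vendor-by-vendor in one request.
# The vendor checked first can exhaust the pool before the next vendor's turn.

def fill_request(available_tasks, gpus_in_request_order, per_gpu_quota=4):
    assigned = {gpu: 0 for gpu in gpus_in_request_order}
    for gpu in gpus_in_request_order:          # e.g. NVIDIA checked before AMD
        while assigned[gpu] < per_gpu_quota and available_tasks > 0:
            assigned[gpu] += 1
            available_tasks -= 1
    return assigned

# With only 5 tasks left on the server, the first vendor gets almost everything:
print(fill_request(available_tasks=5, gpus_in_request_order=["nvidia", "amd"]))
# -> {'nvidia': 4, 'amd': 1}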
Dayle Diamond
Senior Cruncher | Joined: Jan 31, 2013 | Post Count: 452 | Status: Offline
Congratulations on this natural experiment, and thank you for upping the ongoing GPU work to 96,000 units a day.
I completely understand that WCG has to be sensitive to the needs of all projects, and how, if things go awry, one project can inadvertently harm the computation of the others. The addition of COVID-19 CPU units in particular has caused catastrophic slowdowns. On November 22nd, 2019, MIP processed over 228 years of work units; yesterday, it processed 22 years. That same day, MCM processed 341 years of work; yesterday, it was 265. I remember advocating for a change in work unit share at the time, and I am thankful that the other projects made room to accommodate high-priority COVID calculations. Now that we're approaching the point where we can accommodate all projects at once, I look forward to the day when the vast majority of these disruptive CPU tasks are converted to run in their own slot, so the other project timelines can return to their pre-pandemic estimates. Hopefully the back end handles this inadvertent stress test without too many issues, and going forward the project scientists will weigh any temporary disruptions to their science against the promise that their ongoing throttling does not remain the "new normal".
maeax
Advanced Cruncher | Joined: May 2, 2007 | Post Count: 144 | Status: Offline
Have OPN+OPG and MCM running on 7 PCs; 2/3 are MCM. I see no problem in performance so far.
----------------------------------------
AMD Ryzen Threadripper PRO 3995WX 64-Cores / AMD Radeon (TM) Pro W6600. OS Win11pro
erich56
Senior Cruncher | Austria | Joined: Feb 24, 2007 | Post Count: 300 | Status: Offline
From what I gather reading here, quite a number of people have received GPU tasks these days.
The latest ones I received came last week: just a few over several days, and that was it. Have there been any changes in the background since then that would require me to amend my device profile settings? What is set now is a check at "Open Pandemics - COVID19" and "YES" for graphics card use.