World Community Grid - View Thread - New Beta Test for PC v7.10

World Community Grid Forums

Category: Beta Testing

Forum: Beta Test Support Forum

Thread: New Beta Test for PC v7.10 - August 25, 2015 [ Issues Thread ]

Thread Status: Active
Total posts in this thread: 179

This topic has been viewed 874239 times and has 178 replies

Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1414
Status: Offline

Re: New Beta Test for PC v7.10 - August 25, 2015 [ Issues Thread ]

It's quite possible that the elapsed time is dubious [did it go back by the same amount on restart as the CPU time?]. There are some fixes to the time-keeping in the very latest clients.

I used a BOINC patch version 7.7.0 and got CPU times higher than the elapsed times.
I had over 100% efficiency on all tasks. The result page shows the same value for elapsed and CPU time.
BoincTasks showed the following for the last four tasks:

Elapsed- / CPU-time
19:48:45 (20:07:39) Result page stored 20.13 / 20.13
19:50:19 (20:21:43) Result page stored 20.36 / 20.36
21:38:11 (22:09:27) Result page stored 22.16 / 22.16
19:38:00 (20:10:05) Result page stored 20.17 / 20.17
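
The figures above can be checked with a short sketch. It assumes each row reads as elapsed time, then CPU time in parentheses, and that the result page stores hours as a decimal; the helper names `to_seconds` and `efficiency` are illustrative and not part of BOINC or BoincTasks.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'hh:mm:ss' string to a number of seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def efficiency(elapsed: str, cpu: str) -> float:
    """CPU time divided by elapsed time (1.0 means 100%)."""
    return to_seconds(cpu) / to_seconds(elapsed)


# (elapsed, cpu) pairs copied from the BoincTasks listing in the post
tasks = [
    ("19:48:45", "20:07:39"),
    ("19:50:19", "20:21:43"),
    ("21:38:11", "22:09:27"),
    ("19:38:00", "20:10:05"),
]

for elapsed, cpu in tasks:
    cpu_hours = to_seconds(cpu) / 3600
    print(f"{elapsed} ({cpu}): CPU {cpu_hours:.4f} h, "
          f"efficiency {efficiency(elapsed, cpu):.1%}")
```

Under these assumptions every task comes out above 100% efficiency, and the CPU time expressed in decimal hours lands on the value the result page stored for both columns, which suggests the page recorded the CPU time in both fields.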

I'll install the recommended version 7.6.9 to see how the times look with that version.
Maybe it contains a fix that wasn't in the standalone patch.

[Sep 2, 2015 4:46:27 PM]

uplinger
Former World Community Grid Tech
Joined: May 23, 2005
Post Count: 3952
Status: Offline

I've computed 7 Beta WUs so far.
I've noticed the following on WCG-only hosts with a CPU efficiency over 99% and no restart:

i7 4770K, Windows 7 Pro SP1 x64
- BETA_avx101118-040_r8_1_wcgfahb00200000, 10.45 hours, 307.6 granted credit (no wingman)
- BETA_avx101118-034_r16_1_wcgfahb100000, 10.61 hours, 398.5 granted credit (wingman 14.66 hours)
Phenom II x6, Ubuntu 14.04 x64, 1090T
- BETA_avx101118-044_r17_1_wcgfahb00500000, 26.21 hours, 443.3 granted credit (no wingman)
- BETA_avx101118-053_r3_1_wcgfahb00300000, 29.02 hours, 443.3 granted credit (no wingman)
- BETA_avx101118-049_r19_1_wcgfahb00500000, 26.29 hours, 443.3 granted credit (no wingman)
- BETA_avx101118-059_r7_1_wcgfahb00300000, 25.64 hours, 443.3 granted credit (no wingman)
Phenom II x6, Ubuntu 14.04 x64, 1055T
- BETA_avx101118-028_r5_1_wcgfahb00100000, 29.41 hours, 443.3 granted credit (no wingman)

I have several remarks regarding the crazy credit/hour ratio as well as the duration.

The Ubuntu application needs to be strongly optimized.
The credit calculation for long WUs must be modified (whatever the duration is, 443.3 seems to be the only possible granted credit).
Phenom II CPUs are not efficient for this project, even though I do not currently notice any problem with either Phenom II host on other work (about 50 granted credits/hour for OET1).
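
To put numbers on the credit/hour remark, here is a minimal sketch that divides granted credit by run time for the seven work units listed in the post. The host labels and the `credit_per_hour` helper are illustrative only.

```python
def credit_per_hour(hours: float, credit: float) -> float:
    """Granted credit divided by run time in hours."""
    return credit / hours


# (host, hours, granted credit) copied from the list in the post
results = [
    ("i7 4770K", 10.45, 307.6),
    ("i7 4770K", 10.61, 398.5),
    ("Phenom II 1090T", 26.21, 443.3),
    ("Phenom II 1090T", 29.02, 443.3),
    ("Phenom II 1090T", 26.29, 443.3),
    ("Phenom II 1090T", 25.64, 443.3),
    ("Phenom II 1055T", 29.41, 443.3),
]

for host, hours, credit in results:
    rate = credit_per_hour(hours, credit)
    print(f"{host:16s} {hours:6.2f} h  {credit:6.1f} credit  {rate:5.1f} credit/h")
```

With the figures as posted, the Phenom II results all stay below 18 credits/hour, well under the roughly 50 credits/hour quoted for OET1 on the same hosts, while the i7 results come out at roughly 29 and 38 credits/hour.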

Cheers,
Yves

I will be reviewing the points granted for this; it is on my list of things to investigate and improve.

Thanks,
-Uplinger

[Sep 3, 2015 2:24:05 AM]

uplinger
Former World Community Grid Tech
Joined: May 23, 2005
Post Count: 3952
Status: Offline

Thanks for the clarification.
I didn't catch many beta WUs in the past and wasn't aware that this is a known problem in this beta test.

As soon as I'm home I can look for a log and post the content. But the trickle messages shouldn't be the problem as I could observe a growth of elapsed time for the WU. In my understanding this can only mean that the trickle messages were received and validated by the server.

Rarusu,

What Sek has posted is correct. I'm currently working through two bugs, one in the validator and one in the transitioner, both of which are backend systems. What you are seeing is that your machine was not given a "hard stop" message before the deadline. Had it received one, you would have been granted the credit for work done so far, and the next-generation work unit would have been created from the point you had reached. I would suspect that if your machine worked 24/7 on it, you got a pretty good chunk completed. I am hopeful I'll have that part fixed first. Then I will move on to the transitioner bug, which is less critical for lost work.

Thanks,
-Uplinger

I have recently put into place the fix for the hard stop/soft stop script that runs on the backend. Members may start seeing more of these messages as they get closer to deadlines.

Thanks,
-Uplinger

[Sep 3, 2015 2:25:55 AM]

Rarusu
Advanced Cruncher
Germany
Joined: Feb 7, 2006
Post Count: 64
Status: Offline

I have recently put into place the fix for the hard stop/soft stop script that runs on the backend. Members may start seeing more of these messages as they get closer to deadlines.

Thanks,
-Uplinger

Thanks for the update, uplinger.

I will keep an eye on this as soon as I receive a new beta WU.

Cheers
Rarusu


[Sep 3, 2015 5:49:59 AM]

KerSamson
Master Cruncher
Switzerland
Joined: Jan 29, 2007
Post Count: 1684
Status: Offline

@Uplinger
In advance, I thank you for your investigation.
Cheers,
Yves

[Sep 3, 2015 8:05:39 AM]

pvh513
Senior Cruncher
Joined: Feb 26, 2011
Post Count: 260
Status: Offline

I recently received a beta WU and decided to test it by suspending it with LAIM disabled. Before the suspend, checkpoints were done every ~35 minutes:

[18:08:19] INFO: Checkpointed. Progress 1000 of 100000 steps complete CPU time 2091.835000
[18:43:57] INFO: Checkpointed. Progress 2000 of 100000 steps complete CPU time 4143.176000
[19:19:17] INFO: Checkpointed. Progress 3000 of 100000 steps complete CPU time 6164.136000
[19:54:12] INFO: Checkpointed. Progress 4000 of 100000 steps complete CPU time 8138.962000
[20:29:12] INFO: Checkpointed. Progress 5000 of 100000 steps complete CPU time 10115.860000
[21:04:21] INFO: Checkpointed. Progress 6000 of 100000 steps complete CPU time 12121.109000
[21:39:04] INFO: Checkpointed. Progress 7000 of 100000 steps complete CPU time 14100.897000

After the resume, the interval increased to every ~67 minutes:

[22:48:01] INFO: Checkpointed. Progress 8000 of 100000 steps complete CPU time 18051.571000
[23:55:19] INFO: Checkpointed. Progress 9000 of 100000 steps complete CPU time 21944.705000
[01:02:41] INFO: Sending trickle message to server.
[01:02:41] INFO: Starting intermediate upload, index = 1
[01:02:41] INFO: Checkpointed. Progress 10000 of 100000 steps complete CPU time 25768.012000
[02:10:55] INFO: Checkpointed. Progress 11000 of 100000 steps complete CPU time 29670.910000
[03:16:55] INFO: Checkpointed. Progress 12000 of 100000 steps complete CPU time 33558.779000
[04:22:10] INFO: Checkpointed. Progress 13000 of 100000 steps complete CPU time 37392.173000
... etc ...

So it appears that the suspend/resume cycle pretty much doubled the CPU time per checkpoint step! This client runs under openSUSE 13.2 on an Opteron 6168. WU name: BETA_avx101118-096_r11_1_wcgfahb00300000_0. As a result it will almost certainly not make the deadline, but I will let it continue running.
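
The doubling can be measured directly from the log rather than estimated from the timestamps. The sketch below parses a few of the checkpoint lines quoted in the post and reports the CPU seconds spent per 1000-step checkpoint; the regular expression and the `cpu_per_checkpoint` helper are written for this log format only and are not part of the application or of BOINC.

```python
import re

# Excerpt of the log lines quoted in the post (suspend/resume happened
# between the 7000-step and 8000-step checkpoints).
LOG = """\
[20:29:12] INFO: Checkpointed. Progress 5000 of 100000 steps complete CPU time 10115.860000
[21:04:21] INFO: Checkpointed. Progress 6000 of 100000 steps complete CPU time 12121.109000
[21:39:04] INFO: Checkpointed. Progress 7000 of 100000 steps complete CPU time 14100.897000
[22:48:01] INFO: Checkpointed. Progress 8000 of 100000 steps complete CPU time 18051.571000
[23:55:19] INFO: Checkpointed. Progress 9000 of 100000 steps complete CPU time 21944.705000
[01:02:41] INFO: Sending trickle message to server.
[01:02:41] INFO: Checkpointed. Progress 10000 of 100000 steps complete CPU time 25768.012000
"""

CHECKPOINT = re.compile(
    r"Progress (\d+) of (\d+) steps complete CPU time ([\d.]+)"
)


def cpu_per_checkpoint(log: str) -> list[tuple[int, float]]:
    """Return (step, CPU seconds since the previous checkpoint) pairs."""
    points = [(int(m.group(1)), float(m.group(3)))
              for m in CHECKPOINT.finditer(log)]
    return [(step, cpu - prev_cpu)
            for (_, prev_cpu), (step, cpu) in zip(points, points[1:])]


for step, delta in cpu_per_checkpoint(LOG):
    print(f"step {step:6d}: {delta:8.1f} CPU s since previous checkpoint")
```

On this excerpt the cost per checkpoint goes from about 2000 CPU seconds before the suspend to about 3800 to 3950 CPU seconds after the resume, which matches the "pretty much doubled" observation.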

[Sep 3, 2015 10:55:24 AM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: New Beta Test for PC v7.10 - August 25, 2015 [ Issues Thread ]

Same behaviour under Windows 7 SP1 running on an i7 2600K.
Checkpoints were done every ~700 seconds before the restart and every ~1400 seconds after it:

[19:43:59] INFO: Checkpointed. Progress 10000 of 100000 steps complete CPU time 6999.936471
[19:55:36] INFO: Checkpointed. Progress 11000 of 100000 steps complete CPU time 7685.607666
[20:07:49] INFO: Checkpointed. Progress 12000 of 100000 steps complete CPU time 8397.019027
[20:19:18] INFO: Checkpointed. Progress 13000 of 100000 steps complete CPU time 9084.031831
[20:30:45] INFO: Checkpointed. Progress 14000 of 100000 steps complete CPU time 9761.076171
[20:42:06] INFO: Checkpointed. Progress 15000 of 100000 steps complete CPU time 10435.000491
[20:53:42] INFO: Checkpointed. Progress 16000 of 100000 steps complete CPU time 11128.596537
[21:05:42] INFO: Checkpointed. Progress 17000 of 100000 steps complete CPU time 11844.329125
[23:03:58] INFO:Turning trickle messaging on.
[23:03:58] INFO:Turning intermediate uploads on.
 %IMPACT-I:  Softcore binding energy with umax =     1000.00000
 %IMPACT-I: Using AGBNP2: Analytical Generalized Born Model + Analytic 
 Non-Polar Hydration Model
 %IMPACT-I:  Hybrid potential for binding with lambda =        0.00480
agbnpf_assign_parameters(): info: attempting to load from SQL tables.
[23:29:10] INFO: Checkpointed. Progress 18000 of 100000 steps complete CPU time 13311.600202
[23:52:50] INFO: Checkpointed. Progress 19000 of 100000 steps complete CPU time 14719.197225
[00:16:10] INFO: Sending trickle message to server.
[00:16:10] INFO: Starting intermediate upload, index = 2
[00:16:10] INFO: Checkpointed. Progress 20000 of 100000 steps complete CPU time 16099.150871
[00:39:33] INFO: Checkpointed. Progress 21000 of 100000 steps complete CPU time 17482.411738
[01:03:05] INFO: Checkpointed. Progress 22000 of 100000 steps complete CPU time 18877.949883
[01:26:27] INFO: Checkpointed. Progress 23000 of 100000 steps complete CPU time 20272.146420
[01:50:06] INFO: Checkpointed. Progress 24000 of 100000 steps complete CPU time 21679.571842
[02:13:59] INFO: Checkpointed. Progress 25000 of 100000 steps complete CPU time 23102.300962

WU name: BETA_avx101118-060_r4_1_wcgfahb00300000_0

[Sep 3, 2015 7:46:27 PM]

uplinger
Former World Community Grid Tech
Joined: May 23, 2005
Post Count: 3952
Status: Offline

The researchers have identified the problem with the CPU time increasing. They have supplied us with a fix that we will be testing on alpha soon.

Thanks,
-Uplinger

[Sep 3, 2015 9:03:58 PM]

Speedy51
Veteran Cruncher
New Zealand
Joined: Nov 4, 2005
Post Count: 1326
Status: Recently Active

Great to hear, Uplinger. Thanks for the news.

[Sep 3, 2015 9:37:32 PM]

Speedy51
Veteran Cruncher
New Zealand
Joined: Nov 4, 2005
Post Count: 1326
Status: Recently Active

For the super micro-managers, though I don't think this overrides the "don't need, cache full". Hitting update while selecting WCG will 'request' work from WCG even though it's really not its turn if you have more than one active project attached to the client:

<fetch_on_update></fetch_on_update>
When updating a project, request work even if not highest priority project. +New in 7.0.54

There were some bugged point releases that actually would fetch one unit at a time, again and again and again, but that's for the silly who want to over-commit their client(s).

Anyway, if this works, please keep it a [public] secret. ;-)

Since this is a public secret, can somebody remind me roughly where the fetch_on_update line goes, please?
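
As a rough pointer on placement: the sketch below assumes, following the BOINC client configuration documentation, that `fetch_on_update` is a 0/1 flag that sits inside the `<options>` section of `cc_config.xml` in the BOINC data directory. The helper `enable_fetch_on_update` is hypothetical and only manipulates an XML string; it does not touch any real file, and the path to your own `cc_config.xml` is left for you to supply.

```python
import xml.etree.ElementTree as ET


def enable_fetch_on_update(xml_text: str) -> str:
    """Return cc_config XML with <fetch_on_update>1</fetch_on_update>
    inside <options>, creating either element if it is missing.

    Assumption: the root element is <cc_config> and the flag belongs
    in its <options> child.
    """
    root = ET.fromstring(xml_text)
    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")
    flag = options.find("fetch_on_update")
    if flag is None:
        flag = ET.SubElement(options, "fetch_on_update")
    flag.text = "1"
    return ET.tostring(root, encoding="unicode")


example = "<cc_config><options></options></cc_config>"
print(enable_fetch_on_update(example))
```

After editing the real file, the client would presumably need to re-read its config files (or be restarted) before the setting takes effect.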

[Sep 3, 2015 9:46:19 PM]
