World Community Grid Forums

Category: Beta Testing

Forum: Beta Test Support Forum

Thread: CEP2 beta for windows - Version 6.25

Thread Status: Active
Total posts in this thread: 311

This topic has been viewed 1272412 times and has 310 replies

Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline


Re: CEP2 beta for windows - Version 6.25

As far as the bandwidth limits go, we are likely going to be adopting the 6.10 client shortly. It includes a feature that lets you set your max data transferred. We will modify the server so that if you are using a 6.10 client and have set that field then we will ignore the bandwidth check. This will make sure that people are protected from excess data transfers.

Here is how this looks in the 6.10 client preferences (not yet integrated into the WCG device profiles):

- Go to advanced preferences, network tab.
- Enter a value for the maximum bandwidth use; for my duo I've set 400 MB (up + down).
- Enter the period over which the limit should apply. I've set 1 day.
- Specifically set a limit on the upload bandwidth, e.g. 128 kB/s. For CEP2 this limits each concurrent upload to 128 / X, where X is the number of files allowed to transfer concurrently (default 2), i.e. 64 kB/s each if 2 of the _4 result files are uploading.
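
The arithmetic in the last bullet can be sketched like this. It is only an illustration: the function name is made up, and it assumes the client shares the upload cap evenly across the files transferring at the same time.

```python
def per_file_upload_rate(upload_cap_kbs: float, concurrent_files: int = 2) -> float:
    """Even share of the upload cap (kB/s) for each concurrently uploading file.

    Hypothetical helper; assumes an even split across concurrent transfers.
    """
    if concurrent_files < 1:
        raise ValueError("concurrent_files must be at least 1")
    return upload_cap_kbs / concurrent_files


# 128 kB/s cap with the default of 2 concurrent transfers
print(per_file_upload_rate(128))  # 64.0
```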

Limiting upload speed leaves room for improved download speed as measured by BOINC. There is a cross-effect where DownLoad is impacted by UpLoad throughput.

Note that this is a per-device setting. Anyone who has multiple devices and an actual overall ISP limit per period will need to calculate how much to allow each device, sized to its number of cores. Of course, if you have no limits, set the value as high as you like. At any rate, set any value and the WCG server will be content to ignore the bwdown speed value, for WCG!
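
One way to do that per-device calculation is to split the overall cap in proportion to core count. This is a rough sketch, not anything the client does for you; the device names and the 1200 MB cap are invented for the example.

```python
from typing import Dict


def per_device_budget_mb(isp_cap_mb: float, cores_per_device: Dict[str, int]) -> Dict[str, float]:
    """Split an overall transfer cap across devices in proportion to their cores.

    Hypothetical helper; the result is what you would enter by hand on each device.
    """
    total_cores = sum(cores_per_device.values())
    if total_cores <= 0:
        raise ValueError("need at least one core in total")
    return {name: isp_cap_mb * cores / total_cores
            for name, cores in cores_per_device.items()}


# Invented example: a 1200 MB per-period cap shared by a duo and a quad
print(per_device_budget_mb(1200, {"duo": 2, "quad": 4}))
```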

See the July FAQ: http://www.worldcommunitygrid.org/forums/wcg/...ead,29406_offset,0#287348 which will be updated shortly with a new local prefs screenshot and the periodic bandwidth budget (not yet available in 6.10.17, but present in 6.10.58).

----------------------------------------

WCG

Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!

[Sep 22, 2010 7:56:16 AM]

Hypernova
Master Cruncher
Audaces Fortuna Juvat ! Vaud - Switzerland
Joined: Dec 16, 2008
Post Count: 1908
Status: Offline

Re: CEP2 beta for windows - Version 6.25

Back from travel, I just checked my Beta status and see that my Saturn device picked up a few of them, 23 in total.
Unfortunately, here is the present status:

10 Error, 0 CPU time, 0.0/0.0
1 Error, 3.79 CPU time 113.4/0.0
1 Error, 0.10 CPU time 2.8/0.0
2 Server Aborted 0 CPU time 0.0/0.0
6 In Progress
3 Valid

Saturn is Windows 7 64-bit, 6 GB RAM, i7 980X at 4 GHz, HT on.

Below are two logs: one of the typical error with 0 CPU time, and one of the error with actual CPU time:

Result Log

Result Name: BETA_E200365_738_A.24.C19H12N2OS2.113.4.set1d06_1--

<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[18:00:45] Number of jobs = 16
[18:00:45] Starting job 0,CPU time has been restored to 0.000000.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
INFO: No state to restore. Start from the beginning.
[18:10:41] Number of jobs = 16
[18:10:41] Starting job 0,CPU time has been restored to 0.000000.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
INFO: No state to restore. Start from the beginning.
[18:20:45] Number of jobs = 16
[18:20:45] Starting job 0,CPU time has been restored to 0.000000.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
.......
.......
.......
etc. etc. etc. (many many pages long)
.......
.......
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
INFO: No state to restore. Start from the beginning.
[18:47:20] Number of jobs = 16
[18:47:20] Starting job 0,CPU time has been restored to 0.000000.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
INFO: No state to restore. Start from the beginning.
[19:07:21] Number of jobs = 16
[19:07:21] Starting job 0,CPU time has been restored to 0.000000.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting

</stderr_txt>
]]>

And here is the log of the one that errored after more than 3 hours of CPU time:

Result Log

Result Name: BETA_E200367_985_A.24.C19H12N2S3.194.1.set1d06_0--

<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[16:57:23] Number of jobs = 16
[16:57:23] Starting job 0,CPU time has been restored to 0.000000.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
INFO: No state to restore. Start from the beginning.
[17:37:18] Number of jobs = 16
[17:37:18] Starting job 0,CPU time has been restored to 0.000000.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
INFO: No state to restore. Start from the beginning.
[18:17:18] Number of jobs = 16
[18:17:18] Starting job 0,CPU time has been restored to 0.000000.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
INFO: No state to restore. Start from the beginning.
[18:57:20] Number of jobs = 16
[18:57:20] Starting job 0,CPU time has been restored to 0.000000.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
...........
...........
...........
INFO: No state to restore. Start from the beginning.
[18:09:27] Number of jobs = 16
[18:09:27] Starting job 0,CPU time has been restored to 0.000000.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
INFO: No state to restore. Start from the beginning.
[18:19:27] Number of jobs = 16
[18:19:27] Starting job 0,CPU time has been restored to 0.000000.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
INFO: No state to restore. Start from the beginning.
[18:29:27] Number of jobs = 16
[18:29:27] Starting job 0,CPU time has been restored to 0.000000.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
INFO: No state to restore. Start from the beginning.
[18:39:29] Number of jobs = 16
[18:39:29] Starting job 0,CPU time has been restored to 0.000000.
[18:41:11] Finished Job #0
[18:41:11] Starting job 1,CPU time has been restored to 96.564619.
[18:46:26] Finished Job #1
[18:46:26] Starting job 2,CPU time has been restored to 364.340335.
[20:24:12] Finished Job #2
[20:24:12] Starting job 3,CPU time has been restored to 5708.014190.
[20:29:46] Finished Job #3
[20:29:46] Starting job 4,CPU time has been restored to 6006.007300.
[20:33:10] Finished Job #4
[20:33:10] Starting job 5,CPU time has been restored to 6207.934994.
[20:36:58] Finished Job #5
[20:36:58] Starting job 6,CPU time has been restored to 6422.545570.
[20:40:22] Finished Job #6
[20:40:22] Starting job 7,CPU time has been restored to 6623.989661.
[20:45:05] Finished Job #7
[20:45:05] Starting job 8,CPU time has been restored to 6892.638983.
[20:48:23] Finished Job #8
[20:48:23] Starting job 9,CPU time has been restored to 7088.357838.
[20:52:00] Finished Job #9
[20:52:00] Starting job 10,CPU time has been restored to 7303.046414.
[20:59:47] Finished Job #10
[20:59:47] Starting job 11,CPU time has been restored to 7755.340113.
[21:04:07] Finished Job #11
[21:04:07] Starting job 12,CPU time has been restored to 8002.820100.
[21:38:32] Finished Job #12
[21:38:32] Starting job 13,CPU time has been restored to 10035.528730.
[22:38:56] Finished Job #13
[22:38:56] Starting job 14,CPU time has been restored to 13655.921937.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
[23:29:10] Number of jobs = 16
[23:29:10] Starting job 14,CPU time has been restored to 13655.921937.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
[23:39:11] Number of jobs = 16
[23:39:11] Starting job 14,CPU time has been restored to 13655.921937.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
[23:49:11] Number of jobs = 16
[23:49:11] Starting job 14,CPU time has been restored to 13655.921937.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
..........
..........
...........

[15:20:18] Number of jobs = 16
[15:20:18] Starting job 14,CPU time has been restored to 13655.921937.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
[15:30:19] Number of jobs = 16
[15:30:19] Starting job 14,CPU time has been restored to 13655.921937.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
[15:40:20] Number of jobs = 16
[15:40:20] Starting job 14,CPU time has been restored to 13655.921937.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
[15:50:20] Number of jobs = 16
[15:50:21] Starting job 14,CPU time has been restored to 13655.921937.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
[16:00:21] Number of jobs = 16
[16:00:21] Starting job 14,CPU time has been restored to 13655.921937.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting

</stderr_txt>
]]>
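
For anyone wanting to summarize a log like the ones above rather than read pages of it, here is a small sketch. It counts the "No heartbeat" exits and estimates CPU seconds per job from the "restored to" values, on the assumption (mine, from reading the log) that each value is the cumulative CPU time when that job starts. The function name and the sample are illustrative only.

```python
import re

# Matches lines such as:
# [18:41:11] Starting job 1,CPU time has been restored to 96.564619.
START = re.compile(r"Starting job (\d+),CPU time has been restored to (\d+\.\d+)")


def summarize(stderr_text):
    """Return (heartbeat exit count, {job index: estimated CPU seconds})."""
    heartbeat_exits = 0
    cpu_at_start = {}  # job index -> cumulative CPU seconds when the job started
    for line in stderr_text.splitlines():
        if line.startswith("No heartbeat from core client"):
            heartbeat_exits += 1
        match = START.search(line)
        if match:
            cpu_at_start[int(match.group(1))] = float(match.group(2))
    # A job's CPU time is the difference to the next job's starting value.
    per_job = {job: cpu_at_start[job + 1] - cpu_at_start[job]
               for job in sorted(cpu_at_start) if job + 1 in cpu_at_start}
    return heartbeat_exits, per_job


sample = """\
[18:39:29] Starting job 0,CPU time has been restored to 0.000000.
[18:41:11] Finished Job #0
[18:41:11] Starting job 1,CPU time has been restored to 96.564619.
[18:46:26] Finished Job #1
[18:46:26] Starting job 2,CPU time has been restored to 364.340335.
No heartbeat from core client for 30 sec - exiting
No heartbeat: Exiting
"""
print(summarize(sample))
```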

----------------------------------------
[Edit 2 times, last edit by Hypernova at Sep 22, 2010 8:29:49 AM]

[Sep 22, 2010 8:26:46 AM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: CEP2 beta for windows - Version 6.25

I hope this is not another cruncher unfriendly project

.

I don't know! Nothing like that on my end. (crying)

Greetings

[Sep 22, 2010 8:55:25 AM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: CEP2 beta for windows - Version 6.25

hmmmm, maybe I did something stupid. I will try it again later. Thx X-Files

Same for me! (tongue)

So, enough criticism about the bandwidth.

Greetings

[Sep 22, 2010 8:59:53 AM]

Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline


Re: CEP2 beta for windows - Version 6.25

CEP2 is a "read the My Projects preferences page, with its explicit invitation to consider the system requirements, and manually opt in" research project!

Crunchers, be aware before selecting it; as with any new application, it might need exceptions set in the security software. No heartbeat does mean that the system was very busy, probably mighty busy with I/O from running 6 concurrently. Maybe the rule could be n / 3 cores, rounded up for fractions, so a duo still gets 1 and a regular quad 2, but a hex with HT switched on gets 4 instead of 6. Certainly running 4 concurrently on my Linux box did not kill anything; only valids were returned, though at lower efficiency.
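
The suggested n / 3 rule works out as below. This is only a sketch of the proposal, not anything WCG has implemented; the function name is made up, and it assumes n counts logical cores, which is how the hex with HT (12 threads) comes out at 4.

```python
def max_concurrent_cep2(logical_cores: int) -> int:
    """Proposed cap on concurrent CEP2 tasks: logical cores / 3, rounded up.

    Hypothetical helper illustrating the suggestion in the post above.
    """
    if logical_cores < 1:
        raise ValueError("logical_cores must be at least 1")
    return (logical_cores + 2) // 3  # integer ceiling of n / 3


for name, cores in (("duo", 2), ("quad", 4), ("hex with HT", 12)):
    print(name, max_concurrent_cep2(cores))
```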

----------------------------------------
[Edit 1 times, last edit by Sekerob at Sep 22, 2010 9:19:28 AM]

[Sep 22, 2010 9:13:21 AM]

gb077492
Advanced Cruncher
Joined: Dec 24, 2004
Post Count: 96
Status: Offline


Re: CEP2 beta for windows - Version 6.25

I'm definitely not convinced about this download bandwidth measurement.

I have a four-way machine that was running a couple of betas. By the time one of them finished, the only additional downloads it had done were a single DDT2 job and the master file. Yet I still got a message to say the bandwidth was too low. I downloaded an HPF2 WU, which "fixed" it, and then it could get a couple of new beta WUs.

This machine is on the IBM internal network, which is no slouch!

[Sep 22, 2010 10:37:33 AM]

Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Re: CEP2 beta for windows - Version 6.25

I'm definitely not convinced about this download bandwidth measurement.

I agree with you.

[Sep 22, 2010 11:49:09 AM]

Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline


Re: CEP2 beta for windows - Version 6.25

gb077492, it's the small(er) files that cause the download speed rating to deteriorate rapidly, particularly a series of them. HPF2 tasks take long enough (1 MB or more) to lift the mean measured speed up to near its true value, which is why knreed is implementing this workaround.

Some projects outside WCG seem to have "real" atrociously poor download speeds, so when these clients come to WCG they get affected too. Of course WCG would not want to shut out new arrivals up front.
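
A toy model shows why small files can rate so poorly. It is not BOINC's actual measurement code: it simply assumes every transfer pays a fixed setup overhead on top of the time the bytes take, and the link speed and overhead values are invented.

```python
def measured_speed_kbs(size_kb, link_kbs=1000.0, overhead_s=0.5):
    """Apparent speed of one transfer: size over (fixed overhead + transfer time).

    Toy model with assumed numbers, not the client's real algorithm.
    """
    return size_kb / (overhead_s + size_kb / link_kbs)


# The smaller the file, the more the fixed overhead dominates the result.
for size_kb in (10, 100, 1000):
    print(size_kb, "kB ->", round(measured_speed_kbs(size_kb), 1), "kB/s")
```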

[Sep 22, 2010 11:50:02 AM]

gb077492
Advanced Cruncher
Joined: Dec 24, 2004
Post Count: 96
Status: Offline


Re: CEP2 beta for windows - Version 6.25

HPF2 tasks take long enough (1 MB or more) to lift the mean measured speed up to near its true value

I deliberately chose the project with the largest download file size on that assumption.

it's the small(er) files that cause the download speed rating to deteriorate rapidly

So only measure the speed for files larger than some threshold and ignore the smaller ones as noise. Wouldn't that help?
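
Roughly what I have in mind, as a sketch only. The sample format, the function name and the 100 kB threshold are all made up, and this is not how the BOINC client currently computes its figure.

```python
def mean_speed_kbs(samples, min_size_kb=0.0):
    """Average speed over (size_kb, seconds) samples, skipping files below a threshold.

    Hypothetical helper; returns None when no sample qualifies.
    """
    kept = [(size, secs) for size, secs in samples
            if size >= min_size_kb and secs > 0]
    if not kept:
        return None
    return sum(size / secs for size, secs in kept) / len(kept)


# Invented samples: two small files and one large one
samples = [(10, 1.0), (20, 2.0), (1000, 2.0)]
print(mean_speed_kbs(samples))                   # all files counted
print(mean_speed_kbs(samples, min_size_kb=100))  # small files ignored as noise
```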

Mike

[Sep 22, 2010 12:32:54 PM]

Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline


Re: CEP2 beta for windows - Version 6.25

Yes, but how to tell BOINC? WCG is not measuring; only the client does, and it then tells WCG! A developer issue for Berkeley, I'd say. As alluded to, knreed is setting up a trick for client 6.10: anyone who specifies a bandwidth budget (daily, bi-daily, weekly, monthly) will get tasks that need big bandwidth, so those coming in from projects outside WCG that really have low bandwidth (the projects themselves), or that send only small files, will be able to contribute too.

As for the C4CW comment somewhere on small files: no files are exchanged once the client has a C4CW task for a target, so there is nothing to measure. C4CW has a seed/departure point table, and the server tells the client which seed to pick to get a new task popping up in the task list.

To summarize: the beta test is again being used to learn not only about the task but also about its grid load behavior, with the techs working to minimize impairment without throwing the child out with the bathwater (no gauges).
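
As I understand the planned trick, the server-side decision would look something like the sketch below. All names here are hypothetical, and the minimum download speed is left as a parameter because the real threshold is not stated anywhere in this thread.

```python
def may_receive_big_tasks(client_version, xfer_budget_mb,
                          measured_down_kbs, min_down_kbs):
    """Decide whether a client may get bandwidth-heavy tasks.

    Sketch of the described workaround, not WCG's server code:
    a 6.10+ client with a transfer budget set skips the speed check.
    """
    if tuple(client_version[:2]) >= (6, 10) and xfer_budget_mb > 0:
        return True  # budget set: ignore the measured download speed
    return measured_down_kbs >= min_down_kbs


# Invented numbers: a slow-rated 6.10.58 client with a 400 MB budget set
print(may_receive_big_tasks((6, 10, 58), 400, measured_down_kbs=5, min_down_kbs=50))
```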

[Sep 22, 2010 1:01:56 PM]
