World Community Grid Forums
Thread status: Active. Total posts in this thread: 25
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
This is for the adventurous [and the patient, as there was an oops on the dev list: the option was forgotten and never passed into the 7.0.53 build (got it via Locutus on the Ubuntu 13.04 nightlies)]. This <options> entry in cc_config.xml overrides the work-fetch priority: if a project is *not* highest on the work-fetch list and you get the "don't need" reason, or if for some unknown reason you have an idling resource such as a GPU [the devs are still puzzled], it forces a fetch of work from the selected project [if it has what you want].

Anyway, 7.0.53 with the oops gave an "unknown tag" error, so we have to wait for 7.0.54 or whatever the next build will be [and there will be one, as the BOINC Manager is getting a debug drive at the moment]. As noted, I've got this running on Linux in Live-install mode [via the Unetbootin tool], and this nightly of Raring Ringtail is the slickest of all. Had BOINC running in a few minutes, and lo and behold, this one does much better on the 8-thread laptop... an hour in, running full out, and still 6C cooler than earlier test releases. Test it when you have the issue and insist on being boss over the BOINC scheduler ;>) (always at your own peril, of course). Posted from Firefox 19 on the Ubuntu 13.04 nightly.
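For reference, a minimal sketch of what the cc_config.xml entry might look like once the tag actually works. Treat the exact placement inside <options> as an assumption based on where other work-fetch settings live, until the official docs list it:

```xml
<!-- Sketch of a cc_config.xml enabling the new option.
     Assumption: the tag belongs in the <options> section,
     like the other work-fetch related settings. -->
<cc_config>
   <options>
      <!-- When you press Update on a project, fetch work from it
           even if it is not highest on the work-fetch list. -->
      <fetch_on_update>1</fetch_on_update>
   </options>
</cc_config>
```

The client should pick this up after a restart, or after re-reading the config file from the Manager's advanced menu.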
Dataman
Ace Cruncher Joined: Nov 16, 2004 Post Count: 4865 Status: Offline
Politicians must spend money and BOINC Developers must make new code. It's in the genes.
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Let me say, I'm not convinced of either.

As an addendum: after someone complained about the DCF being pinned at 1.0 at WCG [it was fixed to get control of the TTC given the high variability of run times... HCC tasks now take anywhere between 1:59 and 4:10 on the same machine], DCF will now disappear from the project properties screen when the function is disabled by the servers with <dont_use_dcf>:

- manager: don't show DCF in project properties page if it's 1

In a way we now have a server-controlled DCF *per science app*, instead of one single DCF for all sciences that made a mess of caching, particularly when it grew larger than the default. Users who have adopted a v7 client, preferably higher than 7.0.28, get the benefit already.
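For anyone curious how a project switches DCF off on its side, a sketch, assuming the standard server-side project config.xml mechanism (the flag name <dont_use_dcf> is as given above; the surrounding structure is an assumption):

```xml
<!-- Sketch: fragment of a BOINC project's server-side config.xml.
     With this flag set, clients stop maintaining a per-project DCF
     and runtime estimates come from the server's per-app statistics. -->
<config>
   <dont_use_dcf/>
</config>
```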
JacobKlein
Cruncher Joined: Aug 21, 2007 Post Count: 21 Status: Offline
I think I was the "someone complaining". I didn't mean for it to come across as complaining -- my question was completely legitimate and genuine. I've quoted it below.

I didn't know that the server could use <dont_use_dcf>... and apparently WCG does set that flag, so DCF isn't used. Also, for me there was no variance in task duration. I only get WCG HCC GPU tasks. They ALWAYS take ~12.5 minutes, each and every time. And they ALWAYS had an initial estimate of 00:01:23 in BOINC Manager. If there's currently no way to improve that behavior on the client, then it is what it is, however unfortunate. But I reported my question because I thought there was a bug with DCF not being updated; apparently there wasn't a bug.

    I have 98 NVIDIA World Community Grid (WCG) Help Conquer Cancer (HCC) tasks. All of them say:
    Estimated computation size: 25456 GFLOPs
    Remaining (estimated): 00:01:23
    Yet the tasks actually take about 0.21 hours = 00:12:36. The mechanism to take care of this issue, I thought, was the Duration Correction Factor. But for World Community Grid, my Duration correction factor is 1.0000 and does not appear to ever change. Is there a bug that prevents the Duration correction factor from being updated appropriately? Note: I'm using the very latest code (which Rom compiled today, I believe).
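Just to put numbers on the mismatch described above, a back-of-the-envelope check (not anything the client actually computes this way):

```python
# Back-of-the-envelope: the DCF value that would reconcile the initial
# estimate (00:01:23) with the observed runtime (~0.21 h), assuming DCF
# linearly scales the time-to-completion estimate.
estimated_s = 1 * 60 + 23      # 83 s initial estimate
actual_s = 0.21 * 3600         # ~756 s observed per task
dcf_needed = actual_s / estimated_s
print(f"DCF needed: {dcf_needed:.2f}")  # ~9.11, a long way from the pinned 1.0
```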
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
I think I may have said this before, as a suspicion. The premise of the TTC is one card, one job. I did not get a response [that I can recollect], but if multiple tasks run off one card, their TTC would maybe still be computed as if each were running alone. It gets confusing, but if a GPU fraction is set in either app_info.xml or app_config.xml, the TTC could maybe get divided by that gpu_usage fraction. Wild guess.

This <dont_use_dcf> has been in use for a number of months now, probably since shortly after the HCC-GPU launch in Oct. 2012. The techs would be able to tell [knreed is intimate with the matter]. A quick search finds the first mention in this post: https://secure.worldcommunitygrid.org/forums/wcg/viewthread_thread,33974 (Oct. 11, 2012), at the end of which knreed actually comments on an observation. Cheers
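For the record, the gpu_usage fraction mentioned above is set per app along these lines. A sketch only: the app name hcc1 is taken from a log later in this thread, and the 0.5 numbers are illustrative assumptions:

```xml
<!-- Sketch of an app_config.xml running two HCC GPU tasks per card.
     Whether the TTC estimate then gets divided by gpu_usage, as
     guessed above, is exactly the open question. -->
<app_config>
   <app>
      <name>hcc1</name>
      <gpu_versions>
         <gpu_usage>0.5</gpu_usage>   <!-- each task claims half a GPU -->
         <cpu_usage>0.5</cpu_usage>   <!-- and half a CPU core -->
      </gpu_versions>
   </app>
</app_config>
```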
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
<fetch_on_update> is now operational in test client 7.0.54. Try it, but be warned that some bugs are already being reported. The announcement with change/add details: http://boinc.berkeley.edu/dev/forum_thread.php?id=6698&postid=48094

----------------------------------------
edit: P.S. Will of course install it myself as soon as the CEP2 task has checkpointed... 3 hours into job 12, just to see how it behaves on a W7-64 :D

edit2: Installed fine, apart from the usual slow initial connect [caused by the 120-second startup delay set in cc_config.xml]. First completed task reported... the CEP2 job.

[Edit 2 times, last edit by Former Member at Mar 8, 2013 11:27:31 AM]
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1404 Status: Offline
Re-reading of the app_config.xml files when cc_config.xml is re-read has also been introduced.
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Great revelatory catch. Kudos to the devs [and an influencer with clout who shall not be named ;O]. Of course I ad-hoc tested it by changing the permitted HCC from 8 max to 7 max and re-reading the config, and voilà, 1 HCC job got immediately pre-empted. Reversed the change and the task resumed. Ottimo!

----------------------------------------
edit: To document the observation, which is loss-less for CPU jobs with LAIM on:

184 3/8/2013 12:25:06 PM Re-reading cc_config.xml
188 3/8/2013 12:25:06 PM Config: fetch on update
189 3/8/2013 12:25:06 PM Config: don't compute while prio_uninstall.exe is running
190 3/8/2013 12:25:06 PM Config: GUI RPC allowed from any host
191 3/8/2013 12:25:06 PM log flags: file_xfer, sched_ops, task, checkpoint_debug, cpu_sched, dcf_debug
192 3/8/2013 12:25:06 PM log flags: sched_op_debug
193 World Community Grid 3/8/2013 12:25:06 PM Found app_config.xml
194 World Community Grid 3/8/2013 12:25:06 PM [cpu_sched] Preempting X0900120400829201005141334_1 (left in memory)
195 World Community Grid 3/8/2013 12:25:41 PM [checkpoint] result X0900120400881201005141333_1 checkpointed
196 World Community Grid 3/8/2013 12:27:38 PM [checkpoint] result X0900119890191201004271253_0 checkpointed
197 World Community Grid 3/8/2013 12:28:12 PM [checkpoint] result X0900119890215201004271253_0 checkpointed
198 World Community Grid 3/8/2013 12:29:06 PM [checkpoint] result X0900119890121201004271254_1 checkpointed
199 World Community Grid 3/8/2013 12:29:18 PM [checkpoint] result X0900119890205201004271253_1 checkpointed
200 World Community Grid 3/8/2013 12:30:19 PM [checkpoint] result X0900119890193201004271253_0 checkpointed
201 World Community Grid 3/8/2013 12:30:59 PM [checkpoint] result X0900119890191201004271253_0 checkpointed
202 World Community Grid 3/8/2013 12:31:24 PM [checkpoint] result X0900119890125201004271254_0 checkpointed
203 3/8/2013 12:31:28 PM Re-reading cc_config.xml
207 3/8/2013 12:31:28 PM Config: fetch on update
208 3/8/2013 12:31:28 PM Config: don't compute while prio_uninstall.exe is running
209 3/8/2013 12:31:28 PM Config: GUI RPC allowed from any host
210 3/8/2013 12:31:28 PM log flags: file_xfer, sched_ops, task, checkpoint_debug, cpu_sched, dcf_debug
211 3/8/2013 12:31:28 PM log flags: sched_op_debug
212 World Community Grid 3/8/2013 12:31:28 PM Found app_config.xml
213 World Community Grid 3/8/2013 12:31:29 PM [cpu_sched] Resuming X0900120400829201005141334_1
214 World Community Grid 3/8/2013 12:31:29 PM Resuming task X0900120400829201005141334_1 using hcc1 version 705 in slot 10
215 World Community Grid 3/8/2013 12:31:39 PM [checkpoint] result X0900120400829201005141334_1 checkpointed

The message text could perhaps say that both cc_config.xml & app_config.xml were re-read, but the "Found app_config.xml" line is confirmation enough for me; plus it lists for which project the app_config.xml was found, whereas cc_config.xml is global.

[Edit 3 times, last edit by Former Member at Mar 8, 2013 12:38:35 PM]
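For anyone wanting to reproduce the test above, the 8-to-7 change was presumably a max_concurrent edit along these lines. A sketch: the app name hcc1 is taken from the log; the file path and surrounding structure are assumptions:

```xml
<!-- Sketch: projects/www.worldcommunitygrid.org/app_config.xml.
     Dropping max_concurrent from 8 to 7 and re-reading the config
     preempts one running HCC task, as the log above shows. -->
<app_config>
   <app>
      <name>hcc1</name>
      <max_concurrent>7</max_concurrent>
   </app>
</app_config>
```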
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
And Gianfranco aka locutusofborg is amazing... he has already made a PPA package, so off to test on 13.04 Raring Ringtail [and see if it installs without problems], where the VINA jobs are flying on Linux... absolutely flying.

----------------------------------------
edit: And up we are: a smooth install with auto stop, upgrade, and start in under 1 minute... it's a 32-bit build, but that does not matter, since the sciences do not depend on it. They still run 64-bit, which FAAH now does too since the last AutoDock upgrade.

[Edit 1 times, last edit by Former Member at Mar 8, 2013 2:08:08 PM]
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Just as an update: small bugs have been fixed in 7.0.55... getting serious about a new public release [at Berkeley, a build gets named "recommended" after 14 days or so of field testing].

One [important] note on something I've encountered since about 7.0.28 on Linux, and maybe a few of you can test it too, since the end of March is approaching and for many summertime kicks in: setting the clock back elicits a message in BOINC that the clock was changed and counters were reset, but for me, in many tests, this leads to processing freezing in its tracks [no CPU ticks clocked]. Exiting the client, stopping the daemon, and restarting made it run again. Sunday's test on the first Linux build of 7.0.54 was different: on restart the tasks resumed, but then after a minute all running tasks were put in "Scheduler wait: Waiting to acquire lock", and new tasks were started and ran properly. No amount of trying got the old ones running again, until I upgraded to a newer 7.0.54 build, at which point these tasks suddenly resumed. So when you get to try 7.0.55 and stay with it on Linux, be aware!

Release notes: http://boinc.berkeley.edu/dev/forum_thread.php?id=6698#48160