World Community Grid Forums
Thread status: Active. Total posts in this thread: 25
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
This is for the adventurous [and the patient, as there was an oops on the dev list: the option was forgotten and never passed into the 7.0.53 build (got it via Locutus on the Ubuntu 13.04 nightlies)]. This <options> entry in cc_config.xml overrides the work-fetch priority: if a project is *not* highest on the work-fetch list and you get the "don't need" reason, or if for some unknown reason you have an idling resource such as a GPU [the devs are still puzzled], it forces a fetch of work from the selected project [if it has what you want].

Anyway, 7.0.53 with the oops gave an "unknown tag" error, so we have to wait for 7.0.54 or whatever the next build will be [and there will be one, as the BOINC Manager is getting a debug drive at the moment]. As noted, I've got this running on Linux in Live-install mode [via the Unetbootin tool], and this nightly of Raring Ringtail is the slickest of all. Had BOINC running in a few minutes, and lo and behold, this one does much better on the 8-thread laptop... an hour in, running full out, and still 6C cooler than earlier test releases. Test it when you have the issue and insist on being boss over the BOINC scheduler ;>) (always at your own peril, of course). Posted from Firefox 19 on the Ubuntu 13.04 nightly.
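For reference, a minimal sketch of what the cc_config.xml entry might look like once the tag actually works. Treat the exact placement inside <options> as an assumption based on where other work-fetch settings live, until the official docs list it:

```xml
<!-- Sketch of a cc_config.xml enabling the new option.
     Assumption: the tag belongs in the <options> section,
     like the other work-fetch related settings. -->
<cc_config>
   <options>
      <!-- When you press Update on a project, fetch work from it
           even if it is not highest on the work-fetch list. -->
      <fetch_on_update>1</fetch_on_update>
   </options>
</cc_config>
```

The client should pick this up after a restart, or after re-reading the config file from the Manager's advanced menu.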
Dataman
Ace Cruncher Joined: Nov 16, 2004 Post Count: 4865 Status: Offline
Politicians must spend money and BOINC Developers must make new code. It's in the genes.
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Let me say, I'm not convinced of either.

As an addendum: after someone complained about the DCF being pinned at 1.0 at WCG [it was fixed to get control of the TTC given the high variability of run times... HCC tasks now take anywhere between 1:59 and 4:10 on the same machine], DCF will now disappear from the project properties screen when the function is disabled by the servers with <dont_use_dcf>:

- manager: don't show DCF in project properties page if it's 1

In a way we now have a server-controlled DCF *per science app*, instead of one single DCF for all sciences that made a mess of caching, particularly when it grew larger than the default. Users who have adopted a v7 client, preferably higher than 7.0.28, get the benefit already.
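For anyone curious how a project switches DCF off on its side, a sketch, assuming the standard server-side project config.xml mechanism (the flag name <dont_use_dcf> is as given above; the surrounding structure is an assumption):

```xml
<!-- Sketch: fragment of a BOINC project's server-side config.xml.
     With this flag set, clients stop maintaining a per-project DCF
     and runtime estimates come from the server's per-app statistics. -->
<config>
   <dont_use_dcf/>
</config>
```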
JacobKlein
Cruncher Joined: Aug 21, 2007 Post Count: 21 Status: Offline
I think I was the "someone complaining". I didn't mean for it to come across as complaining -- my question was completely legitimate and genuine. I've quoted it below.

I didn't know that the server could use <dont_use_dcf>... and apparently WCG does set that flag, so DCF isn't used. Also, for me there was no variance in task duration. I only get WCG HCC GPU tasks. They ALWAYS take ~12.5 minutes, each and every time. And they ALWAYS had an initial estimate of 00:01:23 in BOINC Manager. If there's currently no way to improve that behavior on the client, then it is what it is, however unfortunate. But I reported my question because I thought there was a bug with DCF not being updated; apparently there wasn't a bug.

    I have 98 NVIDIA World Community Grid (WCG) Help Conquer Cancer (HCC) tasks. All of them say:
    Estimated computation size: 25456 GFLOPs
    Remaining (estimated): 00:01:23
    Yet the tasks actually take about 0.21 hours = 00:12:36. The mechanism to take care of this issue, I thought, was the Duration Correction Factor. But for World Community Grid, my Duration correction factor is 1.0000 and does not appear to ever change. Is there a bug that prevents the Duration correction factor from being updated appropriately? Note: I'm using the very latest code (which Rom compiled today, I believe).
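Just to put numbers on the mismatch described above, a back-of-the-envelope check (not anything the client actually computes this way):

```python
# Back-of-the-envelope: the DCF value that would reconcile the initial
# estimate (00:01:23) with the observed runtime (~0.21 h), assuming DCF
# linearly scales the time-to-completion estimate.
estimated_s = 1 * 60 + 23      # 83 s initial estimate
actual_s = 0.21 * 3600         # ~756 s observed per task
dcf_needed = actual_s / estimated_s
print(f"DCF needed: {dcf_needed:.2f}")  # ~9.11, a long way from the pinned 1.0
```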
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
I think I may have said this before, as a suspicion. The premise of the TTC is one card, one job. I did not get a response [that I can recollect], but if multiple tasks run off one card, their TTC would maybe still be computed as if each were running alone. It gets confusing, but if a GPU fraction is set in either app_info.xml or app_config.xml, the TTC could maybe get divided by that gpu_usage fraction. Wild guess.

This <dont_use_dcf> has been in use for a number of months now, probably since shortly after the HCC-GPU launch in Oct. 2012. The techs would be able to tell [knreed is intimate with the matter]. A quick search finds the first mention in this post: https://secure.worldcommunitygrid.org/forums/wcg/viewthread_thread,33974 (Oct. 11, 2012), at the end of which knreed actually comments on an observation. Cheers
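For the record, the gpu_usage fraction mentioned above is set per app along these lines. A sketch only: the app name hcc1 is taken from a log later in this thread, and the 0.5 numbers are illustrative assumptions:

```xml
<!-- Sketch of an app_config.xml running two HCC GPU tasks per card.
     Whether the TTC estimate then gets divided by gpu_usage, as
     guessed above, is exactly the open question. -->
<app_config>
   <app>
      <name>hcc1</name>
      <gpu_versions>
         <gpu_usage>0.5</gpu_usage>   <!-- each task claims half a GPU -->
         <cpu_usage>0.5</cpu_usage>   <!-- and half a CPU core -->
      </gpu_versions>
   </app>
</app_config>
```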
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
<fetch_on_update> is now operational in test client 7.0.54. Try it, but be warned that some bugs are already being reported. The announcement with change/add details: http://boinc.berkeley.edu/dev/forum_thread.php?id=6698&postid=48094

----------------------------------------
edit: P.S. Will of course install it myself as soon as the CEP2 task has checkpointed... 3 hours into job 12, just to see how it behaves on a W7-64 :D

edit2: Installed fine, apart from the usual slow initial connect [caused by the 120-second startup delay set in cc_config.xml]. First completed task reported... the CEP2 job.

[Edit 2 times, last edit by Former Member at Mar 8, 2013 11:27:31 AM]
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1404 Status: Offline
Re-reading of the app_config.xml files when cc_config.xml is re-read has also been introduced.
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Great revelatory catch. Kudos to the devs [and an influencer with clout who shall not be named ;O]. Of course I ad-hoc tested it by changing the permitted HCC from 8 max to 7 max and re-reading the config, and voilà, 1 HCC job got immediately pre-empted. Reversed the change and the task resumed. Ottimo!

----------------------------------------
edit: To document the observation, which is loss-less for CPU jobs with LAIM on:

184 3/8/2013 12:25:06 PM Re-reading cc_config.xml
188 3/8/2013 12:25:06 PM Config: fetch on update
189 3/8/2013 12:25:06 PM Config: don't compute while prio_uninstall.exe is running
190 3/8/2013 12:25:06 PM Config: GUI RPC allowed from any host
191 3/8/2013 12:25:06 PM log flags: file_xfer, sched_ops, task, checkpoint_debug, cpu_sched, dcf_debug
192 3/8/2013 12:25:06 PM log flags: sched_op_debug
193 World Community Grid 3/8/2013 12:25:06 PM Found app_config.xml
194 World Community Grid 3/8/2013 12:25:06 PM [cpu_sched] Preempting X0900120400829201005141334_1 (left in memory)
195 World Community Grid 3/8/2013 12:25:41 PM [checkpoint] result X0900120400881201005141333_1 checkpointed
196 World Community Grid 3/8/2013 12:27:38 PM [checkpoint] result X0900119890191201004271253_0 checkpointed
197 World Community Grid 3/8/2013 12:28:12 PM [checkpoint] result X0900119890215201004271253_0 checkpointed
198 World Community Grid 3/8/2013 12:29:06 PM [checkpoint] result X0900119890121201004271254_1 checkpointed
199 World Community Grid 3/8/2013 12:29:18 PM [checkpoint] result X0900119890205201004271253_1 checkpointed
200 World Community Grid 3/8/2013 12:30:19 PM [checkpoint] result X0900119890193201004271253_0 checkpointed
201 World Community Grid 3/8/2013 12:30:59 PM [checkpoint] result X0900119890191201004271253_0 checkpointed
202 World Community Grid 3/8/2013 12:31:24 PM [checkpoint] result X0900119890125201004271254_0 checkpointed
203 3/8/2013 12:31:28 PM Re-reading cc_config.xml
207 3/8/2013 12:31:28 PM Config: fetch on update
208 3/8/2013 12:31:28 PM Config: don't compute while prio_uninstall.exe is running
209 3/8/2013 12:31:28 PM Config: GUI RPC allowed from any host
210 3/8/2013 12:31:28 PM log flags: file_xfer, sched_ops, task, checkpoint_debug, cpu_sched, dcf_debug
211 3/8/2013 12:31:28 PM log flags: sched_op_debug
212 World Community Grid 3/8/2013 12:31:28 PM Found app_config.xml
213 World Community Grid 3/8/2013 12:31:29 PM [cpu_sched] Resuming X0900120400829201005141334_1
214 World Community Grid 3/8/2013 12:31:29 PM Resuming task X0900120400829201005141334_1 using hcc1 version 705 in slot 10
215 World Community Grid 3/8/2013 12:31:39 PM [checkpoint] result X0900120400829201005141334_1 checkpointed

The message text could perhaps say that both cc_config.xml & app_config.xml were re-read, but the "Found app_config.xml" line is confirmation enough for me; plus it lists for which project the app_config.xml was found, whereas cc_config.xml is global.

[Edit 3 times, last edit by Former Member at Mar 8, 2013 12:38:35 PM]
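For anyone wanting to reproduce the test above, the 8-to-7 change was presumably a max_concurrent edit along these lines. A sketch: the app name hcc1 is taken from the log; the file path and surrounding structure are assumptions:

```xml
<!-- Sketch: projects/www.worldcommunitygrid.org/app_config.xml.
     Dropping max_concurrent from 8 to 7 and re-reading the config
     preempts one running HCC task, as the log above shows. -->
<app_config>
   <app>
      <name>hcc1</name>
      <max_concurrent>7</max_concurrent>
   </app>
</app_config>
```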
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
And Gianfranco aka locutusofborg is amazing... he has already made a PPA package, so off to test on 13.04 Raring Ringtail [and see if it installs without problems], where the VINA jobs are flying on Linux... absolutely flying.

----------------------------------------
edit: And up we are: a smooth install with auto stop, upgrade, and start in under 1 minute... it's a 32-bit build, but that does not matter, since the sciences do not depend on it. They still run 64-bit, which FAAH now does too since the last AutoDock upgrade.

[Edit 1 times, last edit by Former Member at Mar 8, 2013 2:08:08 PM]
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Just as an update: small bugs have been fixed in 7.0.55... getting serious about a new public release [at Berkeley, a build gets named "recommended" after 14 days or so of field testing].

One [important] note on something I've encountered since about 7.0.28 on Linux, and maybe a few of you can test it too, since the end of March is approaching and for many summertime kicks in: setting the clock back elicits a message in BOINC that the clock was changed and counters were reset, but for me, in many tests, this leads to processing freezing in its tracks [no CPU ticks clocked]. Exiting the client, stopping the daemon, and restarting made it run again. Sunday's test on the first Linux build of 7.0.54 was different: on restart the tasks resumed, but then after a minute all running tasks were put in "Scheduler wait: Waiting to acquire lock", and new tasks were started and ran properly. No amount of trying got the old ones running again, until I upgraded to a newer 7.0.54 build, at which point these tasks suddenly resumed. So when you get to try 7.0.55 and stay with it on Linux, be aware!

Release notes: http://boinc.berkeley.edu/dev/forum_thread.php?id=6698#48160