World Community Grid Forums
Thread: Mapping Cancer Markers Beta Test - Version 7.27 [ Issues Thread ]
Thread Status: Active Total posts in this thread: 120
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline Project Badges: |
http://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,35971
Please post issues here. Since this beta is testing checkpointing, please force your agent to restore from a checkpoint if possible. To do this, turn "Leave Application In Memory" off and then suspend the work unit. To confirm the work unit is out of memory, open 'Task Manager' or run 'ps' and check that the application is not listed. Then resume it. Another easy way to force a restore from checkpoint is to reboot your machine.
Thanks, -Uplinger
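The "is the application still in memory?" check can also be scripted. This is only a minimal sketch under stated assumptions: it assumes the science application's process name starts with `wcg` (the command line quoted later in this thread shows `wcgrid_beta17_7.27_windows_x86_64`), and the function names `science_apps` and `list_process_names` are hypothetical helpers, not part of BOINC. It reads the process list from `tasklist` on Windows and from `ps` elsewhere.

```python
import csv
import platform
import subprocess


def science_apps(process_names, prefixes=("wcg",)):
    """Return the process names that look like WCG science applications.

    Assumption: science apps start with "wcg" (case-insensitive).
    """
    found = []
    for name in process_names:
        # Keep only the executable name if a full path was reported.
        base = name.strip().replace("\\", "/").rsplit("/", 1)[-1]
        if base.lower().startswith(prefixes):
            found.append(base)
    return found


def list_process_names():
    """List the names of running processes using the platform's own tool."""
    if platform.system() == "Windows":
        out = subprocess.run(["tasklist", "/FO", "CSV", "/NH"],
                             capture_output=True, text=True, check=True).stdout
        # First CSV column is the image name.
        return [row[0] for row in csv.reader(out.splitlines()) if row]
    out = subprocess.run(["ps", "-A", "-o", "comm="],
                         capture_output=True, text=True, check=True).stdout
    return out.splitlines()


if __name__ == "__main__":
    still_loaded = science_apps(list_process_names())
    if still_loaded:
        print("Still in memory:", ", ".join(still_loaded))
    else:
        print("No science application found in memory; safe to resume.")
```

Run it after suspending the work unit; if nothing is reported, the task has been unloaded and resuming it should trigger a restore from the last checkpoint.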
||
|
armstrdj
Former World Community Grid Tech Joined: Oct 21, 2004 Post Count: 695 Status: Offline Project Badges: |
I wanted to give everyone an update on the issues addressed and not addressed in this beta.
Thanks, armstrdj |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
OK, so in generally accepted terms, LAIM OFF :D
... and the tasks are flooding in. 100,000 originals would [idle hope] promise a second reload after completing the first set received.
P.S. Now the dilemma... I have 7.26 running on one machine on 8 cores. Unloading the client or rebooting is not an option, or I would take a hit of 8 invalid results, so I will just set LAIM OFF, suspend the 7.27 Beta tasks, and do the Task Manager check to verify they have unloaded.
[Edit 1 times, last edit by Former Member at Dec 9, 2013 7:04:31 PM]
||
|
deltavee
Ace Cruncher Texas Hill Country Joined: Nov 17, 2004 Post Count: 4846 Status: Offline Project Badges: |
One day deadline.
BETA_ MCM1_ 0000089_ 5561_ 0-- xxxxxxxxx In Progress 12/9/13 19:09:51 12/10/13 19:09:51 0.00 / 0.00 0.0 / 0.0
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Here is a sample stderr.txt of an 'unloaded' 7.27 task after resume. It contains 2 wcg_learn_limit = 500000 entries. The initialization time looks new... I hope it sticks in the production version, so folks get a feel for how much 'Elapsed' time was involved and can verify the true wall-clock time the task ran against the recorded CPU time (some have the impression of a disparity there). The BOINC client log file stdoutdae.txt would have revealed that too.

Commandline = projects/www.worldcommunitygrid.org/wcgrid_beta17_7.27_windows_x86_64 -SettingsFile MCM1_0000089_4921.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Settings File
DateOfDesign = 11/08/2013
Designer = PMCC_OCI
WorkOrderID = 0000089_4921
DatasetID = 17_72_SDG_v1
NumberOfGenesInStartingSignature = 16
NumberOfGenesInSignatureMin = 10
NumberOfGenesInSignatureMax = 20
GroupVectorValues = {A}{B}{C}{D}{E}{F}
ExplicitStartingGeneSignatures = A B D F
StartingGeneSignatureAlgorithm = randomFixedLengthSearch
SearchAlgorithmNumberToCreate = 1
SearchAlgorithmSequentialStartPosition = 5
RunPermutationAlgorithm = 1
PermutationGroups = A
PermutationGroupsForReplacement = G
PermutationAlgorithm = replaceFromRandomlyToRandomlySimulatedAnnealing
PermutationsNumIterations = 7738
OptimizationAlgorithmFrequency = 0 0 1
FBeta = 1.5
SimAnnealIMax = 20000
SimAnnealAlpha = 0.9996
NReps = 10
TrainFrac = 0.7
NFolds = 10
VMethod = NFCV
ModelType = SVM
FitnessFn = 0
MinFitness = -1
SimAnnealEConv = 2
SvmArgs = "-v 0 -c 0.01 -t 1 -d 3 -r 0"
SvmLearnLimit = 500000
RSeed = 344921
[20:07:35] Initializing wcg_learn_limit = 500000
[20:07:46] Running
[20:07:46] EvaluateFitnessOfStartingGeneSignatures 1
[20:07:48]: Computing pass 0
Commandline = projects/www.worldcommunitygrid.org/wcgrid_beta17_7.27_windows_x86_64 -SettingsFile MCM1_0000089_4921.txt -DatabaseFile dataset-17_72_SDG_v1.txt
[20:26:18] Initializing wcg_learn_limit = 500000
[20:26:29] Running
[20:26:29] EvaluateFitnessOfStartingGeneSignatures 1
[20:26:31]: Computing pass 0

Edit: The above was after 2 checkpoints had registered [Courtesy easyview with BOINCTasks]
[Edit 1 times, last edit by Former Member at Dec 9, 2013 7:44:25 PM]
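The timestamps in the stderr sample above make it possible to check how often the task was started and how long each initialization took. The following is a rough sketch, not an official tool: it assumes every start writes a `Commandline =` line and that the `Initializing` and `Running` lines carry `[HH:MM:SS]` stamps as in the sample; `summarize_stderr` is a hypothetical helper name.

```python
import re
from datetime import datetime, timedelta

# Matches "[20:07:35] Initializing" and "[20:07:46] Running" (optional colon after the bracket).
STAMP = re.compile(r"\[(\d{2}:\d{2}:\d{2})\]:?\s*(Initializing|Running)\b")


def summarize_stderr(text):
    """Return (number of starts, list of Initializing -> Running gaps in seconds)."""
    starts = len(re.findall(r"Commandline\s*=", text))
    gaps = []
    init_time = None
    for stamp, event in STAMP.findall(text):
        t = datetime.strptime(stamp, "%H:%M:%S")
        if event == "Initializing":
            init_time = t
        elif init_time is not None:
            gap = t - init_time
            if gap < timedelta(0):
                # The log has no date, so assume the run crossed midnight.
                gap += timedelta(days=1)
            gaps.append(int(gap.total_seconds()))
            init_time = None
    return starts, gaps


if __name__ == "__main__":
    with open("stderr.txt", encoding="utf-8", errors="replace") as handle:
        starts, gaps = summarize_stderr(handle.read())
    print(f"starts: {starts}, restarts: {max(starts - 1, 0)}, init seconds: {gaps}")
```

Applied to the sample above, this reports two starts (one restart from checkpoint) with an 11-second initialization each time.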
||
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline Project Badges: |
"One day deadline. BETA_ MCM1_ 0000089_ 5561_ 0-- xxxxxxxxx In Progress 12/9/13 19:09:51 12/10/13 19:09:51 0.00 / 0.00 0.0 / 0.0"
This setting was left by accident. I have changed it to a 4-day deadline.
Thanks, -Uplinger
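The deadline length can be read off the two timestamps in the quoted status row. A small sketch, assuming the first timestamp is the time the task was sent and the second is the time it is due, both in month/day/two-digit-year format; `deadline_length` is a hypothetical helper.

```python
from datetime import datetime


def deadline_length(sent, due, fmt="%m/%d/%y %H:%M:%S"):
    """Return the time allowed between the sent time and the due time."""
    return datetime.strptime(due, fmt) - datetime.strptime(sent, fmt)


# Timestamps copied from the status row quoted above.
print(deadline_length("12/9/13 19:09:51", "12/10/13 19:09:51"))  # 1 day, 0:00:00
```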
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Not an issue, just a couple of comments:
Let's hope we give them what they need to sort this one out! |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
As edited into my previous post, I did a suspend/resume test after 2 checkpoints. I have now done a second round of interrupts after the 10th+ checkpoint, i.e. multiple interrupts.
A little devious: if you run BOINC as a service, you won't see the science tasks running [in Task Manager] and may mistakenly assume they were unloaded. You need to hit the "Show all user processes" button at the bottom left. Science apps have WCG or wcgrid at the front of their names. Also, there is one machine I'm not going to interrupt tasks on... just to make [quasi] sure that, when uplinger switches on the validator, interrupted and non-interrupted results get to meet [although how many of the 100,000 will get intentionally interrupted?] ;D
||
|
Mumak
Senior Cruncher Joined: Dec 7, 2012 Post Count: 477 Status: Offline Project Badges: |
No issues to report so far, just good news. Restarted BETA_ MCM1_ 0000089_ 1255_ 1-- and it's already Validated OK.
||
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline Project Badges: |
Apis,
Restarting anywhere in the middle of the work unit should be fine. Since I have validation turned on and assimilation turned off, all of the results will stay around longer, allowing us time to investigate issues with the increased logging.
Thanks, -Uplinger
||
|
|