| World Community Grid Forums
Thread Status: Active Total posts in this thread: 264
|
|
| Author |
|
|
Gliuck
Cruncher Italy Joined: Dec 14, 2005 Post Count: 8 Status: Offline Project Badges:
|
Server seems to be out of space at the moment, I'm not able to upload the big CEP2 results.
|
||
|
|
branjo
Master Cruncher Slovakia Joined: Jun 29, 2012 Post Count: 1892 Status: Offline Project Badges:
|
Wrong forum
----------------------------------------
Crunching@Home since January 13 2000. Shrubbing@Home since January 5 2006 |
||
|
|
Gliuck
Cruncher Italy Joined: Dec 14, 2005 Post Count: 8 Status: Offline Project Badges:
|
"Wrong forum"
You're right... |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I'm not sure when it started, but the checkpoint restart problem seems to be coming up again.
I found that the number of my WU results in PVER status is increasing. I have checked all of them and found that my results do not involve a restart, but the others' do, as shown below.

My result:
----------------------------------------------------------------
Result Log
Result Name: MCM1_ 0001860_ 6118_ 0--
<core_client_version>7.2.33</core_client_version>
<![CDATA[
<stderr_txt>
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.28_windows_x86_64 -SettingsFile MCM1_0001860_6118.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Settings File
DateOfDesign = 11/08/2013
Designer = PMCC_OCI
WorkOrderID = 1860_6118
DatasetID = 17_72_SDG_v1
NumberOfGenesInStartingSignature = 13
NumberOfGenesInSignatureMin = 10
NumberOfGenesInSignatureMax = 20
GroupVectorValues = {A}{B}{C}{D}{E}{F}
ExplicitStartingGeneSignatures = A B D F
StartingGeneSignatureAlgorithm = randomFixedLengthSearch
SearchAlgorithmNumberToCreate = 28638
SearchAlgorithmSequentialStartPosition = 5
RunPermutationAlgorithm = 0
PermutationGroups = A
PermutationGroupsForReplacement = G
PermutationAlgorithm = replaceFromRandomlyToRandomlyGreedy
PermutationsNumIterations = 28638
OptimizationAlgorithmFrequency = 0 0 1
FBeta = 1.5
SimAnnealIMax = 20000
SimAnnealAlpha = 0.9996
NReps = 9
TrainFrac = 1.0
NFolds = 10
VMethod = OOB
ModelType = SVM
FitnessFn = 0
MinFitness = -1
SvmArgs = "-v 0 -c 0.01 -t 1 -d 3 -r 0"
SvmLearnLimit = 500000
RSeed = 379576118
[01:14:52] Initializing
[01:14:58] Running
[01:14:59] EvaluateFitnessOfStartingGeneSignatures 28638
[07:16:19] Writing final output
[07:16:20] Closing Output Stream
[07:16:20] Cleaning up
Result.out = 5659313.000000
Run complete, CPU time: 21626.122228
07:16:20 (10004): called boinc_finish
</stderr_txt>
]]>
-------------------------------------------------
The other's:
-------------------------------------------------
Result Log
Result Name: MCM1_ 0001860_ 6118_ 1--
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.28_windows_x86_64 -SettingsFile MCM1_0001860_6118.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Settings File
DateOfDesign = 11/08/2013
Designer = PMCC_OCI
WorkOrderID = 1860_6118
DatasetID = 17_72_SDG_v1
NumberOfGenesInStartingSignature = 13
NumberOfGenesInSignatureMin = 10
NumberOfGenesInSignatureMax = 20
GroupVectorValues = {A}{B}{C}{D}{E}{F}
ExplicitStartingGeneSignatures = A B D F
StartingGeneSignatureAlgorithm = randomFixedLengthSearch
SearchAlgorithmNumberToCreate = 28638
SearchAlgorithmSequentialStartPosition = 5
RunPermutationAlgorithm = 0
PermutationGroups = A
PermutationGroupsForReplacement = G
PermutationAlgorithm = replaceFromRandomlyToRandomlyGreedy
PermutationsNumIterations = 28638
OptimizationAlgorithmFrequency = 0 0 1
FBeta = 1.5
SimAnnealIMax = 20000
SimAnnealAlpha = 0.9996
NReps = 9
TrainFrac = 1.0
NFolds = 10
VMethod = OOB
ModelType = SVM
FitnessFn = 0
MinFitness = -1
SvmArgs = "-v 0 -c 0.01 -t 1 -d 3 -r 0"
SvmLearnLimit = 500000
RSeed = 379576118
[00:49:29] Initializing
[00:49:35] Running
[00:49:35] EvaluateFitnessOfStartingGeneSignatures 28638
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.28_windows_x86_64 -SettingsFile MCM1_0001860_6118.txt -DatabaseFile dataset-17_72_SDG_v1.txt
[01:58:53] Initializing
[01:59:00] Running
[01:59:00] EvaluateFitnessOfStartingGeneSignatures 28638
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.28_windows_x86_64 -SettingsFile MCM1_0001860_6118.txt -DatabaseFile dataset-17_72_SDG_v1.txt
[02:09:02] Initializing
[02:09:09] Running
[02:09:09] EvaluateFitnessOfStartingGeneSignatures 28638
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.28_windows_x86_64 -SettingsFile MCM1_0001860_6118.txt -DatabaseFile dataset-17_72_SDG_v1.txt
[02:48:01] Initializing
[02:48:08] Running
[02:48:08] EvaluateFitnessOfStartingGeneSignatures 28638
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.28_windows_x86_64 -SettingsFile MCM1_0001860_6118.txt -DatabaseFile dataset-17_72_SDG_v1.txt
[03:11:52] Initializing
[03:11:58] Running
[03:11:59] EvaluateFitnessOfStartingGeneSignatures 28638
[07:48:43] Writing final output
[07:48:44] Closing Output Stream
[07:48:44] Cleaning up
Result.out = 5659331.000000
Run complete, CPU time: 21763.715495
07:48:44 (11872): called boinc_finish
</stderr_txt>
]]>

There are more than 10 PVER WUs of this kind. As I understand it, there used to be a restart issue, but it should have been resolved by now. Is this a new monster?
[Edit 1 times, last edit by Former Member at Feb 4, 2014 12:23:10 PM] |
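For anyone who wants to check their own result logs the same way, here is a rough Python sketch that counts how often the application started and pulls out the final Result.out value. It assumes each start or restart writes one "[hh:mm:ss] Initializing" line, as in the logs quoted above; the function name `summarize_stderr` and the shortened sample text are made up for illustration and are not part of any WCG or BOINC tool.

```python
import re


def summarize_stderr(stderr_text: str) -> dict:
    """Count application starts and extract the final Result.out value.

    Assumption: every start or restart logs one "[hh:mm:ss] Initializing" line.
    """
    starts = len(re.findall(r"\[\d{2}:\d{2}:\d{2}\] Initializing", stderr_text))
    match = re.search(r"Result\.out = ([0-9.]+)", stderr_text)
    return {
        "starts": starts,
        # The first start is not a restart, so subtract one (never below zero).
        "restarts": max(starts - 1, 0),
        "result_out": float(match.group(1)) if match else None,
    }


# Shortened, hand-made sample in the same shape as the quoted logs.
sample = (
    "[00:49:29] Initializing [00:49:35] Running "
    "[01:58:53] Initializing [01:59:00] Running "
    "[07:48:43] Writing final output Result.out = 5659331.000000"
)
print(summarize_stderr(sample))
```

A result with zero restarts and a different Result.out from a wingman whose log shows several restarts is the pattern described in this post.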
||
|
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7846 Status: Offline Project Badges:
|
Yes, also seeing it here. I have gotten some more invalids, but as yet I can see no pattern. After a period of very good stability, it seems some issues are resurfacing. It could be the restart issue again; maybe the techs will give it another look.
----------------------------------------
Cheers
Sgt. Joe
*Minnesota Crunchers* |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I've had one recent Invalid after a restart here, too. My WU was MCM1_ 0001888_ 9872_ 0-- and gave Result.out = 1688227.000000. The _1 and _2 Valid WUs both gave Result.out = 1688332.000000.
|
||
|
|
armstrdj
Former World Community Grid Tech Joined: Oct 21, 2004 Post Count: 695 Status: Offline Project Badges:
|
We are looking into the issues and will report back once we know more. Based on reports in the forums, it does look like an issue with restoring from a checkpoint for a certain type of workunit we are running now. Until we have a permanent solution in place, a temporary workaround is to turn on the setting that leaves the application in memory when suspended, which reduces the likelihood of a restart.
Thanks, armstrdj |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Thanks for your reports, Joe and Tony.
The WUs still left in PVER now are:
MCM1_ 0001894_ 1326
MCM1_ 0001897_ 9670
MCM1_ 0001897_ 8046
MCM1_ 0001897_ 6976
MCM1_ 0001886_ 5902
MCM1_ 0001874_ 4840
MCM1_ 0001874_ 3642
Hope this information helps the techs.
Kiyo. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Kiyo, are you aware that PVer (Pending Verification) by itself doesn't necessarily imply that there is a problem with your result? Where 2 workunits are sent out for a single task, it does probably mean that one of the results has a problem, but the other workunit will also have the status of PVer until an extra (third) workunit is returned, at which point the results will turn Valid or Invalid. Hence, it's the final status (Valid or Invalid) that really determines whether your workunit had a problem or not.
The FAQ entitled Work Unit progress state descriptions gives full details. |
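To make the PVer logic above concrete, here is a toy Python sketch. It assumes results are compared by exact equality of Result.out and that two matching results are enough to decide; the real validator may well compare differently. The function `classify` and its `quorum` parameter are hypothetical names, and the numbers are the ones reported earlier in this thread for MCM1_ 0001888_ 9872.

```python
from collections import Counter


def classify(results: dict[str, float], quorum: int = 2) -> dict[str, str]:
    """Label each returned result as PVer, Valid, or Invalid.

    Assumption: results agree only if their Result.out values are equal,
    and `quorum` matching results are enough to decide the task.
    """
    if not results:
        return {}
    value, votes = Counter(results.values()).most_common(1)[0]
    if votes < quorum:
        # No agreement yet: everything waits for another result.
        return {name: "PVer" for name in results}
    return {
        name: "Valid" if out == value else "Invalid"
        for name, out in results.items()
    }


# Two results that disagree: both stay PVer.
print(classify({"_0": 1688227.0, "_1": 1688332.0}))

# A third result arrives and agrees with _1: the statuses resolve.
print(classify({"_0": 1688227.0, "_1": 1688332.0, "_2": 1688332.0}))
```

With only two disagreeing results the sketch cannot tell which one is wrong, which is why both show PVer until the third comes back.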
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Tony, I raised my hand to report that the restart issue seems to be coming back, not to report the PVER status itself. I know what you are saying. Actually, most PVERs are validated within a couple of days and turn to 'Valid' status.
I have 3 MCM1 and 3 FAVH results marked PVER and no invalids right now. With 5 instances of BOINC running, that is a rather small number of PVERs, and it is not an issue in itself. If my report was misleading, I am sorry for my poor English. |
||
|
|
|