| World Community Grid Forums
Thread Status: Active Total posts in this thread: 57
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7846 Status: Offline Project Badges:
|
The link to your WU will not work. Please post a copy of the stderr.txt file.
Cheers
Sgt. Joe
*Minnesota Crunchers* |
||
|
|
cornel
Cruncher Joined: Jan 29, 2009 Post Count: 4 Status: Offline Project Badges:
|
Here it is:
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
Commandline = ../../projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.36_x86_64-pc-linux-gnu -SettingsFile MCM1_0122764_6720.txt -DatabaseFile dataset-curatedOvarian_EarlyLate_v1.0.txt
Settings File
DateOfDesign = 08/05/2014
Designer = PMCC_OCI_0.1
WorkOrderID = 0122764_6720
DatasetID = curatedOvarian_EarlyLate_v1.0
NumberOfGenesInStartingSignature = 30
NumberOfGenesInSignatureMin = 30
NumberOfGenesInSignatureMax = 30
GroupVectorValues = {A}{B}{C}{D}{E}{F}
ExplicitStartingGeneSignatures = A B D F
StartingGeneSignatureAlgorithm = randomFixedLengthSearch
SearchAlgorithmNumberToCreate = 60
SearchAlgorithmSequentialStartPosition = 5
RunPermutationAlgorithm = 0
PermutationGroups = A
PermutationGroupsForReplacement = G
PermutationAlgorithm = replaceFromRandomlyToRandomlyGreedy
PermutationsNumIterations = 0
OptimizationAlgorithmFrequency = 0 0 1
FBeta = 1.5
SimAnnealIMax = 20000
SimAnnealAlpha = 0.9996
FitnessFn = 0
MinFitness = -1.0
NReps = 10
TrainFrac = 0.7
NFolds = 10
VMethod = LOO
ModelType = SVM
SvmArgs = "-v 0 -c 0.1 -t 1 -d 2 -r 0"
SvmLearnLimit = 500000
RSeed = 184056721
[15:52:08] Initializing
[15:52:12] Running
[15:52:12] EvaluateFitnessOfStartingGeneSignatures 60
SIGSEGV: segmentation violation
Stack trace (17 frames):
[0x498bcd]
[0x480080]
[0x410a7b]
[0x415017]
[0x4188eb]
[0x41ca30]
[0x474200]
[0x46c267]
[0x46bf8a]
[0x4681c7]
[0x468ac3]
[0x43f91d]
[0x442fd2]
[0x4430b5]
[0x4258d6]
[0x5174ab]
[0x400449]
Exiting...
</stderr_txt>
]]> |
||
|
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7846 Status: Offline Project Badges:
|
process exited with code 193 (0xc1, -63)
SIGSEGV: segmentation violation
These indicate some type of memory violation. See: this
Advice is to "Use a memory checking program like memtest86+ to rigorously test your memory."
Cheers
Sgt. Joe
*Minnesota Crunchers* |
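The three numbers in "code 193 (0xc1, -63)" look like the same 8-bit exit status shown three ways: unsigned decimal, hexadecimal, and signed 8-bit. A minimal Python sketch of that reading follows; the helper name `decode_exit_status` is hypothetical and not part of BOINC. The number on its own does not say which fault occurred; the "SIGSEGV: segmentation violation" line in stderr is what names it.

```python
def decode_exit_status(code: int) -> dict:
    """Return one 8-bit exit status in decimal, hex, and signed 8-bit form.

    Hypothetical helper for illustration only; it is not a BOINC function.
    """
    if not 0 <= code <= 255:
        raise ValueError("an 8-bit exit status must be in the range 0..255")
    # Reinterpret the byte as a two's-complement signed value.
    signed = code - 256 if code >= 128 else code
    return {"decimal": code, "hex": hex(code), "signed_8bit": signed}


info = decode_exit_status(193)
print(f"{info['decimal']} ({info['hex']}, {info['signed_8bit']})")
```

Run on 193, this reproduces the same three values that appear in the message line of the stderr dump.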
||
|
|
cornel
Cruncher Joined: Jan 29, 2009 Post Count: 4 Status: Offline Project Badges:
|
Thanks for the advice!
I did one full pass with Memtest86 5.01, with no errors. I will leave the machine on until tomorrow to see whether any RAM errors show up.

Is anyone else having problems with this project on Linux? It would be really strange if hardware faults affected only MCM and not the other projects on the same machine.

If there is any way to set a specific computer to exclude one project (MCM in this case), I would be thankful to hear it.

Regards,
Corneliu |
||
|
|
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline Project Badges:
|
@cornel: "If there is any way to set a specific computer to exclude one project (MCM in this case), I would be thankful to hear it."
You can do this using your Device Manager and Device Profiles in your WCG website account, under "Settings" at the top of the pages.

Re your possible hardware problem: Memtest86 used to be a wimpy test, but later versions such as 5.06 on current Debian Live CDs/USB sticks have a use-all-cores option that's pretty ferocious on a hyperthreading quad, as it runs 8 threads simultaneously. Maybe not so severe on your AMD, but give it a go.

Also, start a utility that displays your CPU temperature and then run Intel Burn Test (IBT) with the "custom" setting to use as much memory as you can. If it passes an hour or so of IBT, you might also try some stress testing with Prime95.

Hey, AMD CPUs may be slow, but at least they give the right answers |
||
|
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
If you read the last post by armstrdj, it seems there's a differing opinion on what some AMD CPUs actually output. :p |
||
|
|
cornel
Cruncher Joined: Jan 29, 2009 Post Count: 4 Status: Offline Project Badges:
|
Thanks for the advice!
I could not run Intel Burn Test, but got a working USB stick loaded with Prime95. On the XenCenter console, with all VMs shut down and the burn test running, I got this not-so-nice line:

[Thu May 5 20:02:09 2016]
Self-test 12K passed!
FATAL ERROR: Resulting sum was 1.908481569586742e+30, expected: 1.908481569586857e+30
Hardware failure detected, consult stress.txt file.
Self-test 768K passed!
[Thu May 5 20:09:26 2016]

I changed the device profile as advised and I'll monitor the UGM, OET and FAAH workunits. If they turn out to be invalid, it looks like 8 years is all I got out of this rig...

Best regards,
Corneliu |
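To put the Prime95 message in perspective, the two sums agree in their first 13 significant digits and differ only after that. The Python sketch below works out the size of the mismatch from the two values quoted in the log; the variable names are illustrative. It uses `decimal` so the subtraction is exact rather than subject to floating-point rounding. A mismatch of any size still counts as a failure, on the assumption that the self-test is deterministic and should reproduce the expected sum on sound hardware.

```python
from decimal import Decimal

# Values copied from the Prime95 FATAL ERROR line in the post above.
resulting_sum = Decimal("1.908481569586742e+30")
expected_sum = Decimal("1.908481569586857e+30")

# Exact absolute and relative size of the mismatch.
abs_diff = abs(resulting_sum - expected_sum)
rel_diff = abs_diff / expected_sum

print(f"absolute difference: {abs_diff:.3E}")
print(f"relative difference: {rel_diff:.2E}")
```

The relative difference comes out on the order of 6e-14, tiny in size but still enough for the self-test to flag the result.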
||
|
|
|