World Community Grid Forums
Category: Active Research | Forum: Mapping Cancer Markers Forum | Thread: Explanation of an error unit
Thread Status: Active Total posts in this thread: 4
Martin Schnellinger
Advanced Cruncher | Joined: Apr 29, 2007 | Post Count: 123
Hello friends,
Workunit MCM1_0196328_9192 returned as an error for me. My result log is:

<core_client_version>7.2.47</core_client_version>
<![CDATA[
<message>
finish file present too long
</message>
<stderr_txt>
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_map_7.61_windows_x86_64 -SettingsFile MCM1_0196328_9192.txt -DatabaseFile dataset-sarc1.txt
Settings File
DateOfDesign = 20200218
Designer = Krembil/cubes
WorkOrderID = 0196328_9192
DatasetID = sarc1
RSeed = 360009193
StartingGeneSignatureAlgorithm = randomFixedLengthSearch
RunPermutationAlgorithm = 0
FitnessFn = 0
NumberOfGenesInStartingSignature = 20
NumberOfGenesInSignatureMin = 20
NumberOfGenesInSignatureMax = 20
SearchAlgorithmNumberToCreate = 12071
MinFitness = 0.497
VMethod = NFCV
NFolds = 20
SvmArgs = "-v 0 -t 0 -c 1000"
SvmLearnLimit = 250000
[02:59:40] Initializing
[03:00:13] Running
[03:00:13] EvaluateFitnessOfStartingGeneSignatures 12071
[20:08:59] Writing final output
[20:08:59] Closing Output Stream
[20:08:59] Cleaning up
Result.out = 27420.000000
Run complete, CPU time: 6559.779650
20:09:14 (9056): called boinc_finish(0)
</stderr_txt>
]]>

My wingman has the result as Pending validation. His result log is:

<core_client_version>7.20.2</core_client_version>
<![CDATA[
<stderr_txt>
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_map_7.61_windows_x86_64 -SettingsFile MCM1_0196328_9192.txt -DatabaseFile dataset-sarc1.txt
Settings File
DateOfDesign = 20200218
Designer = Krembil/cubes
WorkOrderID = 0196328_9192
DatasetID = sarc1
RSeed = 360009193
StartingGeneSignatureAlgorithm = randomFixedLengthSearch
RunPermutationAlgorithm = 0
FitnessFn = 0
NumberOfGenesInStartingSignature = 20
NumberOfGenesInSignatureMin = 20
NumberOfGenesInSignatureMax = 20
SearchAlgorithmNumberToCreate = 12071
MinFitness = 0.497
VMethod = NFCV
NFolds = 20
SvmArgs = "-v 0 -t 0 -c 1000"
SvmLearnLimit = 250000
[01:18:33] Initializing
[01:18:38] Running
[01:18:38] EvaluateFitnessOfStartingGeneSignatures 12071
[02:03:25] Writing final output
[02:03:25] Closing Output Stream
[02:03:25] Cleaning up
Result.out = 27420.000000
Run complete, CPU time: 101.406250
02:03:25 (17644): called boinc_finish(0)
</stderr_txt>
]]>

The Result.out value is identical in both cases. If Result.out is the same, both results should be counted identically. It is true that the calculation took a very long time in my case; too long, maybe? Can anyone explain, please?

Thank you very much.
Good bye
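For anyone who wants to compare two result logs like these without reading them line by line, here is a minimal Python sketch. It pulls out Result.out, the CPU time, and the gap between the last bracketed log step and the boinc_finish call. The function name `summarize` and the shortened log excerpts are illustrative assumptions, not part of any BOINC or WCG tool. The gap computed here is also not the interval the client itself measures for "finish file present too long"; it is only a rough hint of how responsive the machine was at the end of the task.

```python
import re
from datetime import datetime, timedelta

# Tail ends of the two stderr logs quoted in the post above.
MARTIN_LOG = """[20:08:59] Writing final output
[20:08:59] Closing Output Stream
[20:08:59] Cleaning up
Result.out = 27420.000000
Run complete, CPU time: 6559.779650
20:09:14 (9056): called boinc_finish(0)
"""

WINGMAN_LOG = """[02:03:25] Writing final output
[02:03:25] Closing Output Stream
[02:03:25] Cleaning up
Result.out = 27420.000000
Run complete, CPU time: 101.406250
02:03:25 (17644): called boinc_finish(0)
"""

def summarize(stderr: str) -> dict:
    """Extract the comparable numbers from one stderr excerpt (hypothetical helper)."""
    result_out = float(re.search(r"Result\.out = ([\d.]+)", stderr).group(1))
    cpu_time = float(re.search(r"CPU time: ([\d.]+)", stderr).group(1))
    # Last "[hh:mm:ss]" step written by the science application.
    last_step = re.findall(r"\[(\d\d:\d\d:\d\d)\]", stderr)[-1]
    # Timestamp on the "called boinc_finish" line.
    finish = re.search(r"(\d\d:\d\d:\d\d) \(\d+\): called boinc_finish", stderr).group(1)
    fmt = "%H:%M:%S"
    gap = datetime.strptime(finish, fmt) - datetime.strptime(last_step, fmt)
    if gap < timedelta(0):  # the task finished just after midnight
        gap += timedelta(days=1)
    return {
        "result_out": result_out,
        "cpu_time": cpu_time,
        "gap_seconds": gap.total_seconds(),
    }

mine = summarize(MARTIN_LOG)
wingman = summarize(WINGMAN_LOG)
print("same Result.out:", mine["result_out"] == wingman["result_out"])
print("gap before boinc_finish:", mine["gap_seconds"], "vs", wingman["gap_seconds"])
```

On these two excerpts the Result.out values match, while the gap before boinc_finish is 15 seconds on the errored host and 0 seconds on the wingman's host.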
adriverhoef
Master Cruncher | The Netherlands | Joined: Apr 3, 2009 | Post Count: 2084
Martin Schnellinger, this is your problem:

<message>
finish file present too long
</message>

As to why that happened, that is another story. It's a timing error (BOINC timing out waiting for results), not a problem with the actual results.

From a message posted by Richard Haselgrove:

"finish file present too long" is a 10 second limit hard coded into the BOINC client: when the science is all over and done, the science application should write a file to say that it's completed and BOINC can clean up and report it: then the science application should shut itself down and get out of the way so that BOINC can move on to the next job.

UPDATE: I've found an interesting addition to this, from post 530714, posted by Rickjb, about system responsiveness:

I tried searching the WCG forum for "Finish file too long". I was interpreting "file too long" as meaning that there were extra data being scribbled onto some WU result file, but Richard Haselgrove's post at https://boinc.berkeley.edu/dev/forum_thread.php?id=10354&postid=62717 gave me the clue: "I think the "too long" message refers to any delay between signalling that the app has finished, and the process finally quitting."

Rickjb finishes with a practical case:

For some months I've been running the machine that's now getting errors under Linux x64 from a 16GB USB stick. Until recently, it was a high-speed USB3.0 stick in a USB3.0 port, and system responsiveness including the Xfce GUI was very good. There were no errors crunching WCG. (FFPTL = finish file present too long)

Then BOINC crashed and would not restart, and I could not reboot Linux. The USB stick has set itself to read-only and it cannot be mounted under either Linux or Windows. I was able to dd the entire 16GB device image to a new USB2.0 stick without any read errors. On another machine I ran fsck -f -y on the newly-copied Linux partition, put the USB2.0 stick into the first machine, booted it and continued. WUs that were in progress restarted and continued successfully.

System responsiveness is pretty terrible, and it's generating FFPTL errors, which I guess are happening when BOINC activities are forced to wait for higher-priority Linux filesystem activities.

It also happened to me many years ago: when my system was getting sluggish and unresponsive, BOINC would generate the error "finish file present too long" during the last moments of a task (at 100%).

Adri

[Edit 3 times, last edit by adriverhoef at Feb 14, 2023 11:09:22 AM]
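The sequence described in this post (the application writes a finish file, then has to exit within a fixed limit) can be modelled with a few lines of Python. This is a simplified sketch of the idea only, not the BOINC client's actual code: the names `TaskState`, `check_task` and `FINISH_FILE_GRACE_SECONDS` are made up for illustration, and the only value taken from the thread is the 10 second limit.

```python
from dataclasses import dataclass
from typing import Optional

# 10 seconds is the limit quoted in this thread; everything else is an assumption.
FINISH_FILE_GRACE_SECONDS = 10.0

@dataclass
class TaskState:
    finish_file_written_at: Optional[float]  # seconds; None if not written yet
    process_exited_at: Optional[float]       # seconds; None while still running

def check_task(state: TaskState, now: float,
               grace: float = FINISH_FILE_GRACE_SECONDS) -> str:
    """Classify a task the way the post describes the client's check."""
    if state.finish_file_written_at is None:
        return "running"  # the science application is still computing
    # How long the process has lingered (or did linger) after the finish file appeared.
    end = state.process_exited_at if state.process_exited_at is not None else now
    lingered = end - state.finish_file_written_at
    if lingered > grace:
        return "error: finish file present too long"
    if state.process_exited_at is not None:
        return "finished"
    return "waiting for exit"

# A responsive host: the process exits 2 seconds after writing the finish file.
print(check_task(TaskState(100.0, 102.0), now=103.0))
# A sluggish host: the process is still there 11 seconds later.
print(check_task(TaskState(100.0, None), now=111.0))
```

In this model the science result plays no part in the decision. Only the time between writing the finish file and the process exiting matters, which fits the observation that a sluggish system can trigger the error even when the computed result is fine.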
Sgt.Joe
Ace Cruncher | USA | Joined: Jul 4, 2006 | Post Count: 7578
I concur with Rickjb's findings. I, too, run several systems with 16GB USB drives and usually have not had any problems. However, a couple of times there have been power glitches of one sort or another which have toasted the drives and caused them to be turned into read-only bricks. (Not the cause of the error here in question.)

The "finish file too long" errors have come in conjunction with systems which became sluggish for unknown reasons, possibly non-fatal degradation of the USB drives. Replacing the drive cured that problem.

I have suspected this error could also have been caused by an internet connection which had become a bit flaky, but I have no real basis for this other than speculation. The thought was that the file existed for too long between the time it was written and the time it should have been transmitted by BOINC.

The good part is this error has not recurred for a considerable length of time.

Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
alanb1951
Veteran Cruncher | Joined: Jan 20, 2006 | Post Count: 869
The latest clients allow a longer time for clean-up; I've not had this error in ages, but sometimes I see a wingman with a recent client returning an Error with

Process still present 5 min after writing finish file.

in the error log. I don't see as many of these as I used to see of the shorter-duration ones...

I note that Martin Schnellinger is using client 7.2.47; if there's no good reason he can't upgrade, I'd suggest that he gets something newer :-) If he can't, then it's likely to happen again...

Cheers - Al.
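A side note on reading the version numbers in the two logs: the parts are separate integers, not decimal fractions, so 7.2.47 is older than the wingman's 7.20.2. A minimal Python sketch of that comparison follows. The helper name `client_version` is hypothetical, and the thread does not say which release introduced the longer limit, so this only orders the two versions.

```python
import re

def client_version(log: str) -> tuple:
    """Read <core_client_version> from a result log as a tuple of integers."""
    match = re.search(r"<core_client_version>([\d.]+)</core_client_version>", log)
    return tuple(int(part) for part in match.group(1).split("."))

martin = client_version("<core_client_version>7.2.47</core_client_version>")
wingman = client_version("<core_client_version>7.20.2</core_client_version>")

# Tuples compare element by element, so (7, 2, 47) sorts before (7, 20, 2).
print(martin, "is older than", wingman, ":", martin < wingman)
```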