| World Community Grid Forums
Thread Status: Active Total posts in this thread: 22
|
|
Gmansell
Cruncher Joined: Dec 31, 2008 Post Count: 5 Status: Offline Project Badges:
|
I had to abort 2 phase 2 batches that had a computational error. XP SP4. Any suggestions?
Thanks |
||
|
|
astroWX
Advanced Cruncher USA Joined: Sep 1, 2007 Post Count: 56 Status: Offline Project Badges:
|
What did you see, and where, that caused you to determine that you "had to abort 2 phase 2 batches" -- and, if they had computational errors, why did you have to abort them? They had errors and did not terminate on their own? We need more information ...
|
||
|
|
Gmansell
Cruncher Joined: Dec 31, 2008 Post Count: 5 Status: Offline Project Badges:
|
"Computational error" was shown in the Status column of BOINC Manager. The app was not using any CPU cycles, so I aborted it. I will get a screenshot at the next occurrence. It did not self-terminate.
|
||
|
|
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline Project Badges:
|
... "It did not self-terminate" ...
WUs that show Computational Error in the Status column are usually reported to WCG automatically the next time BOINC fetches new WUs, so you should not have to abort them manually from BOINC Manager. The error log in the Results Status page on the WCG website might be lost if you abort them (I forget, so you might take a look).
OTOH, if a WU has hung, i.e. it is not showing Computational Error but its % Complete is not increasing, it should still show up as a task in the operating system's task list (Task Manager in Windows, ps in Linux). You might need to abort these, but be aware that some WCG projects update their % Complete value only infrequently. Very occasionally I have had a zombie WU task that shows in Task Manager but not in BOINC Manager, and I've had to kill it from Task Manager. |
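The "hung WU" check described above can be sketched in Python. This is only an illustration: the task names and progress values are made up, and the two snapshots are assumed to be read by hand from the % Complete column some time apart. Because some projects update % Complete only infrequently, a flagged task is a hint to look closer, not proof that it has hung.

```python
from typing import Dict, List


def find_stalled_tasks(
    earlier: Dict[str, float],
    later: Dict[str, float],
    min_progress: float = 0.0,
) -> List[str]:
    """Return names of tasks whose % Complete did not advance between snapshots.

    Both arguments map a task name to its % Complete (0-100).
    Tasks missing from the later snapshot are skipped: they finished,
    errored out, or were reported in the meantime.
    """
    stalled = []
    for name, before in earlier.items():
        after = later.get(name)
        if after is None:
            continue
        if after - before <= min_progress:
            stalled.append(name)
    return sorted(stalled)


if __name__ == "__main__":
    # Hypothetical snapshots taken a few hours apart.
    snapshot_1 = {"task_a": 12.5, "task_b": 40.0, "task_c": 99.0}
    snapshot_2 = {"task_a": 30.0, "task_b": 40.0}
    print(find_stalled_tasks(snapshot_1, snapshot_2))
```

Choose an interval between snapshots that is long compared with how often the project normally updates its progress value.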
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Dear Gmansell,
Rickjb is right - you generally should not and don't have to terminate anything yourself. BOINC does all this error handling for you.
Best wishes from
Your Harvard CEP team |
||
|
|
madd1n
Cruncher Joined: Feb 1, 2006 Post Count: 7 Status: Offline Project Badges:
|
I get this frequently, too, and I can provide a pattern. It has frustrated me enough that I opted out of the project. This always happens when I shut down my computer while a Clean Energy Project work unit is being computed. The progress apparently isn't saved properly, and when I boot my machine again it always says "Computational Error", shows a progress of 100% and the runtime elapsed until then and then aborts itself. Hibernation kills it the same way. Since I never keep my machine running for days at a time, it's useless for me to try and work on this project - the units never complete before I reboot. All other projects save their progress on system shutdown and resume when I reboot.
[Edit 1 times, last edit by madd1n at May 7, 2012 6:46:27 AM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hi and welcome to the forums,
Your description sounded familiar, and there would have been a fix for it, but the fact that hibernation fails as well more or less rules that solution out. Still, if you post the startup message log of the client (the first 30 lines), we may see something that allows some specific advice. The OS, the client version and how the client was installed are of particular interest; the startup log shows all of that and more.
The manual workaround is to exit BOINC before shutting down, which ensures all files are saved before the busy shutdown procedure causes BOINC to hang and the OS forces it to close prematurely. The same applies during system startup. For that there is a setting that keeps BOINC from starting immediately, a delay option. Mine is set to 60 seconds with <start_delay>60</start_delay>. This is described in various places in the extended Vista FAQ in the Start Here forum. Let us know. --//-- |
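For anyone unsure where the <start_delay> line goes, here is a minimal Python sketch that builds the surrounding XML. It assumes the option lives in the <options> section of the client's cc_config.xml, which the post above does not spell out; the function name is made up for illustration. The script only prints the XML and does not touch any BOINC files.

```python
import xml.etree.ElementTree as ET


def build_cc_config(start_delay_seconds: int) -> str:
    """Build a cc_config.xml document containing only a start_delay option."""
    if start_delay_seconds < 0:
        raise ValueError("start delay must not be negative")
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    delay = ET.SubElement(options, "start_delay")
    delay.text = str(start_delay_seconds)
    # ET.indent is available from Python 3.9; older versions emit one line.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # 60 seconds, as in the post above.
    print(build_cc_config(60))
```

If a cc_config.xml already exists, merge the <start_delay> element into its existing <options> section rather than overwriting the file.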
||
|
|
LAZA74
Advanced Cruncher Germany Joined: Sep 28, 2008 Post Count: 56 Status: Offline Project Badges:
|
Hi all!
I have had the same problems since the 4th of May, only with TCEP-P2:
Message at start: http://pastebin.com/L3Aqf4RM
Failure: http://pastebin.com/7837c5Q0
I hope I got all the interesting messages. To me, it looks like the WUs from this project are defective (or something like that). If more info is needed, let me know.
Thanks in advance
LAZA
NAS - Eigenbau
Xiaomi Mi 10T |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Sorry LAZA74, but I see no errors, failures, heartbeat issues, zero status or Signal 11/15 in your failure client message log. If the Result Status page lists the CEP2 results as ended in "Error" (which is what counts), then please click those status links and post the Result Logs.
6.12.33 is what I run on Linux 11.10 (which is exactly what you're running) on a dual-boot quad, with the mentioned 40% non-BOINC load pause to stop the heartbeat / Signal 11-15 occurrences. That means BOINC pauses when the system is under general normal/high-priority [non-BOINC] CPU loads of greater than 40%. --//-- |
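The 40% rule can be written down as a tiny Python sketch. It only illustrates the threshold logic described above and is not how the BOINC client implements it; the function and parameter names are made up, and the CPU figures are assumed to be percentages of total CPU capacity.

```python
def should_pause(
    total_cpu_percent: float,
    boinc_cpu_percent: float,
    threshold_percent: float = 40.0,
) -> bool:
    """Return True when non-BOINC CPU load is above the threshold."""
    non_boinc_load = max(0.0, total_cpu_percent - boinc_cpu_percent)
    # "Greater than 40%": a load of exactly 40% does not trigger a pause.
    return non_boinc_load > threshold_percent


if __name__ == "__main__":
    # 95% total load of which 50% is BOINC: 45% non-BOINC load.
    print(should_pause(95.0, 50.0))
    # 90% total load of which 60% is BOINC: 30% non-BOINC load.
    print(should_pause(90.0, 60.0))
```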
||
|
|
LAZA74
Advanced Cruncher Germany Joined: Sep 28, 2008 Post Count: 56 Status: Offline Project Badges:
|
Okay...
I got errors in these WUs:
E207542_ 956_ C.27.C24H14OS2.01879920.0.set1d06_ 0-- ubuntu Error 07.05.12 06:50:33 07.05.12 06:52:57 0.00 0.1 / 0.0
E207542_ 943_ C.26.C20H8S4Se2.01760109.4.set1d06_ 0-- ubuntu Error 07.05.12 06:48:38 07.05.12 06:50:33 0.00 0.1 / 0.0
E207542_ 790_ C.26.C22H14N2SeSi.01824233.2.set1d06_ 1-- ubuntu Error 07.05.12 06:30:55 07.05.12 06:48:38 0.00 0.1 / 0.0
E207538_ 778_ C.27.C23H14N2S2.01862558.3.set1d06_ 1-- ubuntu Error 07.05.12 03:09:43 07.05.12 06:30:55 0.00 0.1 / 0.0
E207539_ 254_ C.26.C22H14N2SeSi.01712951.4.set1d06_ 0-- ubuntu Error 07.05.12 03:03:10 07.05.12 03:06:04 0.01 0.4 / 0.0
E207536_ 423_ C.28.C22H12N4OS.01457647.0.set1d06_ 0-- ubuntu Error 06.05.12 21:44:36 06.05.12 22:56:34 0.00 0.2 / 0.0
E207533_ 933_ C.28.C23H12N2O2S.01470833.2.set1d06_ 0-- ubuntu Error 06.05.12 21:42:13 06.05.12 21:44:36 0.00 0.2 / 0.0
E207534_ 089_ C.26.C22H14N2SeSi.01315785.2.set1d06_ 1-- ubuntu Error 06.05.12 21:38:00 06.05.12 21:42:13 0.00 0.2 / 0.0
E207534_ 692_ C.27.C22H12N2O2Se.01223633.3.set1d06_ 1-- ubuntu Error 06.05.12 20:29:18 06.05.12 21:38:00 0.00 0.2 / 0.0
E207528_ 813_ C.28.C22H12N4OS.01546308.1.set1d06_ 0-- ubuntu Error 06.05.12 12:04:19 06.05.12 20:29:18 0.00 0.2 / 0.0
E207524_ 502_ C.27.C22H14N2OSSi.00849728.0.set1d06_ 1-- ubuntu Error 06.05.12 06:33:27 06.05.12 12:04:19 0.00 0.1 / 0.0
E207523_ 710_ C.28.C22H12N4OS.00857208.1.set1d06_ 1-- ubuntu Error 06.05.12 06:31:57 06.05.12 06:33:25 0.02 0.5 / 0.0
E207525_ 052_ C.26.C21H16N2SSi2.00888973.4.set1d06_ 0-- ubuntu Error 06.05.12 06:30:12 06.05.12 06:31:56 0.00 0.1 / 0.0
E207524_ 247_ C.27.C24H14OS2.01006859.0.set1d06_ 1-- ubuntu Error 06.05.12 06:29:34 06.05.12 06:30:12 0.01 0.2 / 0.0
E207524_ 920_ C.26.C18H8N2S3Se2Si.00935754.0.set1d06_ 0-- ubuntu Error 06.05.12 06:27:49 06.05.12 06:29:34 0.00 0.1 / 0.0
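To see how quickly each of these results failed, the two timestamps in a row can be subtracted. The Python sketch below assumes that the first timestamp is the sent time and the second is the return time, both in DD.MM.YY HH:MM:SS format; the post does not label the columns, so treat that reading as a guess. The sample rows are copied from the list above.

```python
import re
from datetime import datetime

# Matches timestamps such as "07.05.12 06:50:33".
TIMESTAMP = re.compile(r"\d{2}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2}")


def seconds_between_timestamps(row: str) -> float:
    """Return the seconds between the first and second timestamp in a row."""
    stamps = TIMESTAMP.findall(row)
    if len(stamps) != 2:
        raise ValueError(f"expected 2 timestamps, found {len(stamps)}")
    first, second = (datetime.strptime(s, "%d.%m.%y %H:%M:%S") for s in stamps)
    return (second - first).total_seconds()


ROWS = [
    "E207542_ 956_ C.27.C24H14OS2.01879920.0.set1d06_ 0-- ubuntu Error "
    "07.05.12 06:50:33 07.05.12 06:52:57 0.00 0.1 / 0.0",
    "E207542_ 943_ C.26.C20H8S4Se2.01760109.4.set1d06_ 0-- ubuntu Error "
    "07.05.12 06:48:38 07.05.12 06:50:33 0.00 0.1 / 0.0",
]

if __name__ == "__main__":
    for row in ROWS:
        name = row.split("--")[0]
        print(f"{name}: {seconds_between_timestamps(row):.0f} s")
```

Under that assumption, several results come back within a couple of minutes of being sent, which fits the near-zero CPU time shown in the rows.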
||
|
|
|