| World Community Grid Forums
Thread Status: Active Total posts in this thread: 19
|
|
| Author |
|
|
johnlod
Advanced Cruncher Ireland Joined: May 1, 2007 Post Count: 50 Status: Offline Project Badges:
|
Hi All
My PC was working on a Human Proteome Folding Phase-2 task for only a few hours (normally they take over 20 hours on my PC) when it crashed out. There were still many days before the required finish date. I have copied the relevant lines from the message file below. In addition, there was a Windows message saying that the task had requested a terminate task command from Windows and that I was to contact those connected with the computing task that had crashed. My PC quickly switched to a Help Conquer Cancer task and seems to be working OK. Any comment as to why the HPFP task crashed would be appreciated.

John

26/07/2010 05:42:51|World Community Grid|Starting nq014_00060_16
26/07/2010 05:42:51|World Community Grid|[cpu_sched] Starting nq014_00060_16 (initial)
26/07/2010 05:42:53|World Community Grid|Starting task nq014_00060_16 using hpf2 version 617
26/07/2010 05:42:54|World Community Grid|Started upload of X0000032670134200405281113_1_0
26/07/2010 05:43:00|World Community Grid|Finished upload of X0000032670134200405281113_1_0
26/07/2010 10:10:03|World Community Grid|Computation for task nq014_00060_16 finished
26/07/2010 10:10:03|World Community Grid|Output file nq014_00060_16_0 for task nq014_00060_16 absent
26/07/2010 10:11:07|World Community Grid|Sending scheduler request: To fetch work. Requesting 47520 seconds of work, reporting 2 completed tasks
26/07/2010 10:11:15|World Community Grid|Scheduler request succeeded: got 1 new tasks
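For anyone who wants to check how long a task really ran before it dropped out, here is a minimal Python sketch that reads message-log lines in the `date time|project|message` layout shown above. The function names `parse_line` and `summarize` are made up for illustration and are not part of BOINC; the sample lines are copied from the post.

```python
from datetime import datetime

# Sample lines copied from the message log quoted in the post.
LOG = """\
26/07/2010 05:42:53|World Community Grid|Starting task nq014_00060_16 using hpf2 version 617
26/07/2010 10:10:03|World Community Grid|Computation for task nq014_00060_16 finished
26/07/2010 10:10:03|World Community Grid|Output file nq014_00060_16_0 for task nq014_00060_16 absent
"""


def parse_line(line):
    """Split one log line into (timestamp, project, message)."""
    stamp, project, message = line.split("|", 2)
    return datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), project, message


def summarize(log_text, task):
    """Return (elapsed wall-clock time, whether an output file was reported absent)."""
    started = finished = None
    output_missing = False
    for line in log_text.splitlines():
        if not line.strip():
            continue
        when, _project, message = parse_line(line)
        if message.startswith(f"Starting task {task} "):
            started = when
        elif message == f"Computation for task {task} finished":
            finished = when
        elif message.startswith("Output file") and message.endswith(f"for task {task} absent"):
            output_missing = True
    elapsed = finished - started if started and finished else None
    return elapsed, output_missing


elapsed, output_missing = summarize(LOG, "nq014_00060_16")
print(elapsed, output_missing)
```

Note that this measures wall-clock time between the two log entries, not CPU time, so it will not necessarily match the hours shown on the Result Status page.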
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
What version of Windows? What are your virtual memory settings? What were you doing when it crashed?
|
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
johnlod,
Suspect the infamous /711, about which we can't do anything except make sure that BOINC has enough workspace to maneuver... HPF2 does this every so often, totally at random, and more frequently when an AutoDock-based science runs in parallel (FAAH/HFCC). On my laptop about 1 in every 50 HPF2 tasks goes out, usually within the first few minutes... not a great loss. Visit My Grid > Result Status and click the link in the status column (probably "Error") for task nq014_00060_16 to bring up the Result log.
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
johnlod
Advanced Cruncher Ireland Joined: May 1, 2007 Post Count: 50 Status: Offline Project Badges:
|
It's Windows XP SP3.
The virtual memory is system managed and is currently set at 1055MB. What I am noticing now is that a Conquer Cancer work task is executing, but it is doing so very slowly. It is scheduled to take about 15 hours of CPU time and these normally take around 8. The PC has an Intel Celeron R 1.99GHz and 704 MB of RAM. It is quite an old PC. I wasn't doing anything special when it crashed. I wasn't even using the PC. The error log showed the following:

Result Name: nq014_ 00060_ 16--
<core_client_version>6.2.28</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
</stderr_txt>
]]>

[Edit 1 times, last edit by johnlod at Jul 27, 2010 10:11:26 AM]
||
|
|
johnlod
Advanced Cruncher Ireland Joined: May 1, 2007 Post Count: 50 Status: Offline Project Badges:
|
Hi again
The same thing has happened on the next folding task it tried to execute. A Conquer Cancer task ran OK. The error message on the PC was unusual: "Microsoft Visual C++ Runtime Library. This application has requested Runtime to terminate it in an unusual way. Please contact the application support team for more info". The application failed after 11.91 hours. The error from the results tool is:

Result Name: nq133_ 00024_ 0--
<core_client_version>6.2.28</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
</stderr_txt>
]]>

I have also been running XAMPP for Windows lately, and this includes Apache, MySQL and FileZilla. That is the only difference between what I do with this PC now that the failures are occurring and before, when there was never a failure.

[Edit 1 times, last edit by johnlod at Jul 28, 2010 8:42:46 AM]
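When comparing several failed results, it can help to pull the same few fields out of each Result log. The following is a rough Python sketch, assuming the log has the layout pasted above; `parse_result_log` is a hypothetical helper, not a BOINC or WCG tool. One possible reading, offered only as an assumption the log itself does not confirm: the Microsoft C runtime's `abort()` normally ends a process with exit code 3, which would fit the runtime dialog quoted above, and the "cannot find the path" text may simply be the generic Windows description for error number 3.

```python
import re

# Result log body as pasted in the post (the "Result Name" line is left out).
RESULT_LOG = """\
<core_client_version>6.2.28</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
</stderr_txt>
]]>
"""


def parse_result_log(text):
    """Extract client version, message text, exit code and whether stderr is empty."""
    version = re.search(r"<core_client_version>(.*?)</core_client_version>", text)
    message = re.search(r"<message>\s*(.*?)\s*</message>", text, re.S)
    stderr = re.search(r"<stderr_txt>(.*?)</stderr_txt>", text, re.S)
    exit_code = re.search(r"exit code (-?\d+)", text)
    return {
        "client_version": version.group(1) if version else None,
        "message": message.group(1) if message else None,
        "exit_code": int(exit_code.group(1)) if exit_code else None,
        # An empty stderr section means the science application logged nothing itself.
        "stderr_empty": stderr is None or not stderr.group(1).strip(),
    }


info = parse_result_log(RESULT_LOG)
print(info)
```

In both logs in this thread the stderr section is empty, so the exit code and the one-line message are the only clues the Result log offers.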
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
HPF2 may be somewhat timing sensitive. On the device where I run HPF2 in a mix and get the occasional /711 failure at 0.01 hours, I've now used the Process Lasso tool to automatically raise the priority one step from idle to "below normal" to see whether fewer of these occur, or none. On Linux, I've never seen or heard of the error occurring, be it the /401 or the /711.
For me on Vista it suddenly went away when I upgraded to the 6.6 client, and it never or rarely reappeared when running the 6.10 client... WCG (IBM) is currently preparing to endorse this client (with the necessary website profile changes to go along with it).
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I have seen this error when Windows tried to resize virtual memory and hit the upper limit. That was on XP with 512MB RAM, where the default setting was locked at 1.5 to 3x RAM. I changed it to system managed and have not had the problem again. Are you sure it is set to system managed? I had thought mine was, but I now know that on XP Home the default is 1.5 to 3x RAM and not system managed. I have verified this on at least 3 different installs.
http://www.worldcommunitygrid.org/forums/wcg/...ead,28555_offset,0#268407
[Edit 1 times, last edit by Former Member at Jul 28, 2010 10:45:16 AM]
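The 1.5 to 3x RAM rule mentioned above is easy to work out for a given machine. Below is a small Python sketch of that arithmetic; `pagefile_range_mb` is a made-up name, and the factors are simply the ones quoted in this post, not values read from Windows.

```python
def pagefile_range_mb(ram_mb, min_factor=1.5, max_factor=3.0):
    """Return the (minimum, maximum) paging file size in MB for a given amount of RAM."""
    return ram_mb * min_factor, ram_mb * max_factor


# 704 MB is the RAM size johnlod reported earlier in the thread.
low, high = pagefile_range_mb(704)
print(low, high)

# 512 MB is the machine described in this post.
print(pagefile_range_mb(512))
```

For 704 MB this gives a range of 1056 MB to 2112 MB. The 1055MB figure reported earlier sits almost exactly at the 1.5x floor, which may be worth a second look at the setting, though that alone does not prove the paging file was locked to a fixed range.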
||
|
|
johnlod
Advanced Cruncher Ireland Joined: May 1, 2007 Post Count: 50 Status: Offline Project Badges:
|
The swap size is system managed, but I notice that the system has increased it to 2065MB from the 1055MB (or something similar) it was yesterday.
Hopefully it will be robust enough now. With regard to updating the client, does this happen automatically or do we have to do it manually?
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Ubuntu 10.04 LTS or Debian are privileged. The 6.10.58 deb files have been sitting in the update manager for several days now. In Windows you only get a notification in the client message log, and when using the clean 6.2.28 skinned kit of WCG you only get it when WCG is happy with the new release... so you might want to wait for that.
VM/swap I've long set to 1.5 times RAM as minimum and unlimited as maximum. It sits on its own partition to prevent it from fragmenting, which, smirk, Linux Lucid does automatically, probably older versions too.
||
|
|
nasher
Veteran Cruncher USA Joined: Dec 2, 2005 Post Count: 1423 Status: Offline Project Badges:
|
In the past 2 weeks I have had about 15 HPF2 tasks error out. All of them seem to be on my system running Vista, and all the errors have been 0.01 hour errors, so I didn't really worry about them, since I know the computer produces good work most of the time.
Intel Pentium(R) Dual-Core CPU E5300 @ 2.60GHz (all I can get while at work)
||
|
|
|