World Community Grid Forums
Category: Retired Forums Forum: UD Windows Agent Support [Read Only] Thread: Unrecoverable error |
Thread Status: Active Total posts in this thread: 26
|
Author |
|
nhoeller
Cruncher Joined: Nov 24, 2004 Post Count: 10 Status: Offline Project Badges: |
Kevin, thanks for the fast response! Will you post something to http://www.worldcommunitygrid.org/forums/wcg/listthreads?forum=2 when the fix is available, and also how to get the fix? For the moment, I have suspended WCG processing on Windows PCs since less than 50% appear to be completing successfully (I have a fair number of workunits downloaded).
|
||
|
Halifax--lad
Advanced Cruncher Joined: Nov 1, 2005 Post Count: 77 Status: Offline |
You don't need to do anything to get the fix. When it is done, it will be included in the WUs that get sent out, to stop them erroring out. |
||
|
madmac
Advanced Cruncher England Joined: Dec 4, 2005 Post Count: 104 Status: Offline Project Badges: |
I too have had two computing errors on four downloads. I am using 5.2.6, so I will need to upgrade to see if this sorts out the problem.
---------------------------------------- |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Worth a try. Several members have had problems that disappeared with 5.2.13.
|
||
|
nhoeller
Cruncher Joined: Nov 24, 2004 Post Count: 10 Status: Offline Project Badges: |
Since the update to rosetta v422, WCG has been running clean on my Windows PCs, until this morning. Here is the log from the failing system (WinXP SP2, 192MB RAM):
15/12/2005 12:03:28 AM|World Community Grid|Pausing result eb364_18_1 (removed from memory)
15/12/2005 12:03:44 AM|SETI@home|Starting result 30se04ab.19198.15904.22154.46_0 using setiathome version 418
15/12/2005 12:03:44 AM||request_reschedule_cpus: process exited
15/12/2005 12:03:46 AM|SETI@home|Finished download of 28se04ab.13698.2961.947138.109
15/12/2005 12:03:46 AM|SETI@home|Throughput 1816 bytes/sec
15/12/2005 12:03:47 AM||request_reschedule_cpus: files downloaded
15/12/2005 1:04:13 AM|World Community Grid|Restarting result eb364_18_1 using rosetta version 422
15/12/2005 1:04:13 AM|SETI@home|Pausing result 30se04ab.19198.15904.22154.46_0 (removed from memory)
15/12/2005 1:04:18 AM||request_reschedule_cpus: process exited
15/12/2005 3:04:19 AM|World Community Grid|Pausing result eb364_18_1 (removed from memory)
15/12/2005 3:04:20 AM|SETI@home|Restarting result 30se04ab.19198.15904.22154.46_0 using setiathome version 418
15/12/2005 3:04:22 AM|World Community Grid|Unrecoverable error for result eb364_18_1 ( - exit code -1073741819 (0xc0000005))
15/12/2005 3:04:23 AM||request_reschedule_cpus: process exited
15/12/2005 3:04:23 AM|World Community Grid|Computation for result eb364_18_1 finished
15/12/2005 3:04:25 AM|World Community Grid|Started upload of eb364_18_1_0 |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello nhoeller,
I am puzzled. Trying to think of an OS type problem that would only show sporadically makes me wonder about the size of your virtual memory in Windows, since the FightAIDS@Home program needs 300 MB available to suspend successfully. So far, that is the only idea that occurs to me. |
||
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline Project Badges: |
nhoeller,
We released a fix for the '-1073741819' error that has cut the number of these errors in half, but we have identified a second problem that we are investigating now. This one is more subtle, and we are going to have to release rosetta 4.23, which will give us more information to track down the problem. If you see this again, can you post the information from your log like you did? BOINC will automatically return the stderr file with the result, but it doesn't give us the information that you posted above, so that is helpful. Thanks, and our apologies for the problems.
Kevin |
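For anyone comparing log entries: the two numbers in the error line, -1073741819 and 0xc0000005, are the same 32-bit value, printed once as a signed decimal and once as unsigned hex. On Windows, 0xC0000005 is the status code for an access violation. A minimal sketch of the conversion follows; the function name is only illustrative and is not part of BOINC.

```python
def exit_code_to_hex(code: int) -> str:
    """Render a signed 32-bit process exit code as unsigned hex."""
    # Masking with 0xFFFFFFFF reinterprets the negative value as the
    # unsigned 32-bit number that Windows status codes are listed under.
    return f"0x{code & 0xFFFFFFFF:08x}"


if __name__ == "__main__":
    print(exit_code_to_hex(-1073741819))
```

This makes it easier to search for the error, since other projects' forums may quote either form.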
||
|
nhoeller
Cruncher Joined: Nov 24, 2004 Post Count: 10 Status: Offline Project Badges: |
Kevin, here is an error from today, from a WinXP SP2 machine with 512MB RAM:
22/12/2005 7:23:06 AM|climateprediction.net|Pausing result sulphur_gmj8_000775700_0 (removed from memory)
22/12/2005 7:23:07 AM|World Community Grid|Restarting result eb690_06_4 using rosetta version 422
22/12/2005 7:23:10 AM||request_reschedule_cpus: process exited
22/12/2005 8:23:12 AM|climateprediction.net|Restarting result sulphur_gmj8_000775700_0 using sulphur_cycle version 422
22/12/2005 8:23:12 AM|World Community Grid|Pausing result eb690_06_4 (removed from memory)
22/12/2005 8:23:13 AM||request_reschedule_cpus: process exited
22/12/2005 11:23:14 AM|climateprediction.net|Pausing result sulphur_gmj8_000775700_0 (removed from memory)
22/12/2005 11:23:15 AM|World Community Grid|Restarting result eb690_06_4 using rosetta version 422
22/12/2005 11:23:19 AM||request_reschedule_cpus: process exited
22/12/2005 12:23:20 PM|climateprediction.net|Restarting result sulphur_gmj8_000775700_0 using sulphur_cycle version 422
22/12/2005 12:23:20 PM|World Community Grid|Pausing result eb690_06_4 (removed from memory)
22/12/2005 12:23:23 PM|World Community Grid|Unrecoverable error for result eb690_06_4 ( - exit code -1073741819 (0xc0000005))
I will try to extract the error messages from other failures. Are there specific pieces of information you are looking for?
Boincview has a log of all the messages, but I haven't found a way to copy them.
Regards, Norbert |
||
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline Project Badges: |
The information that you have provided is exactly the information that is helpful. The logs we get back do not tell us what BOINC was doing with the science application at the time. What your log file tells me is that BOINC was 'pausing' our application when the error occurred. I would also like to know if you see this on any other BOINC projects (we have found discussions about this occurring elsewhere).
Also, if you are running any security software (any sort of firewall, anti-virus, etc.), please list it if you don't mind. It could be getting involved; if we start a list, we may be able to identify a trend. Finally, please tell us how often you are personally seeing the error. If you are encountering it often, I would like you to try altering your BOINC device profile (you can access your default one here: http://www.worldcommunitygrid.org/ms/device/v...iguration.do?name=Default) and changing the value "Switch between applications every:" to something 3-4 times higher. This value determines how often BOINC switches which project is running. The logs above suggest that this error occurs when the science application is suspended, so suspending it less often should reduce the number of errors (this is a workaround). |
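If you have many messages to go through, the check described above (what was BOINC doing with the result just before the error?) can be done with a short script. This is only a rough sketch. It assumes the log has been saved as text with one entry per line in the `timestamp|project|message` layout seen in the logs pasted in this thread; the function name and sample are made up for illustration and are not part of BOINC or Boincview.

```python
import re

# Any message that names a result, e.g. "Pausing result eb364_18_1 (...)".
RESULT_NAME = re.compile(r"result (\S+)", re.IGNORECASE)


def last_event_before_error(log_text):
    """Map each failed result to the last message about it before the error."""
    last_message = {}   # result name -> most recent message naming it
    before_error = {}   # result name -> message seen just before the error
    for line in log_text.splitlines():
        parts = line.split("|", 2)  # timestamp, project, message
        if len(parts) != 3:
            continue
        message = parts[2].strip()
        match = RESULT_NAME.search(message)
        if match is None:
            continue
        name = match.group(1)
        if message.startswith("Unrecoverable error"):
            before_error[name] = last_message.get(name, "(no earlier entry)")
        last_message[name] = message
    return before_error


# Three entries copied from the first log posted in this thread.
SAMPLE = """\
15/12/2005 3:04:19 AM|World Community Grid|Pausing result eb364_18_1 (removed from memory)
15/12/2005 3:04:20 AM|SETI@home|Restarting result 30se04ab.19198.15904.22154.46_0 using setiathome version 418
15/12/2005 3:04:22 AM|World Community Grid|Unrecoverable error for result eb364_18_1 ( - exit code -1073741819 (0xc0000005))
"""

if __name__ == "__main__":
    for name, message in last_event_before_error(SAMPLE).items():
        print(name, "->", message)
```

For the sample entries, the message reported for eb364_18_1 is the "Pausing result" line, which matches the pattern noted above.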
||
|
nhoeller
Cruncher Joined: Nov 24, 2004 Post Count: 10 Status: Offline Project Badges: |
I am seeing WCG failures on all the Windows machines. I am not seeing problems with SETI or Climateprediction. As far as security software is concerned:
* WinXP SP2 512MB, Norton SystemWorks 2005, ZoneAlarm Pro
* WinXP SP2 192MB, Norton SystemWorks 2005, ZoneAlarm Pro
* Win98 192MB, Norton AntiVirus 2005, ZoneAlarm (free)
I run SpyBot on all machines, but only the 512MB machine has TeaTimer enabled. All of them are using the SpyBot immunization feature. I just got another failure on the WinXP SP2 192MB machine:
22/12/2005 10:04:42 PM|World Community Grid|Computer ID: 6670; location: Default; project prefs: default
22/12/2005 10:04:42 PM|SETI@home|Computer ID: 1879546; location: ; project prefs: default
22/12/2005 10:04:42 PM||General prefs: from unknown project http://climateprediction.net/ (last modified 2005-12-16 13:57:46)
22/12/2005 10:04:42 PM||General prefs: using your defaults
22/12/2005 10:04:42 PM||Remote control allowed
22/12/2005 10:04:42 PM|World Community Grid|Resuming computation for result eb826_15_0 using rosetta version 422
22/12/2005 10:04:44 PM|SETI@home|Deferring computation for result 13fe05aa.24442.23698.386086.32_0
22/12/2005 10:04:44 PM||Suspending network activity - time of day
22/12/2005 10:05:37 PM|World Community Grid|Result eb826_15_0 exited with zero status but no 'finished' file
22/12/2005 10:05:37 PM|World Community Grid|If this happens repeatedly you may need to reset the project.
22/12/2005 10:05:37 PM||request_reschedule_cpus: process exited
22/12/2005 10:05:38 PM|World Community Grid|Restarting result eb826_15_0 using rosetta version 422
22/12/2005 10:23:47 PM||Suspending computation and network activity - user request
22/12/2005 10:23:48 PM|World Community Grid|Pausing result eb826_15_0 (removed from memory)
22/12/2005 10:23:57 PM|World Community Grid|Unrecoverable error for result eb826_15_0 ( - exit code -1073741819 (0xc0000005))
22/12/2005 10:24:00 PM||request_reschedule_cpus: process exited
22/12/2005 10:24:00 PM|World Community Grid|Computation for result eb826_15_0 finished
I will extract all the failures and post tomorrow.
I have already increased the cycle timer to 120 minutes - I can increase it some more.
Regards, Norbert |
||
|
|