Thread Status: Active
Total posts in this thread: 5
This topic has been viewed 1389 times and has 4 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Error after 55 hours of crunching

Does anyone have a clue what might have happened here? The PC is a single core...

E000027_ 351A_ 00035s00f_ 1-- Error 14.12.08 00:34:08 17.12.08 01:12:32 55.27 660.9 / 0.0

<core_client_version>5.10.45</core_client_version>
<![CDATA[
<message>
Maximum CPU time exceeded
</message>
<stderr_txt>
Calling initGraphics()
INFO: No state to restore. Start from the beginning.
Calling initGraphics()


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x7C93A3E1

Engaging BOINC Windows Runtime Debugger...



********************


BOINC Windows Runtime Debugger Version 6.3.3


Dump Timestamp : 12/16/08 21:43:18
Install Directory : C:\Programme\BOINC\
Data Directory : C:\Programme\BOINC
Project Symstore :
Loaded Library : C:\Programme\BOINC\\dbghelp.dll
Loaded Library : C:\Programme\BOINC\\symsrv.dll
Loaded Library : C:\Programme\BOINC\\srcsrv.dll
LoadLibraryA( C:\Programme\BOINC\\version.dll ): GetLastError = 126
Loaded Library : version.dll
Debugger Engine : 4.0.5.0
Symbol Search Path: C:\Programme\BOINC\slots\0;C:\Programme\BOINC\projects\www.worldcommunitygrid.org;srv*C:\DOKUME~1\ADMINI~1\LOKALE~1\Temp\symbols*http://msdl.microsoft.com/download/symbols;sr...inc.berkeley.edu/symstore


ModLoad: 00400000 0e752000 C:\Programme\BOINC\projects\www.worldcommunitygrid.org\wcgrid_cep1_6.19_windows_intelx86 (-nosymbols- Symbols Loaded)

ModLoad: 7c920000 000c6000 C:\WINDOWS\system32\ntdll.dll (5.2.3790.3959) (-exported- Symbols Loaded)
File Version : 5.2.3790.3959 (srv03_sp2_rtm.070216-1710)
Company Name : Microsoft Corporation
Product Name : Betriebssystem Microsoft® Windows®
Product Version : 5.2.3790.3959

ModLoad: 7c800000 00115000 C:\WINDOWS\system32\kernel32.dll (5.2.3790.4062) (-exported- Symbols Loaded)
File Version : 5.2.3790.4062 (srv03_sp2_gdr.070417-0203)
Company Name : Microsoft Corporation
Product Name : Betriebssystem Microsoft® Windows®
Product Version : 5.2.3790.4062

ModLoad: 77b60000 00008000 C:\WINDOWS\system32\VERSION.dll (5.2.3790.3959) (-exported- Symbols Loaded)
File Version : 5.2.3790.3959 (srv03_sp2_rtm.070216-1710)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.2.3790.3959

ModLoad: 77b70000 0005a000 C:\WINDOWS\system32\msvcrt.dll (7.0.3790.3959) (-exported- Symbols Loaded)
File Version : 7.0.3790.3959 (srv03_sp2_rtm.070216-1710)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 7.0.3790.3959

ModLoad: 77e20000 00092000 C:\WINDOWS\system32\USER32.dll (5.2.3790.4033) (-exported- Symbols Loaded)
File Version : 5.2.3790.4033 (srv03_sp2_gdr.070228-0030)
Company Name : Microsoft Corporation
Product Name : Betriebssystem Microsoft® Windows®
Product Version : 5.2.3790.4033

ModLoad: 77bd0000 00049000 C:\WINDOWS\system32\GDI32.dll (5.2.3790.4396) (-exported- Symbols Loaded)
File Version : 5.2.3790.4396 (srv03_sp2_gdr.081022-1212)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.2.3790.4396

ModLoad: 77f30000 000ab000 C:\WINDOWS\system32\ADVAPI32.dll (5.2.3790.3959) (-exported- Symbols Loaded)
File Version : 5.2.3790.3959 (srv03_sp2_rtm.070216-1710)
Company Name : Microsoft Corporation
Product Name : Betriebssystem Microsoft® Windows®
Product Version : 5.2.3790.3959

ModLoad: 77c20000 0009f000 C:\WINDOWS\system32\RPCRT4.dll (5.2.3790.4115) (-exported- Symbols Loaded)
File Version : 5.2.3790.4115 (srv03_sp2_gdr.070709-2335)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.2.3790.4115

ModLoad: 76e40000 00013000 C:\WINDOWS\system32\Secur32.dll (5.2.3790.3959) (-exported- Symbols Loaded)
File Version : 5.2.3790.3959 (srv03_sp2_rtm.070216-1710)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.2.3790.3959

ModLoad: 76b00000 00028000 C:\WINDOWS\system32\imagehlp.dll (5.2.3790.3959) (-exported- Symbols Loaded)
File Version : 5.2.3790.3959 (srv03_sp2_rtm.070216-1710)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.2.3790.3959

ModLoad: 76180000 0001d000 C:\WINDOWS\system32\IMM32.DLL (5.2.3790.3959) (-exported- Symbols Loaded)
File Version : 5.2.3790.3959 (srv03_sp2_rtm.070216-1710)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.2.3790.3959

ModLoad: 77790000 00021000 C:\WINDOWS\system32\NTMARTA.DLL (5.2.3790.3959) (-exported- Symbols Loaded)
File Version : 5.2.3790.3959 (srv03_sp2_rtm.070216-1710)
Company Name : Microsoft Corporation
Product Name : Betriebssystem Microsoft® Windows®
Product Version : 5.2.3790.3959

ModLoad: 76e00000 0002f000 C:\WINDOWS\system32\WLDAP32.dll (5.2.3790.3959) (-exported- Symbols Loaded)
File Version : 5.2.3790.3959 (srv03_sp2_rtm.070216-1710)
Company Name : Microsoft Corporation
Product Name : Betriebssystem Microsoft® Windows®
Product Version : 5.2.3790.3959

ModLoad: 7e020000 0000f000 C:\WINDOWS\system32\SAMLIB.dll (5.2.3790.3959) (-exported- Symbols Loaded)
File Version : 5.2.3790.3959 (srv03_sp2_rtm.070216-1710)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 5.2.3790.3959

ModLoad: 774f0000 00139000 C:\WINDOWS\system32\ole32.dll (5.2.3790.3959) (-exported- Symbols Loaded)
File Version : 5.2.3790.3959 (srv03_sp2_rtm.070216-1710)
Company Name : Microsoft Corporation
Product Name : Betriebssystem Microsoft® Windows®
Product Version : 5.2.3790.3959

ModLoad: 10c80000 00115000 C:\Programme\BOINC\dbghelp.dll (6.6.7.5) (-exported- Symbols Loaded)
File Version : 6.6.0007.5 (debuggers(dbg).051021-1446)
Company Name : Microsoft Corporation
Product Name : Debugging Tools for Windows(R)
Product Version : 6.6.0007.5

ModLoad: 10ea0000 00083000 C:\Programme\BOINC\symsrv.dll (6.6.7.5) (-exported- Symbols Loaded)
File Version : 6.6.0007.5 (debuggers(dbg).051021-1446)
Company Name : Microsoft Corporation
Product Name : Debugging Tools for Windows(R)
Product Version : 6.6.0007.5

ModLoad: 10f30000 0003a000 C:\Programme\BOINC\srcsrv.dll (6.6.7.5) (-exported- Symbols Loaded)
File Version : 6.6.0007.5 (debuggers(dbg).051021-1446)
Company Name : Microsoft Corporation
Product Name : Debugging Tools for Windows(R)
Product Version : 6.6.0007.5



*** Dump of the Process Statistics: ***

- I/O Operations Counters -
Read: 9653, Write: 0, Other 19429

- I/O Transfers Counters -
Read: 0, Write: 24983, Other 0

- Paged Pool Usage -
QuotaPagedPoolUsage: 499908, QuotaPeakPagedPoolUsage: 499932
QuotaNonPagedPoolUsage: 2576, QuotaPeakNonPagedPoolUsage: 2576

- Virtual Memory Usage -
VirtualSize: 377442304, PeakVirtualSize: 387862528

- Pagefile Usage -
PagefileUsage: 354652160, PeakPagefileUsage: 364675072

- Working Set Size -
WorkingSetSize: 77864960, PeakWorkingSetSize: 88780800, PageFaultCount: 1017007266

*** Dump of thread ID 7168 (state: Waiting): ***

- Information -
Status: Wait Reason: UserRequest, , Kernel Time: 0.000000, User Time: 312500.000000, Wait Time: 26014944.000000

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x7C93A3E1

- Registers -
eax=00000000 ebx=00000000 ecx=00b24166 edx=10b6613c esi=7c8024de edi=00000001
eip=7c93a3e1 esp=10b6fba0 ebp=10b6ffec
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246

- Callstack -
ChildEBP RetAddr Args to Child
10b6ffec 00000000 004359c0 00000000 00000000 000000c8 ntdll!DbgBreakPoint+0x0

*** Dump of thread ID 5608 (state: Ready): ***

- Information -
Status: Base Priority: Above Normal, Priority: Above Normal, , Kernel Time: 28279531520.000000, User Time: 1524582711296.000000, Wait Time: 26014946.000000

- Registers -
eax=12480670 ebx=000004c0 ecx=124809f0 edx=00000520 esi=12480000 edi=149016c0
eip=006c1867 esp=0f55d1a8 ebp=0f55d20c
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000283

- Callstack -
ChildEBP RetAddr Args to Child
0f55d20c 006c0fbc 11892a30 11892a38 003b0178 00b8bfc0 wcgrid_cep1_6!+0x0
0f55d244 7c94a0fc 00afbe10 003b0000 00000000 00afbe15 wcgrid_cep1_6!+0x0
0f55d248 00afbe10 003b0000 00000000 00afbe15 00000500 ntdll!RtlAllocateHeap+0x0
0f55d254 00afbe15 00000500 00000000 00c3cc48 c62b6b20 wcgrid_cep1_6!+0x0
00000000 00000000 00000000 00000000 00000000 00000000 wcgrid_cep1_6!+0x0

*** Dump of thread ID 5388 (state: Waiting): ***

- Information -
Status: Wait Reason: UserRequest, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 15810354.000000

- Registers -
eax=00affd95 ebx=00000114 ecx=00000000 edx=00000000 esi=00000114 edi=00000000
eip=7c9485ec esp=1236fee8 ebp=1236ff58
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246

- Callstack -
ChildEBP RetAddr Args to Child
1236ff58 7c821c8d 00000114 ffffffff 00000000 1236ff84 ntdll!KiFastSystemCallRet+0x0
1236ff6c 00abfd57 00000114 ffffffff 003b5d18 003b5d18 kernel32!WaitForSingleObject+0x0
1236ff84 00affe01 11770138 00000000 00000000 003b5d18 wcgrid_cep1_6!+0x0
1236ffb8 7c824829 003b5d18 00000000 00000000 003b5d18 wcgrid_cep1_6!+0x0
1236ffec 00000000 00affd95 003b5d18 00000000 144c0000 kernel32!GetModuleHandleA+0x0


*** Debug Message Dump ****


*** Foreground Window Data ***
Window Name :
Window Class :
Window Process ID: 0
Window Thread ID : 0

Exiting...

</stderr_txt>
]]>
[Dec 17, 2008 9:28:07 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Error after 55 hours of crunching

Each work unit has a time limit (in addition to the deadline). It is designed to stop a malformed task from running forever and wasting your time. The time limit set by WCG is 10 times the initial time estimate.

Since the time estimate is based on previous work from the same batch, I believe it is very likely that results which should have been marked invalid are pulling down the average completion time and distorting estimates. This affects the time limit, too.
[Dec 17, 2008 9:58:26 AM]
Rickjb
Veteran Cruncher
Australia
Joined: Sep 17, 2006
Post Count: 666
Status: Offline
Re: Error after 55 hours of crunching

I was concerned about a WU that is currently running and estimated to take 40hrs, so I've checked that it will be OK ...

Exceeding CPU time limits occurred back in August 2008, when some extra-long FAAH WUs came out.
knreed posted a workaround on 4 Aug 08 in the thread "Re: Second really long work unit received".
He said:
For those of you concerned about having your WU stopped because it runs too long, please look at the following to determine if you are at risk (and possibly prevent stoppage):
Open client_state.xml in a text editor (this file is located in your BOINC installation directory - or if you are using 6.2 then in something like C:\Documents and Settings\All Users\Application Data\BOINC).
Look in <host_info> for <p_fpops>. This is the fpops value your computer got while running the BOINC benchmark.
Next look for the field <rsc_fpops_bound> within the <workunit> tag for one of these long running workunits.
The client will stop running the workunit after rsc_fpops_bound/p_fpops seconds have gone by.
I haven't tested this, so I do not know if it will work. However, you should be able to stop BOINC (stop the client and the manager). Then open both client_state.xml and client_state_prev.xml and modify the value for rsc_fpops_bound for the long running FAAH workunits so that it is something larger like 2000000000000000. Then start things up again. This should increase the CPU limit. Unfortunately we cannot send updated values for this from the server.
We normally load workunits onto the grid with a value of rsc_fpops_bound of 10 times rsc_fpops_est.
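
The check knreed describes can be scripted rather than done by hand. Below is a minimal sketch in Python, assuming client_state.xml parses as well-formed XML with <p_fpops> inside <host_info> and <name> / <rsc_fpops_bound> inside each <workunit>, as described above. The function name `cpu_time_limits` and the trimmed `SAMPLE` document are made up for illustration; the numbers in the sample are the ones quoted in this thread. The script only reads values and never writes to the file.

```python
import xml.etree.ElementTree as ET


def cpu_time_limits(client_state_xml):
    """Return {workunit name: CPU time limit in hours} from client_state.xml text."""
    root = ET.fromstring(client_state_xml)
    # <p_fpops> is the benchmark result stored under <host_info>.
    p_fpops = float(root.findtext(".//host_info/p_fpops"))
    limits = {}
    for wu in root.iter("workunit"):
        name = wu.findtext("name")
        bound = wu.findtext("rsc_fpops_bound")
        if name is None or bound is None:
            continue  # skip entries that lack the fields we need
        # Limit in seconds is rsc_fpops_bound / p_fpops; convert to hours.
        limits[name] = float(bound) / p_fpops / 3600.0
    return limits


# Hypothetical, heavily trimmed stand-in for a real client_state.xml.
SAMPLE = """\
<client_state>
  <host_info>
    <p_fpops>2457680250.783699</p_fpops>
  </host_info>
  <workunit>
    <name>E000035_543A_00045d00h</name>
    <rsc_fpops_bound>1025686081987750.000000</rsc_fpops_bound>
  </workunit>
</client_state>
"""

if __name__ == "__main__":
    for name, hours in cpu_time_limits(SAMPLE).items():
        print(f"{name}: limit is about {hours:.1f} CPU hours")
```

To use it on a real machine you would pass in the text of your own client_state.xml, for example `cpu_time_limits(open(path).read())`, where `path` is wherever your BOINC data directory keeps the file.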

Note: You can examine client_state.xml while BOINC is running, but do not save any changes to it in that state. To edit, first do "shut down connected client" and exit BOINC Manager, then open fresh copies of client_state.xml and client_state_prev.xml and edit those.
My example: Here are the relevant lines in my client_state.xml:
<p_fpops>2457680250.783699</p_fpops> and
<workunit> / <name>E000035_543A_00045d00h</name> / ... / <rsc_fpops_bound>1025686081987750.000000</rsc_fpops_bound>
These are big numbers, so move the decimal points 9 places to the left in both:
_bound / p_fpops = 1025686 / 2.458 = 417285 seconds = 116 hours = safe, so I don't have to shut down BOINC to alter the files.
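
For anyone who wants to recheck this arithmetic, here is a short Python sketch using the two values quoted above. It computes the limit once from the full numbers and once from the shifted, rounded ones; the variable names are only for illustration. Either way the limit comes out near 116 hours.

```python
P_FPOPS = 2457680250.783699           # <p_fpops> quoted above
RSC_FPOPS_BOUND = 1025686081987750.0  # <rsc_fpops_bound> quoted above

# Limit in seconds from the full values, then in hours.
limit_seconds = RSC_FPOPS_BOUND / P_FPOPS
limit_hours = limit_seconds / 3600

# Same calculation after moving both decimal points 9 places left and rounding.
approx_seconds = 1025686 / 2.458

print(f"full values:    {limit_seconds:.0f} s = {limit_hours:.1f} h")
print(f"rounded values: {approx_seconds:.0f} s = {approx_seconds / 3600:.1f} h")
```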

Note to WCG FAQ maintainer: knreed's post may disagree with the external BOINC FAQ referenced by the WCG FAQ Index: 4K. exceeded CPU time limit xxxxxxxx.xxxxxx
----------------------------------------
[Edit 5 times, last edit by Rickjb at Dec 18, 2008 5:38:09 AM]
[Dec 18, 2008 4:23:24 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Error after 55 hours of crunching

The external FAQ is wrong; changing the bound is safe if Kevin's instructions are followed exactly.

I don't think there is a reason for anyone to do this, though. So far, totto73's report is unique.
[Dec 18, 2008 4:59:30 AM]
GB033533
Senior Cruncher
UK
Joined: Dec 8, 2004
Post Count: 206
Status: Offline
Re: Error after 55 hours of crunching

I'm going to duck out of this project until the problems have been sorted.
After too many anxious moments with long-running WUs, I've finally had one error on me after 76 hours (my estimate was 90 hrs, the longest I'd seen so far):

E000027_ 234A_ 00035p00o_ 2-- In Progress 12/18/08 23:37:38 12/20/08 09:13:38 0.00 0.0 / 0.0
E000027_ 234A_ 00035p00o_ 0-- Pending Validation 12/13/08 23:44:03 12/17/08 10:58:03 31.10 641.9 / 0.0
E000027_ 234A_ 00035p00o_ 1-- Error 12/13/08 23:42:49 12/18/08 23:20:40 76.15 567.0 / 0.0

So three days for nuffink.... In the messages it sez;

18/12/2008 23:19:28|World Community Grid|Aborting task E000027_234A_00035p00o_1: exceeded CPU time limit 298094.146307
18/12/2008 23:19:33|World Community Grid|Computation for task E000027_234A_00035p00o_1 finished
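
The limit in that abort message is in seconds, so it helps to convert it to hours before comparing it with the run times in the results list. A minimal Python sketch, assuming the message line has exactly the format shown above; the regular expression and the function name `parse_abort` are made up for illustration and are not part of BOINC:

```python
import re

# The abort line quoted above, copied verbatim.
LINE = ("18/12/2008 23:19:28|World Community Grid|Aborting task "
        "E000027_234A_00035p00o_1: exceeded CPU time limit 298094.146307")

# Capture the task name and the limit (in seconds) from the message text.
PATTERN = re.compile(r"Aborting task (\S+): exceeded CPU time limit ([0-9.]+)")


def parse_abort(line):
    """Return (task name, limit in hours), or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return match.group(1), float(match.group(2)) / 3600.0


if __name__ == "__main__":
    task, hours = parse_abort(LINE)
    print(f"{task}: CPU time limit was {hours:.1f} hours")
```

For the line above this gives a limit of roughly 82.8 hours.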

I've never had this problem before (my machine's a P4 3GHz with 2GB RAM), but it's too late to follow knreed's suggestion. I'll just go and do something else instead.

In the meantime, I'll keep checking the notices to see when it's safe to return.
[Dec 19, 2008 9:24:56 AM]