Thread Status: Active
Total posts in this thread: 37
This topic has been viewed 3221 times and has 36 replies
Powhatan
Advanced Cruncher
Joined: Oct 20, 2009
Post Count: 58
Status: Offline
Re: extremely long running w/u

ts02_ c283_ sr45a1_ 0-- patawomeck In Progress 2/1/11 13:52:50 2/11/11 13:52:50 0.00 0.0 / 0.0
ts02_ c283_ sr45a0_ 1-- patawomeck In Progress 2/1/11 13:52:50 2/11/11 13:52:50 0.00 0.0 / 0.0
ts02_ c284_ sr34b1_ 1-- patawomeck In Progress 2/1/11 13:55:45 2/11/11 13:55:45 0.00 0.0 / 0.0
Wingmen have errors for the first two, PV for the last one. I'm aborting the first two and continuing with the last one. Need to fill the cache back up. :D Thanks SekeRob.
[Feb 4, 2011 3:30:57 PM]
JollyJimmy
Advanced Cruncher
USA
Joined: Aug 23, 2005
Post Count: 115
Status: Offline
Re: extremely long running w/u

For some, including me, it kicked the task out of an endless loop and let it finish normally.
Shhh... I smell a small run of betas in the near future!
It would be nice if those of us who encountered the problem got first pick of the betas, though.
Of course, that would be strictly because the problem is apparently more likely to occur on our machines than on the "others". Right? Right!!!
[Feb 4, 2011 5:54:32 PM]
RaymondFO
Veteran Cruncher
USA
Joined: Nov 30, 2004
Post Count: 561
Status: Offline
Re: extremely long running w/u

I picked up three of those units and they all resulted in errors. The Linux box error message ended with "INFO: No state to restore. Start from the beginning."

Work units in question:
ts02_c237_sr23b1
ts02_c283_sr34a0
ts02_c223_sr91a0

<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
Maximum elapsed time exceeded
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x758522A1

Engaging BOINC Windows Runtime Debugger...



********************


BOINC Windows Runtime Debugger Version 6.3.3


Dump Timestamp : 02/03/11 11:59:23
Install Directory : C:\Program Files (x86)\BOINC\
Data Directory : C:\ProgramData\BOINC
Project Symstore :
Loaded Library : C:\Program Files (x86)\BOINC\\dbghelp.dll
Loaded Library : C:\Program Files (x86)\BOINC\\symsrv.dll
Loaded Library : C:\Program Files (x86)\BOINC\\srcsrv.dll
LoadLibraryA( C:\Program Files (x86)\BOINC\\version.dll ): GetLastError = 126
Loaded Library : version.dll
Debugger Engine : 4.0.5.0
Symbol Search Path: C:\ProgramData\BOINC\slots\8;C:\ProgramData\BOINC\projects\www.worldcommunitygrid.org


ModLoad: 00400000 16445000 C:\ProgramData\BOINC\projects\www.worldcommunitygrid.org\wcg_dddt2_charmm_6.17_windows_intelx86 (-nosymbols- Symbols Loaded)

ModLoad: 77750000 00180000 C:\Windows\SysWOW64\ntdll.dll (6.1.7600.16559) (-exported- Symbols Loaded)
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : MicrosoftWindowsOperating System
Product Version : 6.1.7600.16385

ModLoad: 75a80000 00100000 C:\Windows\syswow64\kernel32.dll (6.1.7600.16385) (-exported- Symbols Loaded)
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : MicrosoftWindowsOperating System
Product Version : 6.1.7600.16385

ModLoad: 75840000 00046000 C:\Windows\syswow64\KERNELBASE.dll (6.1.7600.16385) (-exported- Symbols Loaded)
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : MicrosoftWindowsOperating System
Product Version : 6.1.7600.16385

ModLoad: 73ad0000 00009000 C:\Windows\system32\VERSION.dll (6.1.7600.16385) (-exported- Symbols Loaded)
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : MicrosoftWindowsOperating System
Product Version : 6.1.7600.16385

ModLoad: 75670000 000ac000 C:\Windows\syswow64\msvcrt.dll (7.0.7600.16385) (-exported- Symbols Loaded)
File Version : 7.0.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : MicrosoftWindowsOperating System
Product Version : 7.0.7600.16385

ModLoad: 76cc0000 00100000 C:\Windows\syswow64\USER32.dll (6.1.7600.16385) (-exported- Symbols Loaded)
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : MicrosoftWindowsOperating System
Product Version : 6.1.7600.16385

ModLoad: 75730000 00090000 C:\Windows\syswow64\GDI32.dll (6.1.7600.16385) (-exported- Symbols Loaded)
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : MicrosoftWindowsOperating System
Product Version : 6.1.7600.16385

ModLoad: 75720000 0000a000 C:\Windows\syswow64\LPK.dll (6.1.7600.16385) (-exported- Symbols Loaded)
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : MicrosoftWindowsOperating System
Product Version : 6.1.7600.16385

ModLoad: 75350000 0009d000 C:\Windows\syswow64\USP10.dll (1.626.7600.16385) (-exported- Symbols Loaded)
File Version : 1.0626.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : Microsoft(R) Uniscribe Unicode script processor
Product Version : 1.0626.7600.16385

ModLoad: 75c10000 000a0000 C:\Windows\syswow64\ADVAPI32.dll (6.1.7600.16385) (-exported- Symbols Loaded)
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : MicrosoftWindowsOperating System
Product Version : 6.1.7600.16385

ModLoad: 75820000 00019000 C:\Windows\SysWOW64\sechost.dll (6.1.7600.16385) (-exported- Symbols Loaded)
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : MicrosoftWindowsOperating System
Product Version : 6.1.7600.16385

ModLoad: 754b0000 000f0000 C:\Windows\syswow64\RPCRT4.dll (6.1.7600.16385) (-exported- Symbols Loaded)
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : MicrosoftWindowsOperating System
Product Version : 6.1.7600.16385

ModLoad: 752c0000 00060000 C:\Windows\syswow64\SspiCli.dll (6.1.7600.16484) (-exported- Symbols Loaded)
File Version : 6.1.7600.16484 (win7_gdr.091210-1534)
Company Name : Microsoft Corporation
Product Name : MicrosoftWindowsOperating System
Product Version : 6.1.7600.16484

ModLoad: 752b0000 0000c000 C:\Windows\syswow64\CRYPTBASE.dll (6.1.7600.16385) (-exported- Symbols Loaded)
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : MicrosoftWindowsOperating System
Product Version : 6.1.7600.16385

ModLoad: 753f0000 0002a000 C:\Windows\syswow64\imagehlp.dll (6.1.7600.16385) (-exported- Symbols Loaded)
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : MicrosoftWindowsOperating System
Product Version : 6.1.7600.16385

ModLoad: 757c0000 00060000 C:\Windows\system32\IMM32.DLL (6.1.7600.16385) (-exported- Symbols Loaded)
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : MicrosoftWindowsOperating System
Product Version : 6.1.7600.16385

ModLoad: 755a0000 000cc000 C:\Windows\syswow64\MSCTF.dll (6.1.7600.16385) (-exported- Symbols Loaded)
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : MicrosoftWindowsOperating System
Product Version : 6.1.7600.16385

ModLoad: 739c0000 00021000 C:\Windows\system32\ntmarta.dll (6.1.7600.16385) (-exported- Symbols Loaded)
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : MicrosoftWindowsOperating System
Product Version : 6.1.7600.16385

ModLoad: 77280000 00045000 C:\Windows\syswow64\WLDAP32.dll (6.1.7600.16385) (-exported- Symbols Loaded)
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : MicrosoftWindowsOperating System
Product Version : 6.1.7600.16385

ModLoad: 74650000 00115000 C:\Program Files (x86)\BOINC\dbghelp.dll (6.8.4.0) (-exported- Symbols Loaded)
File Version : 6.8.0004.0 (debuggers(dbg).070515-1751)
Company Name : Microsoft Corporation
Product Name : Debugging Tools for Windows(R)
Product Version : 6.8.0004.0

ModLoad: 748e0000 00048000 C:\Program Files (x86)\BOINC\symsrv.dll (6.8.4.0) (-exported- Symbols Loaded)
File Version : 6.8.0004.0 (debuggers(dbg).070515-1751)
Company Name : Microsoft Corporation
Product Name : Debugging Tools for Windows(R)
Product Version : 6.8.0004.0

ModLoad: 74e50000 0003b000 C:\Program Files (x86)\BOINC\srcsrv.dll (6.8.4.0) (-exported- Symbols Loaded)
File Version : 6.8.0004.0 (debuggers(dbg).070515-1751)
Company Name : Microsoft Corporation
Product Name : Debugging Tools for Windows(R)
Product Version : 6.8.0004.0



*** Dump of the Process Statistics: ***

- I/O Operations Counters -
Read: 0, Write: 0, Other 0

- I/O Transfers Counters -
Read: 0, Write: 0, Other 0

- Paged Pool Usage -
QuotaPagedPoolUsage: 0, QuotaPeakPagedPoolUsage: 0
QuotaNonPagedPoolUsage: 0, QuotaPeakNonPagedPoolUsage: 0

- Virtual Memory Usage -
VirtualSize: 0, PeakVirtualSize: 0

- Pagefile Usage -
PagefileUsage: 0, PeakPagefileUsage: 0

- Working Set Size -
WorkingSetSize: 0, PeakWorkingSetSize: 0, PageFaultCount: 0

*** Dump of thread ID 4856 (state: Initialized): ***

- Information -
Status: Base Priority: Normal, Priority: Normal, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 0.000000

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x758522A1

- Registers -
eax=00000000 ebx=00000000 ecx=00b2a746 edx=1e68613c esi=75a910ef edi=00000001
eip=758522a1 esp=1e68fb70 ebp=1e68ff94
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246

- Callstack -
ChildEBP RetAddr Args to Child
1e68ff94 77789d42 00000000 4776ba71 00000000 00000000 KERNELBASE!DebugBreak+0x0
1e68ffd4 77789d15 0043f280 00000000 00000000 00000000 ntdll!RtlInitializeExceptionChain+0x0
1e68ffec 00000000 0043f280 00000000 00000000 00000000 ntdll!RtlInitializeExceptionChain+0x0

*** Dump of thread ID 3516 (state: Initialized): ***

- Information -
Status: Base Priority: Normal, Priority: Normal, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 0.000000

- Registers -
eax=11915ae0 ebx=11a00860 ecx=00000420 edx=000001ee esi=11aeb5e0 edi=210d1920
eip=0079290c esp=1884cdc0 ebp=1884d1e8
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246

- Callstack -
ChildEBP RetAddr Args to Child
1884d1e8 004a35a9 1884d7a0 1884d7a8 11c4bc20 138e7654 wcg_dddt2_charmm_6!+0x0
1884d578 009f93e1 1884d7a8 1884d7a0 138e4d80 00be24e8 wcg_dddt2_charmm_6!+0x0
1884d7e8 008b3126 1341dc40 135089c0 135f3740 11915ae0 wcg_dddt2_charmm_6!+0x0
1884da28 00983847 1341dc40 135089c0 135f3740 11915ae0 wcg_dddt2_charmm_6!+0x0
1884e128 0066ee5a 1231ded8 1231b790 12319048 133a8580 wcg_dddt2_charmm_6!+0x0
1884ebd8 006683d6 1884ec28 00c9c5c0 00000036 1232a340 wcg_dddt2_charmm_6!+0x0
1884ee8c 00447312 14204000 138e61e0 00001388 00000020 wcg_dddt2_charmm_6!+0x0
1884f018 00445640 00b2b3b0 138e61e0 ffffffff 00445640 wcg_dddt2_charmm_6!+0x0
1884f214 0042d6e2 009d0013 00a20012 00000001 00410040 wcg_dddt2_charmm_6!+0x0
1884feac 00b03052 0000001c 00141498 001416c8 00000094 wcg_dddt2_charmm_6!+0x0
1884ff88 75a93677 7efde000 1884ffd4 77789d42 7efde000 wcg_dddt2_charmm_6!+0x0
1884ff94 77789d42 7efde000 419aba71 00000000 00000000 kernel32!BaseThreadInitThunk+0x0
1884ffd4 77789d15 00b02ee6 7efde000 00000000 00000000 ntdll!RtlInitializeExceptionChain+0x0
1884ffec 00000000 00b02ee6 7efde000 00000000 235b3003 ntdll!RtlInitializeExceptionChain+0x0

*** Dump of thread ID 3392 (state: Initialized): ***

- Information -
Status: Base Priority: Normal, Priority: Normal, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 0.000000

- Registers -
eax=00b05e35 ebx=00000000 ecx=00000000 edx=00000000 esi=000000e4 edi=00000000
eip=7776f861 esp=2068fea4 ebp=2068ff10
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202

- Callstack -
ChildEBP RetAddr Args to Child
2068ff10 75a91184 000000e4 ffffffff 00000000 1886e020 ntdll!NtWaitForSingleObject+0x0
2068ff28 75a91138 000000e4 ffffffff 00000000 2068ff54 kernel32!WaitForSingleObjectEx+0x0
2068ff3c 00ac5d2f 000000e4 ffffffff 0014e2f0 0014e2f0 kernel32!WaitForSingleObject+0x0
2068ff54 00b05ea1 1886e020 00000000 00000000 0014e2f0 wcg_dddt2_charmm_6!+0x0
2068ff88 75a93677 0014e2f0 2068ffd4 77789d42 0014e2f0 wcg_dddt2_charmm_6!+0x0
2068ff94 77789d42 0014e2f0 7976ba71 00000000 00000000 kernel32!BaseThreadInitThunk+0x0
2068ffd4 77789d15 00b05e35 0014e2f0 00000000 00000000 ntdll!RtlInitializeExceptionChain+0x0
2068ffec 00000000 00b05e35 0014e2f0 00000000 20da0000 ntdll!RtlInitializeExceptionChain+0x0


*** Debug Message Dump ****


*** Foreground Window Data ***
Window Name :
Window Class :
Window Process ID: 0
Window Thread ID : 0

Exiting...

</stderr_txt>
]]>
[Feb 4, 2011 7:16:29 PM]
themoonscrescent
Veteran Cruncher
UK
Joined: Jul 1, 2006
Post Count: 1320
Status: Offline
Re: extremely long running w/u

I've got a work unit: tso2_c283_sr91b0_0

I'm 5 hrs 20 min into it and 4.166% complete, with 11 hrs 20 min (and growing) to completion. My wingman's copy reads as done in 1.00.

I have just restarted my system to see if that'll get it moving, with nothing so far (albeit that was only about 5 mins ago). Should I let it run or abort it?

Cheers.
[Feb 5, 2011 3:13:15 PM]
GB033533
Senior Cruncher
UK
Joined: Dec 8, 2004
Post Count: 198
Status: Offline
Re: extremely long running w/u

should I let it run or abort it?


The techs have yet to decide, but I'd suggest letting it run. It should time out after 12-13 hours...
[Feb 5, 2011 3:22:16 PM]
themoonscrescent
Veteran Cruncher
UK
Joined: Jul 1, 2006
Post Count: 1320
Status: Offline
Re: extremely long running w/u

Thanks for your post. I have let it continue; it's now up to 5.166% at 6 hrs 50+.

D.
[Feb 5, 2011 4:54:04 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: extremely long running w/u

tmc, at that rate it's not going to make it. Please check on your wingman and consider aborting. There is nothing to be won from 12-14 hours of run time ending in Maximum Elapsed Time Exceeded.
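
A rough sketch of the arithmetic behind "at that rate", assuming progress stays linear (a simplification; the client's own estimate may behave differently). The function names are made up for illustration, and the 14-hour cutoff is just the upper end of the 12-14 hours mentioned in this thread, not a documented limit:

```python
def projected_total_hours(elapsed_hours: float, percent_done: float) -> float:
    """Linearly extrapolate the total run time from progress so far."""
    if percent_done <= 0:
        raise ValueError("percent_done must be positive")
    return elapsed_hours * 100.0 / percent_done


def will_exceed_cutoff(elapsed_hours: float, percent_done: float,
                       cutoff_hours: float) -> bool:
    """True if the extrapolated run time is longer than the cutoff."""
    return projected_total_hours(elapsed_hours, percent_done) > cutoff_hours


# Figures reported in this thread: 6 hrs 50 min elapsed at 5.166% complete.
elapsed = 6 + 50 / 60
total = projected_total_hours(elapsed, 5.166)
print(f"projected total: {total:.0f} hours")

# Hypothetical 14-hour cutoff (assumption, see above).
print(will_exceed_cutoff(elapsed, 5.166, cutoff_hours=14))
```

With those figures the projection lands well beyond a hundred hours, so the task would hit any 12-14 hour limit long before finishing.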

ttyl
[Feb 5, 2011 5:05:53 PM]
themoonscrescent
Veteran Cruncher
UK
Joined: Jul 1, 2006
Post Count: 1320
Status: Offline
Re: extremely long running w/u

Just to update:

I aborted the work unit as per Seke's advice. The work unit did validate with 2 wingmen, taking an hour for one and a little over 1 hour for the other.

A third wingman errored out at 20.50 hours, so I'm happy I did abort it. For some reason it just didn't like our systems.
[Feb 7, 2011 12:50:07 PM]
GB033533
Senior Cruncher
UK
Joined: Dec 8, 2004
Post Count: 198
Status: Offline
Re: extremely long running w/u

...and the third wingman at 20.5 hrs got nothing for his efforts? In which case you were wise to abort! But had you persevered to your error, would you both have received some credit?

I've had two of these that errored at 13hrs, and I'm hoping the remaining original wingmen see them through. The repair wingmen for both my errors completed in a normal time. So it seems to hang by a thread whether I get credit or not...

Like you said, they just didn't like our systems.
[Feb 7, 2011 3:08:50 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: extremely long running w/u

Sorry, but if the wingman sees them through in 1 hour and claims 15 credit, and the repair man also manages in 1 hour and claims 17, how can a faulty 12-hour task be of any satisfaction at all?

Just this morning I happened to catch an "sr" task in the act again: 0.833% after 1.5 hours, checkpointing every 18 minutes, while a second task, a "pr", on the other core was at 30% after 1.5 hours. I sent the "sr" off to boot camp without looking **, as "the trick" did not cause it to pick up. The other 10 hours are being used for good results, which is what the crunching is about, last I reflected on the root purpose of the effort.

** I did look afterwards: 1 succeeded and 3 others had the infamous 'max-out' error after 13 hours, each claiming 240 credit. No more copies were created after my abort.
[Feb 7, 2011 4:01:03 PM]