World Community Grid Forums
Category: Completed Research
Forum: Discovering Dengue Drugs - Together - Phase 2
Thread: extremely long running DDD2 w/u
Thread Status: Active Total posts in this thread: 37
Powhatan
Advanced Cruncher Joined: Oct 20, 2009 Post Count: 58 Status: Offline Project Badges: |
ts02_ c283_ sr45a1_ 0-- patawomeck In Progress 2/1/11 13:52:50 2/11/11 13:52:50 0.00 0.0 / 0.0
ts02_ c283_ sr45a0_ 1-- patawomeck In Progress 2/1/11 13:52:50 2/11/11 13:52:50 0.00 0.0 / 0.0
ts02_ c284_ sr34b1_ 1-- patawomeck In Progress 2/1/11 13:55:45 2/11/11 13:55:45 0.00 0.0 / 0.0
Wingmen have errors for the first two, PV for the last one. I'm aborting the first two and continuing with the last one. Need to fill the cache back up. Thanks SekeRob.
||
|
JollyJimmy
Advanced Cruncher USA Joined: Aug 23, 2005 Post Count: 115 Status: Offline Project Badges: |
For some, including me, it kicked the task out of an endless loop and let it finish normally. I smell a small run of betas in the near future! It would be nice if those of us who encountered the problem got first pick on the betas, though. Of course, that would be strictly for the reason that apparently the problem is more likely to occur on our machines than on the "others". Right? Right!!!
||
|
RaymondFO
Veteran Cruncher USA Joined: Nov 30, 2004 Post Count: 561 Status: Offline Project Badges: |
I picked up three of those units and they all resulted in errors. The Linux box error message ended with "INFO: No state to restore. Start from the beginning."
Work units in question:
ts02_c237_sr23b1
ts02_c283_sr34a0
ts02_c223_sr91a0

<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
Maximum elapsed time exceeded
</message>
<stderr_txt>
INFO: No state to restore. Start from the beginning.
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x758522A1
Engaging BOINC Windows Runtime Debugger...
********************
BOINC Windows Runtime Debugger Version 6.3.3
Dump Timestamp : 02/03/11 11:59:23
Install Directory : C:\Program Files (x86)\BOINC\
Data Directory : C:\ProgramData\BOINC
Project Symstore :
Loaded Library : C:\Program Files (x86)\BOINC\\dbghelp.dll
Loaded Library : C:\Program Files (x86)\BOINC\\symsrv.dll
Loaded Library : C:\Program Files (x86)\BOINC\\srcsrv.dll
LoadLibraryA( C:\Program Files (x86)\BOINC\\version.dll ): GetLastError = 126
Loaded Library : version.dll
Debugger Engine : 4.0.5.0
Symbol Search Path: C:\ProgramData\BOINC\slots\8;C:\ProgramData\BOINC\projects\www.worldcommunitygrid.org

ModLoad: 00400000 16445000 C:\ProgramData\BOINC\projects\www.worldcommunitygrid.org\wcg_dddt2_charmm_6.17_windows_intelx86 (-nosymbols- Symbols Loaded)
ModLoad: 77750000 00180000 C:\Windows\SysWOW64\ntdll.dll (6.1.7600.16559) (-exported- Symbols Loaded)
    File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
    Company Name : Microsoft Corporation
    Product Name : MicrosoftWindowsOperating System
    Product Version : 6.1.7600.16385
ModLoad: 75a80000 00100000 C:\Windows\syswow64\kernel32.dll (6.1.7600.16385) (-exported- Symbols Loaded)
    File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
    Company Name : Microsoft Corporation
    Product Name : MicrosoftWindowsOperating System
    Product Version : 6.1.7600.16385
ModLoad: 75840000 00046000 C:\Windows\syswow64\KERNELBASE.dll (6.1.7600.16385) (-exported- Symbols Loaded)
    File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
    Company Name : Microsoft Corporation
    Product Name : MicrosoftWindowsOperating System
    Product Version : 6.1.7600.16385
ModLoad: 73ad0000 00009000 C:\Windows\system32\VERSION.dll (6.1.7600.16385) (-exported- Symbols Loaded)
    File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
    Company Name : Microsoft Corporation
    Product Name : MicrosoftWindowsOperating System
    Product Version : 6.1.7600.16385
ModLoad: 75670000 000ac000 C:\Windows\syswow64\msvcrt.dll (7.0.7600.16385) (-exported- Symbols Loaded)
    File Version : 7.0.7600.16385 (win7_rtm.090713-1255)
    Company Name : Microsoft Corporation
    Product Name : MicrosoftWindowsOperating System
    Product Version : 7.0.7600.16385
ModLoad: 76cc0000 00100000 C:\Windows\syswow64\USER32.dll (6.1.7600.16385) (-exported- Symbols Loaded)
    File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
    Company Name : Microsoft Corporation
    Product Name : MicrosoftWindowsOperating System
    Product Version : 6.1.7600.16385
ModLoad: 75730000 00090000 C:\Windows\syswow64\GDI32.dll (6.1.7600.16385) (-exported- Symbols Loaded)
    File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
    Company Name : Microsoft Corporation
    Product Name : MicrosoftWindowsOperating System
    Product Version : 6.1.7600.16385
ModLoad: 75720000 0000a000 C:\Windows\syswow64\LPK.dll (6.1.7600.16385) (-exported- Symbols Loaded)
    File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
    Company Name : Microsoft Corporation
    Product Name : MicrosoftWindowsOperating System
    Product Version : 6.1.7600.16385
ModLoad: 75350000 0009d000 C:\Windows\syswow64\USP10.dll (1.626.7600.16385) (-exported- Symbols Loaded)
    File Version : 1.0626.7600.16385 (win7_rtm.090713-1255)
    Company Name : Microsoft Corporation
    Product Name : Microsoft(R) Uniscribe Unicode script processor
    Product Version : 1.0626.7600.16385
ModLoad: 75c10000 000a0000 C:\Windows\syswow64\ADVAPI32.dll (6.1.7600.16385) (-exported- Symbols Loaded)
    File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
    Company Name : Microsoft Corporation
    Product Name : MicrosoftWindowsOperating System
    Product Version : 6.1.7600.16385
ModLoad: 75820000 00019000 C:\Windows\SysWOW64\sechost.dll (6.1.7600.16385) (-exported- Symbols Loaded)
    File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
    Company Name : Microsoft Corporation
    Product Name : MicrosoftWindowsOperating System
    Product Version : 6.1.7600.16385
ModLoad: 754b0000 000f0000 C:\Windows\syswow64\RPCRT4.dll (6.1.7600.16385) (-exported- Symbols Loaded)
    File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
    Company Name : Microsoft Corporation
    Product Name : MicrosoftWindowsOperating System
    Product Version : 6.1.7600.16385
ModLoad: 752c0000 00060000 C:\Windows\syswow64\SspiCli.dll (6.1.7600.16484) (-exported- Symbols Loaded)
    File Version : 6.1.7600.16484 (win7_gdr.091210-1534)
    Company Name : Microsoft Corporation
    Product Name : MicrosoftWindowsOperating System
    Product Version : 6.1.7600.16484
ModLoad: 752b0000 0000c000 C:\Windows\syswow64\CRYPTBASE.dll (6.1.7600.16385) (-exported- Symbols Loaded)
    File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
    Company Name : Microsoft Corporation
    Product Name : MicrosoftWindowsOperating System
    Product Version : 6.1.7600.16385
ModLoad: 753f0000 0002a000 C:\Windows\syswow64\imagehlp.dll (6.1.7600.16385) (-exported- Symbols Loaded)
    File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
    Company Name : Microsoft Corporation
    Product Name : MicrosoftWindowsOperating System
    Product Version : 6.1.7600.16385
ModLoad: 757c0000 00060000 C:\Windows\system32\IMM32.DLL (6.1.7600.16385) (-exported- Symbols Loaded)
    File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
    Company Name : Microsoft Corporation
    Product Name : MicrosoftWindowsOperating System
    Product Version : 6.1.7600.16385
ModLoad: 755a0000 000cc000 C:\Windows\syswow64\MSCTF.dll (6.1.7600.16385) (-exported- Symbols Loaded)
    File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
    Company Name : Microsoft Corporation
    Product Name : MicrosoftWindowsOperating System
    Product Version : 6.1.7600.16385
ModLoad: 739c0000 00021000 C:\Windows\system32\ntmarta.dll (6.1.7600.16385) (-exported- Symbols Loaded)
    File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
    Company Name : Microsoft Corporation
    Product Name : MicrosoftWindowsOperating System
    Product Version : 6.1.7600.16385
ModLoad: 77280000 00045000 C:\Windows\syswow64\WLDAP32.dll (6.1.7600.16385) (-exported- Symbols Loaded)
    File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
    Company Name : Microsoft Corporation
    Product Name : MicrosoftWindowsOperating System
    Product Version : 6.1.7600.16385
ModLoad: 74650000 00115000 C:\Program Files (x86)\BOINC\dbghelp.dll (6.8.4.0) (-exported- Symbols Loaded)
    File Version : 6.8.0004.0 (debuggers(dbg).070515-1751)
    Company Name : Microsoft Corporation
    Product Name : Debugging Tools for Windows(R)
    Product Version : 6.8.0004.0
ModLoad: 748e0000 00048000 C:\Program Files (x86)\BOINC\symsrv.dll (6.8.4.0) (-exported- Symbols Loaded)
    File Version : 6.8.0004.0 (debuggers(dbg).070515-1751)
    Company Name : Microsoft Corporation
    Product Name : Debugging Tools for Windows(R)
    Product Version : 6.8.0004.0
ModLoad: 74e50000 0003b000 C:\Program Files (x86)\BOINC\srcsrv.dll (6.8.4.0) (-exported- Symbols Loaded)
    File Version : 6.8.0004.0 (debuggers(dbg).070515-1751)
    Company Name : Microsoft Corporation
    Product Name : Debugging Tools for Windows(R)
    Product Version : 6.8.0004.0

*** Dump of the Process Statistics: ***
- I/O Operations Counters -
Read: 0, Write: 0, Other 0
- I/O Transfers Counters -
Read: 0, Write: 0, Other 0
- Paged Pool Usage -
QuotaPagedPoolUsage: 0, QuotaPeakPagedPoolUsage: 0
QuotaNonPagedPoolUsage: 0, QuotaPeakNonPagedPoolUsage: 0
- Virtual Memory Usage -
VirtualSize: 0, PeakVirtualSize: 0
- Pagefile Usage -
PagefileUsage: 0, PeakPagefileUsage: 0
- Working Set Size -
WorkingSetSize: 0, PeakWorkingSetSize: 0, PageFaultCount: 0

*** Dump of thread ID 4856 (state: Initialized): ***
- Information -
Status: Base Priority: Normal, Priority: Normal, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 0.000000
- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x758522A1
- Registers -
eax=00000000 ebx=00000000 ecx=00b2a746 edx=1e68613c esi=75a910ef edi=00000001
eip=758522a1 esp=1e68fb70 ebp=1e68ff94
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
- Callstack -
ChildEBP RetAddr Args to Child
1e68ff94 77789d42 00000000 4776ba71 00000000 00000000 KERNELBASE!DebugBreak+0x0
1e68ffd4 77789d15 0043f280 00000000 00000000 00000000 ntdll!RtlInitializeExceptionChain+0x0
1e68ffec 00000000 0043f280 00000000 00000000 00000000 ntdll!RtlInitializeExceptionChain+0x0

*** Dump of thread ID 3516 (state: Initialized): ***
- Information -
Status: Base Priority: Normal, Priority: Normal, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 0.000000
- Registers -
eax=11915ae0 ebx=11a00860 ecx=00000420 edx=000001ee esi=11aeb5e0 edi=210d1920
eip=0079290c esp=1884cdc0 ebp=1884d1e8
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
- Callstack -
ChildEBP RetAddr Args to Child
1884d1e8 004a35a9 1884d7a0 1884d7a8 11c4bc20 138e7654 wcg_dddt2_charmm_6!+0x0
1884d578 009f93e1 1884d7a8 1884d7a0 138e4d80 00be24e8 wcg_dddt2_charmm_6!+0x0
1884d7e8 008b3126 1341dc40 135089c0 135f3740 11915ae0 wcg_dddt2_charmm_6!+0x0
1884da28 00983847 1341dc40 135089c0 135f3740 11915ae0 wcg_dddt2_charmm_6!+0x0
1884e128 0066ee5a 1231ded8 1231b790 12319048 133a8580 wcg_dddt2_charmm_6!+0x0
1884ebd8 006683d6 1884ec28 00c9c5c0 00000036 1232a340 wcg_dddt2_charmm_6!+0x0
1884ee8c 00447312 14204000 138e61e0 00001388 00000020 wcg_dddt2_charmm_6!+0x0
1884f018 00445640 00b2b3b0 138e61e0 ffffffff 00445640 wcg_dddt2_charmm_6!+0x0
1884f214 0042d6e2 009d0013 00a20012 00000001 00410040 wcg_dddt2_charmm_6!+0x0
1884feac 00b03052 0000001c 00141498 001416c8 00000094 wcg_dddt2_charmm_6!+0x0
1884ff88 75a93677 7efde000 1884ffd4 77789d42 7efde000 wcg_dddt2_charmm_6!+0x0
1884ff94 77789d42 7efde000 419aba71 00000000 00000000 kernel32!BaseThreadInitThunk+0x0
1884ffd4 77789d15 00b02ee6 7efde000 00000000 00000000 ntdll!RtlInitializeExceptionChain+0x0
1884ffec 00000000 00b02ee6 7efde000 00000000 235b3003 ntdll!RtlInitializeExceptionChain+0x0

*** Dump of thread ID 3392 (state: Initialized): ***
- Information -
Status: Base Priority: Normal, Priority: Normal, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 0.000000
- Registers -
eax=00b05e35 ebx=00000000 ecx=00000000 edx=00000000 esi=000000e4 edi=00000000
eip=7776f861 esp=2068fea4 ebp=2068ff10
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
- Callstack -
ChildEBP RetAddr Args to Child
2068ff10 75a91184 000000e4 ffffffff 00000000 1886e020 ntdll!NtWaitForSingleObject+0x0
2068ff28 75a91138 000000e4 ffffffff 00000000 2068ff54 kernel32!WaitForSingleObjectEx+0x0
2068ff3c 00ac5d2f 000000e4 ffffffff 0014e2f0 0014e2f0 kernel32!WaitForSingleObject+0x0
2068ff54 00b05ea1 1886e020 00000000 00000000 0014e2f0 wcg_dddt2_charmm_6!+0x0
2068ff88 75a93677 0014e2f0 2068ffd4 77789d42 0014e2f0 wcg_dddt2_charmm_6!+0x0
2068ff94 77789d42 0014e2f0 7976ba71 00000000 00000000 kernel32!BaseThreadInitThunk+0x0
2068ffd4 77789d15 00b05e35 0014e2f0 00000000 00000000 ntdll!RtlInitializeExceptionChain+0x0
2068ffec 00000000 00b05e35 0014e2f0 00000000 20da0000 ntdll!RtlInitializeExceptionChain+0x0

*** Debug Message Dump ****
*** Foreground Window Data ***
Window Name :
Window Class :
Window Process ID: 0
Window Thread ID : 0
Exiting...
</stderr_txt>
]]>
||
|
themoonscrescent
Veteran Cruncher UK Joined: Jul 1, 2006 Post Count: 1320 Status: Offline Project Badges: |
I've got a work unit: tso2_c283_sr91b0_0
It is 5 hrs 20 into it and 4.166% complete, with 11 hrs 20 (and growing) to completion; my wingman's copy reads as done in 1.00. I restarted my system to see if that would get it moving, but nothing so far (albeit that was only about 5 minutes ago). Should I let it run or abort it? Cheers.
||
|
GB033533
Senior Cruncher UK Joined: Dec 8, 2004 Post Count: 198 Status: Offline Project Badges: |
"should I let it run or abort it?"
The techs have yet to decide, but I'd suggest letting it run. It should time out after 12-13 hours...
||
|
themoonscrescent
Veteran Cruncher UK Joined: Jul 1, 2006 Post Count: 1320 Status: Offline Project Badges: |
Thanks for your post. I have let it continue; it is now up to 5.166% at 6 hrs 50+.
D.
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
tmc, at that rate it's not going to make it. Please check the wingman's result and consider aborting. There is nothing to be won from 12-14 hours of run time ending in Maximum Elapsed Time Exceeded.
ttyl |
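For anyone who wants to run the numbers, here is a rough sketch of the "at that rate" estimate. It assumes progress is linear (which a looping task may not be) and uses a 13-hour cut-off taken from the 12-14 hour figures mentioned in this thread, not from any official setting. The function names are made up for illustration.

```python
def projected_total_hours(percent_done: float, elapsed_hours: float) -> float:
    """Linearly extrapolate total run time from progress so far."""
    if percent_done <= 0:
        raise ValueError("no progress yet, cannot extrapolate")
    return elapsed_hours * 100.0 / percent_done


def will_exceed_limit(percent_done: float, elapsed_hours: float,
                      limit_hours: float = 13.0) -> bool:
    """True if the linear projection runs past the assumed cut-off."""
    return projected_total_hours(percent_done, elapsed_hours) > limit_hours


# Figures reported above: 5.166% done after 6 hrs 50 min.
elapsed = 6 + 50 / 60
total = projected_total_hours(5.166, elapsed)
print(f"projected total: {total:.1f} h, exceeds limit: {will_exceed_limit(5.166, elapsed)}")
```

With the figures reported above, the projection comes out at well over 100 hours, so the task would hit the time limit long before finishing.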
||
|
themoonscrescent
Veteran Cruncher UK Joined: Jul 1, 2006 Post Count: 1320 Status: Offline Project Badges: |
Just to update:
I aborted the work unit as per Seke's advice. The work unit did validate with 2 wingmen, taking an hour for one and a little over 1 hour for the other. A third wingman errored out at 20.50 hours, so I'm happy I did abort it. For some reason it just didn't like our systems.
||
|
GB033533
Senior Cruncher UK Joined: Dec 8, 2004 Post Count: 198 Status: Offline Project Badges: |
...and the third wingman at 20.5 hrs got nothing for his efforts? In which case you were wise to abort! But had you persevered to your error, would you both have received some credit?
I've had two of these that errored at 13 hrs, and I'm hoping the remaining original wingmen see them through. The repair wingmen for both my errors completed in a normal time. So it seems to hang by a thread whether I get credit or not... Like you said, they just didn't like our systems.
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Sorry, but if the wingman sees them through in 1 hour and claims 15 credit, and the repair man also manages in 1 hour and claims 17, how can an already faulty task of 12 hours be of any satisfaction at all?
Just this morning I happened to catch one "sr" in the act again at 0.833% after 1.5 hours, checkpointing every 18 minutes, and the second "pr" on the other core at 30% after 1.5 hours. I sent it off to boot camp without looking **, as "the trick" did not cause it to pick up. The other 10 hours are used for good results, which is what the crunch is about, last I reflected on what the root purpose of the effort is.
** Did look: 1 succeeded and 3 others had the infamous 'max-out' error after 13 hours, each claiming 240 credit. No more copies were created after my abort.
||
|
|