| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 264
|
|
| Author |
|
|
yoro42
Ace Cruncher United States Joined: Feb 19, 2011 Post Count: 8979 Status: Offline Project Badges:
|
Those feeling the urge to abort: BEFORE aborting, visit the proper slot directory [select the task and hit Properties, which tells you which slot], then take a copy of stderr.txt. Logs that come from an abort have a lot missing and include the MS file dump, which is just the result of the abort.

Computer: Bird
Project World Community Grid
Name MCM1_0000006_1183_0
Application mcm1 7.24
Workunit name MCM1_0000006_1183
State Running
Received 11/8/2013 8:02:42 PM
Report deadline 11/18/2013 8:02:19 PM
Estimated app speed 2.56 GFLOPs/sec
Estimated task size 49,000 GFLOPs
CPU time at last checkpoint 00:00:00
CPU time 04:29:59
Elapsed time 04:52:42
Estimated time remaining 00:00:00
Fraction done 100.000%
Virtual memory size 47.43 MB
Working set size 49.45 MB
Directory slots/12
Process ID 2332

World Community Grid 7.24 mcm1 MCM1_0000006_1183_0 11/8/2013 8:02:42 PM 04:57:42 (04:34:50) - Running 100.00 92.321 [0] 04:34:50 Bird 11/18/2013 8:02:19 PM 49.45 MB 47.43 MB

Computer: Bird
Project World Community Grid
Name MCM1_0000006_1183_0
Application mcm1 7.24
Workunit name MCM1_0000006_1183
State Waiting to run, Suspended by user
Received 11/8/2013 8:02:42 PM
Report deadline 11/18/2013 8:02:19 PM
Estimated app speed 2.56 GFLOPs/sec
Estimated task size 49,000 GFLOPs
CPU time at last checkpoint 00:00:00
CPU time 04:37:19
Elapsed time 05:00:16
Estimated time remaining 00:00:00
Fraction done 100.000%
Virtual memory size 47.43 MB
Working set size 49.45 MB
Directory slots/12
Process ID 2332

World Community Grid 7.24 mcm1 MCM1_0000006_1183_0 11/8/2013 8:02:42 PM 05:00:16 (04:37:19) - Waiting to run 100.00 92.358 [0] 04:37:19 Bird 11/18/2013 8:02:19 PM

Computer: Bird
Project World Community Grid
Name MCM1_0000006_1183_0
Application mcm1 7.24
Workunit name MCM1_0000006_1183
State Waiting to run
Received 11/8/2013 8:02:42 PM
Report deadline 11/18/2013 8:02:19 PM
Estimated app speed 2.56 GFLOPs/sec
Estimated task size 49,000 GFLOPs
CPU time at last checkpoint 00:00:00
CPU time 04:37:19
Elapsed time 05:00:16
Estimated time remaining 00:00:00
Fraction done 100.000%
Virtual memory size 47.43 MB
Working set size 49.45 MB
Directory slots/12
Process ID 2332

World Community Grid 7.24 mcm1 MCM1_0000006_1183_0 11/8/2013 8:02:42 PM 05:00:30 (04:37:33) - Running 100.00 92.362 [0] 04:37:33 Bird 11/18/2013 8:02:19 PM 49.45 MB 47.43 MB

Computer: Bird
Project World Community Grid
Name MCM1_0000006_1183_0
Application mcm1 7.24
Workunit name MCM1_0000006_1183
State Running
Received 11/8/2013 8:02:42 PM
Report deadline 11/18/2013 8:02:19 PM
Estimated app speed 2.56 GFLOPs/sec
Estimated task size 49,000 GFLOPs
CPU time at last checkpoint 00:00:00
CPU time 04:37:55
Elapsed time 05:00:53
Estimated time remaining 00:00:00
Fraction done 100.000%
Virtual memory size 47.43 MB
Working set size 49.45 MB
Directory slots/12
Process ID 2332

Computer: Bird
Project World Community Grid
Name MCM1_0000006_1183_0
Application mcm1 7.24
Workunit name MCM1_0000006_1183
State Aborted
Received 11/8/2013 8:02:42 PM
Report deadline 11/18/2013 8:02:19 PM
Estimated app speed 2.56 GFLOPs/sec
Estimated task size 49,000 GFLOPs
CPU time at last checkpoint 00:00:00
CPU time 04:39:38
Elapsed time 05:02:37
Estimated time remaining 00:00:00
Fraction done 100%
Virtual memory size 0.00 MB
Working set size 0.00 MB

World Community Grid 7.24 mcm1 MCM1_0000006_1183_0 11/8/2013 8:02:42 PM 05:02:37 (04:39:38) - Aborted (203) 100.00 92.408 Bird 11/18/2013 8:02:19 PM

Will provide additional info when WCG stats update finishes...
[Edit 1 times, last edit by yoro42 at Nov 11, 2013 1:25:08 AM] |
||
|
|
yoro42
Ace Cruncher United States Joined: Feb 19, 2011 Post Count: 8979 Status: Offline Project Badges:
|
World Community Grid 7.24 mcm1 MCM1_0000006_1183_0 11/8/2013 8:02:42 PM 05:02:37 (04:39:38) - Aborted (203) 100.00 92.408 Bird 11/18/2013 8:02:19 PM
Will provide additional info when WCG stats update finishes...

Result Log
Result Name: MCM1_ 0000006_ 1183_ 0--
<core_client_version>7.2.28</core_client_version>
<![CDATA[
<message>
aborted by user
</message>
<stderr_txt>
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.24_windows_x86_64 -SettingsFile MCM1_0000006_1183.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Initializing
wcg_learn_limit = 500000
Running
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.24_windows_x86_64 -SettingsFile MCM1_0000006_1183.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Initializing
wcg_learn_limit = 500000
Running
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.24_windows_x86_64 -SettingsFile MCM1_0000006_1183.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Initializing
wcg_learn_limit = 500000
Running
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.24_windows_x86_64 -SettingsFile MCM1_0000006_1183.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Initializing
wcg_learn_limit = 500000
Running
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.24_windows_x86_64 -SettingsFile MCM1_0000006_1183.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Initializing
wcg_learn_limit = 500000
Running
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x000007FEFD363CA2
Engaging BOINC Windows Runtime Debugger...
******************** BOINC Windows Runtime Debugger Version 7.2.4 Dump Timestamp : 11/10/13 18:05:16 Install Directory : C:\Program Files\BOINC\ Data Directory : C:\ProgramData\BOINC Project Symstore : Loaded Library : C:\Program Files\BOINC\\dbghelp.dll Loaded Library : C:\Program Files\BOINC\\symsrv.dll Loaded Library : C:\Program Files\BOINC\\srcsrv.dll LoadLibraryA( C:\Program Files\BOINC\\version.dll ): GetLastError = 126 Loaded Library : version.dll Debugger Engine : 4.0.5.0 Symbol Search Path: C:\ProgramData\BOINC\slots\12;C:\ProgramData\BOINC\projects\www.worldcommunitygrid.org ModLoad: 000000003fd00000 00000000001ad000 C:\ProgramData\BOINC\projects\www.worldcommunitygrid.org\wcgrid_mcm1_7.24_windows_x86_64 (-exported- Symbols Loaded) Linked PDB Filename : c:\projects\wcgridAustinWorkspace\scienceApps\MCM1\x64\Release\wcgrid_mcm1_prod_64.pdb ModLoad: 00000000771d0000 00000000001a9000 C:\Windows\SYSTEM32\ntdll.dll (6.1.7601.18247) (-exported- Symbols Loaded) Linked PDB Filename : ntdll.pdb File Version : 6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 0000000076fb0000 000000000011f000 C:\Windows\system32\kernel32.dll (6.1.7601.18229) (-exported- Symbols Loaded) Linked PDB Filename : kernel32.pdb File Version : 6.1.7601.18015 (win7sp1_gdr.121129-1432) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7601.18015 ModLoad: 00000000fd330000 000000000006b000 C:\Windows\system32\KERNELBASE.dll (6.1.7601.18229) (-exported- Symbols Loaded) Linked PDB Filename : kernelbase.pdb File Version : 6.1.7601.18015 (win7sp1_gdr.121129-1432) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7601.18015 ModLoad: 00000000fc0c0000 000000000000c000 C:\Windows\system32\VERSION.dll (6.1.7600.16385) (-exported- Symbols Loaded) Linked PDB Filename 
: version.pdb File Version : 6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 00000000fe960000 000000000009f000 C:\Windows\system32\msvcrt.dll (7.0.7601.17744) (-exported- Symbols Loaded) Linked PDB Filename : msvcrt.pdb File Version : 7.0.7601.17744 (win7sp1_gdr.111215-1535) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 7.0.7601.17744 ModLoad: 0000000077390000 0000000000007000 C:\Windows\system32\PSAPI.DLL (6.1.7600.16385) (-exported- Symbols Loaded) Linked PDB Filename : psapi.pdb File Version : 6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 00000000770d0000 00000000000fa000 C:\Windows\system32\USER32.dll (6.1.7601.17514) (-exported- Symbols Loaded) Linked PDB Filename : user32.pdb File Version : 6.1.7601.17514 (win7sp1_rtm.101119-1850) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7601.17514 ModLoad: 00000000fe770000 0000000000067000 C:\Windows\system32\GDI32.dll (6.1.7601.17514) (-exported- Symbols Loaded) Linked PDB Filename : gdi32.pdb File Version : 6.1.7601.17514 (win7sp1_rtm.101119-1850) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7601.17514 ModLoad: 00000000feb60000 000000000000e000 C:\Windows\system32\LPK.dll (6.1.7601.18177) (-exported- Symbols Loaded) Linked PDB Filename : lpk.pdb File Version : 6.1.7601.18177 (win7sp1_gdr.130605-1534) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7601.18177 ModLoad: 00000000feb70000 00000000000c9000 C:\Windows\system32\USP10.dll (1.626.7601.18009) (-exported- Symbols Loaded) Linked PDB Filename : usp10.pdb File Version : 
1.0626.7601.18009 (win7sp1_gdr.121121-1431) Company Name : Microsoft Corporation Product Name : Microsoft(R) Uniscribe Unicode script processor Product Version : 1.0626.7601.18009 ModLoad: 00000000fe7e0000 00000000000db000 C:\Windows\system32\ADVAPI32.dll (6.1.7601.18247) (-exported- Symbols Loaded) Linked PDB Filename : advapi32.pdb File Version : 6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 00000000fe750000 000000000001f000 C:\Windows\SYSTEM32\sechost.dll (6.1.7600.16385) (-exported- Symbols Loaded) Linked PDB Filename : sechost.pdb File Version : 6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 00000000fec40000 000000000012d000 C:\Windows\system32\RPCRT4.dll (6.1.7601.18205) (-exported- Symbols Loaded) Linked PDB Filename : rpcrt4.pdb File Version : 6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 00000000fd4f0000 0000000000d88000 C:\Windows\system32\SHELL32.dll (6.1.7601.18222) (-exported- Symbols Loaded) Linked PDB Filename : shell32.pdb File Version : 6.1.7601.17514 (win7sp1_rtm.101119-1850) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7601.17514 ModLoad: 00000000fe6d0000 0000000000071000 C:\Windows\system32\SHLWAPI.dll (6.1.7601.17514) (-exported- Symbols Loaded) Linked PDB Filename : shlwapi.pdb File Version : 6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 00000000fe360000 000000000002e000 C:\Windows\system32\IMM32.DLL (6.1.7600.16385) (-exported- Symbols Loaded) Linked PDB Filename : imm32.pdb File Version : 
6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 00000000fd3e0000 0000000000109000 C:\Windows\system32\MSCTF.dll (6.1.7600.16385) (-exported- Symbols Loaded) Linked PDB Filename : msctf.pdb File Version : 6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 00000000faba0000 000000000002d000 C:\Windows\system32\ntmarta.dll (6.1.7600.16385) (-exported- Symbols Loaded) Linked PDB Filename : ntmarta.pdb File Version : 6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 00000000fe670000 0000000000052000 C:\Windows\system32\WLDAP32.dll (6.1.7601.17514) (-exported- Symbols Loaded) Linked PDB Filename : wldap32.pdb File Version : 6.1.7600.16385 (win7_rtm.090713-1255) Company Name : Microsoft Corporation Product Name : Microsoft® Windows® Operating System Product Version : 6.1.7600.16385 ModLoad: 0000000072580000 000000000015e000 C:\Program Files\BOINC\dbghelp.dll (6.8.4.0) (-exported- Symbols Loaded) Linked PDB Filename : dbghelp.pdb File Version : 6.8.0004.0 (debuggers(dbg).070519-0745) Company Name : Microsoft Corporation Product Name : Debugging Tools for Windows(R) Product Version : 6.8.0004.0 ModLoad: 00000000730a0000 000000000004e000 C:\Program Files\BOINC\symsrv.dll (6.8.4.0) (-exported- Symbols Loaded) Linked PDB Filename : symsrv.pdb File Version : 6.8.0004.0 (debuggers(dbg).070519-0745) Company Name : Microsoft Corporation Product Name : Debugging Tools for Windows(R) Product Version : 6.8.0004.0 ModLoad: 0000000073060000 000000000003e000 C:\Program Files\BOINC\srcsrv.dll (6.8.4.0) (-exported- Symbols Loaded) Linked PDB Filename : srcsrv.pdb File Version : 6.8.0004.0 (debuggers(dbg).070519-0745) Company Name : 
Microsoft Corporation Product Name : Debugging Tools for Windows(R) Product Version : 6.8.0004.0 *** Dump of the Process Statistics: *** - I/O Operations Counters - Read: 0, Write: 0, Other 0 - I/O Transfers Counters - Read: 0, Write: 0, Other 0 - Paged Pool Usage - QuotaPagedPoolUsage: 0, QuotaPeakPagedPoolUsage: 0 QuotaNonPagedPoolUsage: 0, QuotaPeakNonPagedPoolUsage: 0 - Virtual Memory Usage - VirtualSize: 0, PeakVirtualSize: 0 - Pagefile Usage - PagefileUsage: 0, PeakPagefileUsage: 0 - Working Set Size - WorkingSetSize: 0, PeakWorkingSetSize: 0, PageFaultCount: 0 *** Dump of thread ID 7272 (state: Initialized): *** - Information - Status: Base Priority: Normal, Priority: Normal, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 0.000000 - Unhandled Exception Record - Reason: Breakpoint Encountered (0x80000003) at address 0x000007FEFD363CA2 - Registers - rax=0000000000000000 rbx=0000000000000000 rcx=000000003fe18950 rdx=000000003fe18948 rsi=0000000000000000 rdi=0000000000000000 r8=000000000225f210 r9=00000000c24468d0 r10=000000003fe18940 r11=0000000000000000 r12=0000000000000000 r13=0000000000000000 r14=0000000000000000 r15=0000000000000000 rip=00000000fd363ca2 rsp=000000000225f1d8 rbp=0000000000000000 cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246 - Callstack - ChildEBP RetAddr Args to Child 0225f1d0 3fd94cc3 3fe18950 3fe18948 0225f210 c24468d0 KERNELBASE!DebugBreak+0x0 0225f620 3fd93ec1 035950d4 00000000 00000000 00000000 wcgrid_mcm1_7!DeleteMulticlassModel+0x0 0225f880 3fd93db0 00000000 00000000 00000000 00000000 wcgrid_mcm1_7!DeleteMulticlassModel+0x0 0225f8b0 76fc652d 00000000 00000000 00000000 00000000 wcgrid_mcm1_7!DeleteMulticlassModel+0x0 0225f8e0 771fc541 00000000 00000000 00000000 00000000 kernel32!BaseThreadInitThunk+0x0 0225f930 00000000 00000000 00000000 00000000 00000000 ntdll!RtlUserThreadStart+0x0 *** Dump of thread ID 2008 (state: Initialized): *** - Information - Status: Base Priority: Normal, Priority: Normal, , Kernel 
Time: 0.000000, User Time: 0.000000, Wait Time: 0.000000 - Registers - rax=00000000000000b7 rbx=0000000000002953 rcx=0000000000000003 rdx=0000000000000000 rsi=0000000001ca7d58 rdi=00000000000000b7 r8=00000000fffde000 r9=0000000000720006 r10=000000006c706572 r11=000000000027f170 r12=0000000000000c9c r13=0000000001ca7d58 r14=0000000000000006 r15=000000000000000b rip=0000000076fd34b0 rsp=000000000027f0b8 rbp=000000000228ac80 cs=0033 ss=002b ds=0000 es=0000 fs=0000 gs=0000 efl=00000206 - Callstack - ChildEBP RetAddr Args to Child 0027f0b0 3fdac312 00000000 3fda24a3 00000002 00000008 kernel32!FlsGetValue+0x0 0027f0e0 3fdac383 00002953 3fda041b 00000008 0027f3a0 wcgrid_mcm1_7!DeleteMulticlassModel+0x0 0027f110 3fda1e95 0000000b 00000004 01ca7d90 00004c54 wcgrid_mcm1_7!DeleteMulticlassModel+0x0 0027f140 3fd0b771 01ca7d40 00000004 0228ac80 3fda24a3 wcgrid_mcm1_7!DeleteMulticlassModel+0x0 0027f250 3fd249b6 00000001 0027f2f8 0027f2b0 0027f2b4 wcgrid_mcm1_7!+0x0 0027f380 3fd2443e 01e58b00 0027f3e0 00000001 0000000b wcgrid_mcm1_7!boost::serialization::singleton<boost::archive::detail::iserializer<boost::archive::binary_iarchive,std::vector<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,std::allocator<std::basic_string<char,std::char_traits<char>,std::allocator<char> > > > > >::get_const_instance+0x0 0027f480 3fd24256 0000000f 01e55be0 00000005 00000005 wcgrid_mcm1_7!boost::serialization::singleton<boost::archive::detail::iserializer<boost::archive::binary_iarchive,std::vector<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,std::allocator<std::basic_string<char,std::char_traits<char>,std::allocator<char> > > > > >::get_const_instance+0x0 0027f540 3fd45124 3fe664a0 01e55be0 00000005 0000000c wcgrid_mcm1_7!boost::serialization::singleton<boost::archive::detail::iserializer<boost::archive::binary_iarchive,std::vector<std::basic_string<char,std::char_traits<char>,std::allocator<char> 
>,std::allocator<std::basic_string<char,std::char_traits<char>,std::allocator<char> > > > > >::get_const_instance+0x0 0027f770 3fda27cb 00000000 00000000 00000000 00000000 wcgrid_mcm1_7!boost::serialization::singleton<boost::archive::detail::iserializer<boost::archive::binary_iarchive,std::vector<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,std::allocator<std::basic_string<char,std::char_traits<char>,std::allocator<char> > > > > >::get_const_instance+0x0 0027f7b0 76fc652d 00000000 00000000 00000000 00000000 wcgrid_mcm1_7!DeleteMulticlassModel+0x0 0027f7e0 771fc541 00000000 00000000 00000000 00000000 kernel32!BaseThreadInitThunk+0x0 0027f830 00000000 00000000 00000000 00000000 00000000 ntdll!RtlUserThreadStart+0x0 *** Debug Message Dump **** *** Foreground Window Data *** Window Name : Window Class : Window Process ID: 0 Window Thread ID : 0 Exiting... </stderr_txt> ]]> ![]() [Edit 1 times, last edit by yoro42 at Nov 11, 2013 2:45:11 AM] |
||
|
|
cowtipperbs
Advanced Cruncher Joined: Aug 24, 2009 Post Count: 78 Status: Offline Project Badges:
|
Thanks for your response, Lawrence. The amount of memory should not be the problem: I'm running 16 GB, and the last time I looked only around 6 GB was in use. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
yoro42,
The properties screen shows the slot info [the Directory line below], a subdirectory of the BOINC data directory, whose path is printed in the BOINC client startup messages. This is where the stderr.txt file you are looking for lives. It contains the result log up to that point in time and can tell you what's going on with a [suspect] task.

Name MCM1_0000006_1183_0
Application mcm1 7.24
Workunit name MCM1_0000006_1183
State Running
Received 11/8/2013 8:02:42 PM
Report deadline 11/18/2013 8:02:19 PM
Estimated app speed 2.56 GFLOPs/sec
Estimated task size 49,000 GFLOPs
CPU time at last checkpoint 00:00:00
CPU time 04:37:55
Elapsed time 05:00:53
Estimated time remaining 00:00:00
Fraction done 100.000%
Virtual memory size 47.43 MB
Working set size 49.45 MB
Directory slots/12
Process ID 2332

The 'aborted by user' log shows multiple initializations of the task. A clean run without intervention would show it only once. Viewing stderr.txt, combined with anything additional in the message/event log, will help you decide what to do after observation:
A) Do a "task suspend > wait a minute > resume" cycle with LAIM (Leave application in memory when suspended) off.
B) Soft boot cycle (OS restart).
C) Send it to never-never land.
If you know all this, consider it for the benefit of other readers who don't... which is what most of my posts are meant to do :D |
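For readers who would rather not count the restarts by eye, here is a minimal Python sketch of the check described above. It assumes that each start of the science application writes one "Commandline =" line to stderr.txt, as the logs in this thread suggest. The function names and the marker constant are illustrative only and are not part of BOINC.

```python
from pathlib import Path

# Assumption: the application logs one "Commandline =" line per start,
# as seen in the stderr.txt excerpts posted in this thread.
START_MARKER = "Commandline ="


def count_starts(stderr_text: str) -> int:
    """Return how many times the task appears to have been (re)started."""
    return stderr_text.count(START_MARKER)


def read_slot_stderr(data_dir: str, slot: int) -> str:
    """Read stderr.txt from <data_dir>/slots/<slot> (hypothetical helper)."""
    path = Path(data_dir) / "slots" / str(slot) / "stderr.txt"
    return path.read_text(errors="replace")


if __name__ == "__main__":
    # Shortened sample in the shape of the log above; paths are placeholders.
    sample = (
        "Commandline = app -SettingsFile a.txt\nInitializing\nRunning\n"
        "Commandline = app -SettingsFile a.txt\nInitializing\nRunning\n"
    )
    starts = count_starts(sample)
    print(f"starts logged: {starts}")
    if starts > 1:
        print("task was restarted at least once")
```

On a real machine you would pass the text returned by read_slot_stderr, using the data directory and slot number shown in the task properties (slots/12 in the example above). A count of 1 matches a clean run; anything higher means the task was restarted.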
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I have one resent Beta 7.24 task that has been running for a very long time. I'm posting it here because it should be the same version as the current MCM1 public application.

BETA_ BETA_ 9999961_ 0096_ 2-- - In Progress 11/9/13 13:29:41 11/11/13 13:29:41 0.00 0.0 / 0.0 <<moi
BETA_ BETA_ 9999961_ 0096_ 0-- - No Reply 11/7/13 13:29:28 11/9/13 13:29:28 0.00 0.0 / 0.0
BETA_ BETA_ 9999961_ 0096_ 1-- 724 Pending Validation 11/7/13 13:29:27 11/9/13 04:16:50 38.04 407.8 / 0.0

CPU time is now 21.45 h. Progress is 73.520%, and the last checkpoint was 19 hours ago, which I think is also how long the progress bar hasn't moved. The CPU time is counting up, which was not the case in the 7.21 Beta Windows error. I'm going to let it run for now.
Specs: Linux Lubuntu 64-bit, 7.0.27 64-bit client, processor: 2 AuthenticAMD AMD Athlon(tm) 64 X2 Dual-Core Processor TK-53 [Family 15 Model 104 Stepping 1]. Other MCM1 Betas took 7-9 hours on this notebook.

Still nothing from this work unit. (info from properties+BoincTasks)
est. computation size: 60506 GFLOPs
virt. mem. size: 85.60 MB
working set size: 43.61 MB
application: 7.24 Beta Test
name: BETA_BETA_9999961_0096_2
elapsed (cpu): 02d,00:38:39 (01d,22:41:13) (is still incrementing)
last checkpoint at: 02:07:00
time since last checkpoint: [1] 01d,20:34:12
cpu%: 95,98
progress%: 73,520 (not incrementing)
deadline: 17:11:43 11-11-2013 14:29 (which means overdue)
state: Running High P.

BETA_ BETA_ 9999961_ 0096_ 3-- - In Progress 11/11/13 13:29:48 11/13/13 13:29:48 0.00 0.0 / 0.0
BETA_ BETA_ 9999961_ 0096_ 2-- - No Reply 11/9/13 13:29:41 11/11/13 13:29:41 0.00 0.0 / 0.0
BETA_ BETA_ 9999961_ 0096_ 0-- - No Reply 11/7/13 13:29:28 11/9/13 13:29:28 0.00 0.0 / 0.0
BETA_ BETA_ 9999961_ 0096_ 1-- 724 Pending Validation 11/7/13 13:29:27 11/9/13 04:16:50 38.04 407.8 / 0.0

I can't copy/paste stderr.txt from here; it shows computing passes 0 and 1. Should I still let it run like there's no tomorrow? In terms of the deadline, there already is none. |
||
|
|
armstrdj
Former World Community Grid Tech Joined: Oct 21, 2004 Post Count: 695 Status: Offline Project Badges:
|
We are looking into the issues reported in this thread and will let everyone know once we discover more. For workunits that continue to run after showing 100% complete, I would recommend letting them run as long as their CPU time keeps increasing. It is likely just an issue with how we are calculating the percent complete. You can also check the result status on the website to see whether any wingmen have completed and how much CPU time they used. If you have concerns, or the workunits are not using CPU time, you can try a suspend/resume and, as a last resort, an abort.
Thanks, armstrdj |
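As a rough illustration of the advice above, the following Python sketch compares two CPU time readings taken from the task properties a few minutes apart. The function names and the wording of the suggestions are made up for this example; only the HH:MM:SS format and the sample readings come from the properties dumps earlier in the thread.

```python
def parse_hms(text: str) -> int:
    """Convert an HH:MM:SS string from the properties screen to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def suggest_action(cpu_before: str, cpu_after: str) -> str:
    """Suggest a next step from two CPU time readings (illustrative only)."""
    if parse_hms(cpu_after) > parse_hms(cpu_before):
        # CPU time is still increasing, so the task is doing work.
        return "let it run"
    # No CPU time used between readings: escalate step by step.
    return "try suspend/resume, and abort only if that does not help"


if __name__ == "__main__":
    # Readings taken from the properties dumps posted earlier in this thread.
    print(suggest_action("04:29:59", "04:37:19"))
```

The two readings in the example are from yoro42's task, where CPU time was still increasing at 100% progress.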
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Still nothing from this work unit. (info from properties+BoincTasks)
est. computation size: 60506 GFLOPs
virt. mem. size: 85.60 MB
working set size: 43.61 MB
application: 7.24 Beta Test
name: BETA_BETA_9999961_0096_2
elapsed (cpu): 02d,00:38:39 (01d,22:41:13) (is still incrementing)
last checkpoint at: 02:07:00
time since last checkpoint: [1] 01d,20:34:12
cpu%: 95,98
progress%: 73,520 (not incrementing)
deadline: 17:11:43 11-11-2013 14:29 (which means overdue)
state: Running High P.

Update: After almost 53 hours of computation time and 2 days without any progress, the progress is now slowly moving forward from time to time. I expect this work unit to be finished within the next few hours. There is nothing new in stderr.txt, and no checkpoints have been set since then.

Edit: Completed faster than expected:
BETA_ BETA_ 9999961_ 0096_ 3-- - In Progress 11/11/13 13:29:48 11/13/13 13:29:48 0.00 0.0 / 0.0
BETA_ BETA_ 9999961_ 0096_ 2-- 724 Valid 11/9/13 13:29:41 11/11/13 20:22:00 52.73 140.1 / 273.9
BETA_ BETA_ 9999961_ 0096_ 0-- - No Reply 11/7/13 13:29:28 11/9/13 13:29:28 0.00 0.0 / 0.0
BETA_ BETA_ 9999961_ 0096_ 1-- 724 Valid 11/7/13 13:29:27 11/9/13 04:16:50 38.04 407.8 / 273.9

Result Name: BETA_ BETA_ 9999961_ 0096_ 1--
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
Commandline = ../../projects/www.worldcommunitygrid.org/wcgrid_beta17_7.24_x86_64-pc-linux-gnu -SettingsFile BETA_9999961_0096.txt -DatabaseFile dataset-GDS2771-v1.txt
Initializing
wcg_learn_limit = 750000
Running
[15:46:04]: Computing pass 0
16:31:23 (6059): No heartbeat from client for 30 sec - exiting
16:31:23 (6059): timer handler: client dead, exiting
Commandline = ../../projects/www.worldcommunitygrid.org/wcgrid_beta17_7.24_x86_64-pc-linux-gnu -SettingsFile BETA_9999961_0096.txt -DatabaseFile dataset-GDS2771-v1.txt
Initializing
wcg_learn_limit = 750000
Running
[16:32:02]: Computing pass 0
INFO: WcgLearnLimit(750000) reached.
0.0017568562425581 0.0017568562425279
[17:33:10]: Computing pass 1
Result.out = 108486.000000
Run complete, CPU time: 136929.312595
06:32:50 (6118): called boinc_finish
</stderr_txt>
]]>

Result Name: BETA_ BETA_ 9999961_ 0096_ 2--
<core_client_version>7.0.27</core_client_version>
<![CDATA[
<stderr_txt>
Commandline = ../../projects/www.worldcommunitygrid.org/wcgrid_beta17_7.24_x86_64-pc-linux-gnu -SettingsFile BETA_9999961_0096.txt -DatabaseFile dataset-GDS2771-v1.txt
Initializing
wcg_learn_limit = 750000
Running
[14:30:50]: Computing pass 0
INFO: WcgLearnLimit(750000) reached.
0.0017568562425581 0.0017568562425279
[16:48:32]: Computing pass 1
Result.out = 108486.000000
Run complete, CPU time: 189817.930879
06:32:50 (6118): called boinc_finish
</stderr_txt>
]]>
[Edit 1 times, last edit by Former Member at Nov 11, 2013 8:25:49 PM] |
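The "Run complete, CPU time" figures in these logs are in seconds, and dividing by 3600 gives the hours shown in the result status rows (38.04 and 52.73). A small Python sketch of that conversion follows. The regular expression is written against the log lines quoted above, and the function name is hypothetical.

```python
import re
from typing import Optional

# Pattern written against the "Run complete, CPU time: <seconds>" line
# as it appears in the stderr excerpts in this thread.
CPU_TIME_RE = re.compile(r"Run complete, CPU time: ([0-9]+(?:\.[0-9]+)?)")


def cpu_hours(stderr_text: str) -> Optional[float]:
    """Return the final CPU time in hours, or None if the run did not finish."""
    match = CPU_TIME_RE.search(stderr_text)
    if match is None:
        return None
    return float(match.group(1)) / 3600.0


if __name__ == "__main__":
    for line in (
        "Run complete, CPU time: 136929.312595",
        "Run complete, CPU time: 189817.930879",
    ):
        print(f"{cpu_hours(line):.2f} h")
```

A log that lacks the line, such as one from an aborted task, returns None rather than a number.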
||
|
|
OldChap
Veteran Cruncher UK Joined: Jun 5, 2009 Post Count: 978 Status: Offline Project Badges:
|
For me this is day one of running "normal" priority work, and I am not reporting a problem, rather reiterating the fact that these are a disparate bunch of WUs, some short and some very long. I have one on target for a 28-hour runtime here; that is on a 2.4 GHz E5 26xx on 64-bit Linux.
Just sayin'. |
||
|
|
NixChix
Veteran Cruncher United States Joined: Apr 29, 2007 Post Count: 1187 Status: Offline Project Badges:
|
I am still seeing discrepancies of up to 10 hours between the reported elapsed time and the actual times from the logs, but they are not as large as those I saw in the beta tests.
I am also still seeing multiple restarts logged, although this time I sometimes see a "No heartbeat from client for 30 sec" message along with them. I didn't see that in the beta.
[03:17:19]: Computing pass 0
03:28:00 (6056): No heartbeat from client for 30 sec - exiting
03:28:00 (6056): timer handler: client dead, exiting
Cheers |
||
|
|
OldChap
Veteran Cruncher UK Joined: Jun 5, 2009 Post Count: 978 Status: Offline Project Badges:
|
I have received a number of re-sends (33) with a deadline of about 21 hrs 30 mins, all on one slowish 16-thread rig.
Given that a couple that are already running will take around 12 hrs to complete and there are 17 waiting to run, it seems that at the very least 1 of these will likely time out. Sure, others will finish in 5 hours, but this seems to be cutting things a bit fine.
http://www.lakecityquietpills.com/photo/multi.../62455213143194693020.png
Suggestion: next time send them to one of my bigger rigs.
Joking aside, is the system that sends these out capable of doing the math, and is it aware of rig performance? (Maybe I am concerned when I needn't be.) Notice that the single priority WU at the bottom of the pic is alone on the bigger rig.
EDIT: less than an hour to go and....
http://www.lakecityquietpills.com/photo/multi.../86472525818184331698.png
no time to finish 2 WUs.
[Edit 1 times, last edit by OldChap at Nov 12, 2013 5:00:42 PM] |
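The "doing the math" question can be sketched roughly in Python. The simulation below hands each waiting task to whichever thread frees up first and counts how many tasks would finish after the deadline. It is only a back-of-the-envelope model: it is not how the WCG scheduler or the BOINC client actually decides anything, and every duration in the example is a hypothetical placeholder, since the real remaining times are not given in the post.

```python
import heapq


def count_late_tasks(threads, running_remaining, waiting_durations, hours_to_deadline):
    """Count tasks that would finish after the deadline (all values in hours).

    Waiting tasks are started in order on whichever thread frees up first.
    """
    if len(running_remaining) > threads:
        raise ValueError("more running tasks than threads")
    # Time at which each thread becomes free; idle threads are free at 0.
    free_at = list(running_remaining) + [0.0] * (threads - len(running_remaining))
    heapq.heapify(free_at)
    late = sum(1 for remaining in running_remaining if remaining > hours_to_deadline)
    for duration in waiting_durations:
        start = heapq.heappop(free_at)  # earliest available thread
        finish = start + duration
        if finish > hours_to_deadline:
            late += 1
        heapq.heappush(free_at, finish)
    return late


if __name__ == "__main__":
    # Hypothetical numbers loosely shaped like the post: 16 threads,
    # 2 long tasks and 14 shorter ones running, 17 tasks waiting.
    running = [12.0, 12.0] + [5.0] * 14
    waiting = [5.0] * 17
    print(count_late_tasks(16, running, waiting, hours_to_deadline=21.5))
```

Changing the placeholder durations shows how little slack there is once every thread is already busy with a long task.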
||
|
|
|