Total posts in this thread: 264
yoro42
Ace Cruncher
United States
Joined: Feb 19, 2011
Post Count: 8979
Status: Offline
Re: Mapping Cancer Markers - Problems Thread

For those feeling the urge to abort: BEFORE aborting, visit the proper slot [select the task and hit Properties, which tells you which slot], then take a copy of stderr.txt. Logs that come from an abort have lots missing and include that MS file dump, which is just the result of the abort.
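
A minimal sketch of the "copy stderr.txt first" step, in Python. It assumes the slot number is read from the Directory line of the task properties and the data directory from the BOINC startup messages. The function name `backup_stderr` and the naming of the copy are hypothetical; BOINC does not ship such a helper.

```python
import shutil
from pathlib import Path


def backup_stderr(data_dir, slot, dest_dir):
    """Copy <data_dir>/slots/<slot>/stderr.txt into dest_dir; return the copy's path.

    Hypothetical helper for illustration only, not part of BOINC.
    """
    src = Path(data_dir) / "slots" / str(slot) / "stderr.txt"
    if not src.is_file():
        raise FileNotFoundError(f"no stderr.txt found at {src}")
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / f"slot{slot}_stderr.txt"
    shutil.copy2(src, dest)  # copy2 also keeps the file's timestamps
    return dest
```

For a task in slots/12 with the data directory at C:\ProgramData\BOINC, the call would be `backup_stderr(r"C:\ProgramData\BOINC", 12, "stderr_backups")`, where the destination folder name is arbitrary.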




Computer: Bird
Project World Community Grid

Name MCM1_0000006_1183_0

Application mcm1 7.24
Workunit name MCM1_0000006_1183
State Running
Received 11/8/2013 8:02:42 PM
Report deadline 11/18/2013 8:02:19 PM
Estimated app speed 2.56 GFLOPs/sec
Estimated task size 49,000 GFLOPs
CPU time at last checkpoint 00:00:00
CPU time 04:29:59
Elapsed time 04:52:42
Estimated time remaining 00:00:00
Fraction done 100.000%
Virtual memory size 47.43 MB
Working set size 49.45 MB
Directory slots/12
Process ID 2332


World Community Grid 7.24 mcm1 MCM1_0000006_1183_0 11/8/2013 8:02:42 PM 04:57:42 (04:34:50) - Running 100.00 92.321 [0] 04:34:50 Bird 11/18/2013 8:02:19 PM 49.45 MB 47.43 MB

Computer: Bird
Project World Community Grid

Name MCM1_0000006_1183_0

Application mcm1 7.24
Workunit name MCM1_0000006_1183
State Waiting to run, Suspended by user
Received 11/8/2013 8:02:42 PM
Report deadline 11/18/2013 8:02:19 PM
Estimated app speed 2.56 GFLOPs/sec
Estimated task size 49,000 GFLOPs
CPU time at last checkpoint 00:00:00
CPU time 04:37:19
Elapsed time 05:00:16
Estimated time remaining 00:00:00
Fraction done 100.000%
Virtual memory size 47.43 MB
Working set size 49.45 MB
Directory slots/12
Process ID 2332
World Community Grid 7.24 mcm1 MCM1_0000006_1183_0 11/8/2013 8:02:42 PM 05:00:16 (04:37:19) - Waiting to run 100.00 92.358 [0] 04:37:19 Bird 11/18/2013 8:02:19 PM

Computer: Bird
Project World Community Grid

Name MCM1_0000006_1183_0

Application mcm1 7.24
Workunit name MCM1_0000006_1183
State Waiting to run
Received 11/8/2013 8:02:42 PM
Report deadline 11/18/2013 8:02:19 PM
Estimated app speed 2.56 GFLOPs/sec
Estimated task size 49,000 GFLOPs
CPU time at last checkpoint 00:00:00
CPU time 04:37:19
Elapsed time 05:00:16
Estimated time remaining 00:00:00
Fraction done 100.000%
Virtual memory size 47.43 MB
Working set size 49.45 MB
Directory slots/12
Process ID 2332

World Community Grid 7.24 mcm1 MCM1_0000006_1183_0 11/8/2013 8:02:42 PM 05:00:30 (04:37:33) - Running 100.00 92.362 [0] 04:37:33 Bird 11/18/2013 8:02:19 PM 49.45 MB 47.43 MB

Computer: Bird
Project World Community Grid

Name MCM1_0000006_1183_0

Application mcm1 7.24
Workunit name MCM1_0000006_1183
State Running
Received 11/8/2013 8:02:42 PM
Report deadline 11/18/2013 8:02:19 PM
Estimated app speed 2.56 GFLOPs/sec
Estimated task size 49,000 GFLOPs
CPU time at last checkpoint 00:00:00
CPU time 04:37:55
Elapsed time 05:00:53
Estimated time remaining 00:00:00
Fraction done 100.000%
Virtual memory size 47.43 MB
Working set size 49.45 MB
Directory slots/12
Process ID 2332

Computer: Bird
Project World Community Grid

Name MCM1_0000006_1183_0

Application mcm1 7.24
Workunit name MCM1_0000006_1183
State Aborted
Received 11/8/2013 8:02:42 PM
Report deadline 11/18/2013 8:02:19 PM
Estimated app speed 2.56 GFLOPs/sec
Estimated task size 49,000 GFLOPs
CPU time at last checkpoint 00:00:00
CPU time 04:39:38
Elapsed time 05:02:37
Estimated time remaining 00:00:00
Fraction done 100%
Virtual memory size 0.00 MB
Working set size 0.00 MB

World Community Grid 7.24 mcm1 MCM1_0000006_1183_0 11/8/2013 8:02:42 PM 05:02:37 (04:39:38) - Aborted (203) 100.00 92.408 Bird 11/18/2013 8:02:19 PM

Will provide additional info when the WCG stats update finishes...
[Edit 1 times, last edit by yoro42 at Nov 11, 2013 1:25:08 AM]
[Nov 11, 2013 1:21:44 AM]
yoro42
Ace Cruncher
United States
Joined: Feb 19, 2011
Post Count: 8979
Status: Offline

World Community Grid 7.24 mcm1 MCM1_0000006_1183_0 11/8/2013 8:02:42 PM 05:02:37 (04:39:38) - Aborted (203) 100.00 92.408 Bird 11/18/2013 8:02:19 PM

Will provide additional info when the WCG stats update finishes...


Result Log

Result Name: MCM1_ 0000006_ 1183_ 0--
<core_client_version>7.2.28</core_client_version>
<![CDATA[
<message>
aborted by user
</message>
<stderr_txt>
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.24_windows_x86_64 -SettingsFile MCM1_0000006_1183.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Initializing
wcg_learn_limit = 500000
Running
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.24_windows_x86_64 -SettingsFile MCM1_0000006_1183.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Initializing
wcg_learn_limit = 500000
Running
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.24_windows_x86_64 -SettingsFile MCM1_0000006_1183.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Initializing
wcg_learn_limit = 500000
Running
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.24_windows_x86_64 -SettingsFile MCM1_0000006_1183.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Initializing
wcg_learn_limit = 500000
Running
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.24_windows_x86_64 -SettingsFile MCM1_0000006_1183.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Initializing
wcg_learn_limit = 500000
Running


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x000007FEFD363CA2

Engaging BOINC Windows Runtime Debugger...



********************


BOINC Windows Runtime Debugger Version 7.2.4


Dump Timestamp : 11/10/13 18:05:16
Install Directory : C:\Program Files\BOINC\
Data Directory : C:\ProgramData\BOINC
Project Symstore :
Loaded Library : C:\Program Files\BOINC\\dbghelp.dll
Loaded Library : C:\Program Files\BOINC\\symsrv.dll
Loaded Library : C:\Program Files\BOINC\\srcsrv.dll
LoadLibraryA( C:\Program Files\BOINC\\version.dll ): GetLastError = 126
Loaded Library : version.dll
Debugger Engine : 4.0.5.0
Symbol Search Path: C:\ProgramData\BOINC\slots\12;C:\ProgramData\BOINC\projects\www.worldcommunitygrid.org


ModLoad: 000000003fd00000 00000000001ad000 C:\ProgramData\BOINC\projects\www.worldcommunitygrid.org\wcgrid_mcm1_7.24_windows_x86_64 (-exported- Symbols Loaded)
Linked PDB Filename : c:\projects\wcgridAustinWorkspace\scienceApps\MCM1\x64\Release\wcgrid_mcm1_prod_64.pdb

ModLoad: 00000000771d0000 00000000001a9000 C:\Windows\SYSTEM32\ntdll.dll (6.1.7601.18247) (-exported- Symbols Loaded)
Linked PDB Filename : ntdll.pdb
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 6.1.7600.16385

ModLoad: 0000000076fb0000 000000000011f000 C:\Windows\system32\kernel32.dll (6.1.7601.18229) (-exported- Symbols Loaded)
Linked PDB Filename : kernel32.pdb
File Version : 6.1.7601.18015 (win7sp1_gdr.121129-1432)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 6.1.7601.18015

ModLoad: 00000000fd330000 000000000006b000 C:\Windows\system32\KERNELBASE.dll (6.1.7601.18229) (-exported- Symbols Loaded)
Linked PDB Filename : kernelbase.pdb
File Version : 6.1.7601.18015 (win7sp1_gdr.121129-1432)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 6.1.7601.18015

ModLoad: 00000000fc0c0000 000000000000c000 C:\Windows\system32\VERSION.dll (6.1.7600.16385) (-exported- Symbols Loaded)
Linked PDB Filename : version.pdb
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 6.1.7600.16385

ModLoad: 00000000fe960000 000000000009f000 C:\Windows\system32\msvcrt.dll (7.0.7601.17744) (-exported- Symbols Loaded)
Linked PDB Filename : msvcrt.pdb
File Version : 7.0.7601.17744 (win7sp1_gdr.111215-1535)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 7.0.7601.17744

ModLoad: 0000000077390000 0000000000007000 C:\Windows\system32\PSAPI.DLL (6.1.7600.16385) (-exported- Symbols Loaded)
Linked PDB Filename : psapi.pdb
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 6.1.7600.16385

ModLoad: 00000000770d0000 00000000000fa000 C:\Windows\system32\USER32.dll (6.1.7601.17514) (-exported- Symbols Loaded)
Linked PDB Filename : user32.pdb
File Version : 6.1.7601.17514 (win7sp1_rtm.101119-1850)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 6.1.7601.17514

ModLoad: 00000000fe770000 0000000000067000 C:\Windows\system32\GDI32.dll (6.1.7601.17514) (-exported- Symbols Loaded)
Linked PDB Filename : gdi32.pdb
File Version : 6.1.7601.17514 (win7sp1_rtm.101119-1850)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 6.1.7601.17514

ModLoad: 00000000feb60000 000000000000e000 C:\Windows\system32\LPK.dll (6.1.7601.18177) (-exported- Symbols Loaded)
Linked PDB Filename : lpk.pdb
File Version : 6.1.7601.18177 (win7sp1_gdr.130605-1534)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 6.1.7601.18177

ModLoad: 00000000feb70000 00000000000c9000 C:\Windows\system32\USP10.dll (1.626.7601.18009) (-exported- Symbols Loaded)
Linked PDB Filename : usp10.pdb
File Version : 1.0626.7601.18009 (win7sp1_gdr.121121-1431)
Company Name : Microsoft Corporation
Product Name : Microsoft(R) Uniscribe Unicode script processor
Product Version : 1.0626.7601.18009

ModLoad: 00000000fe7e0000 00000000000db000 C:\Windows\system32\ADVAPI32.dll (6.1.7601.18247) (-exported- Symbols Loaded)
Linked PDB Filename : advapi32.pdb
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 6.1.7600.16385

ModLoad: 00000000fe750000 000000000001f000 C:\Windows\SYSTEM32\sechost.dll (6.1.7600.16385) (-exported- Symbols Loaded)
Linked PDB Filename : sechost.pdb
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 6.1.7600.16385

ModLoad: 00000000fec40000 000000000012d000 C:\Windows\system32\RPCRT4.dll (6.1.7601.18205) (-exported- Symbols Loaded)
Linked PDB Filename : rpcrt4.pdb
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 6.1.7600.16385

ModLoad: 00000000fd4f0000 0000000000d88000 C:\Windows\system32\SHELL32.dll (6.1.7601.18222) (-exported- Symbols Loaded)
Linked PDB Filename : shell32.pdb
File Version : 6.1.7601.17514 (win7sp1_rtm.101119-1850)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 6.1.7601.17514

ModLoad: 00000000fe6d0000 0000000000071000 C:\Windows\system32\SHLWAPI.dll (6.1.7601.17514) (-exported- Symbols Loaded)
Linked PDB Filename : shlwapi.pdb
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 6.1.7600.16385

ModLoad: 00000000fe360000 000000000002e000 C:\Windows\system32\IMM32.DLL (6.1.7600.16385) (-exported- Symbols Loaded)
Linked PDB Filename : imm32.pdb
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 6.1.7600.16385

ModLoad: 00000000fd3e0000 0000000000109000 C:\Windows\system32\MSCTF.dll (6.1.7600.16385) (-exported- Symbols Loaded)
Linked PDB Filename : msctf.pdb
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 6.1.7600.16385

ModLoad: 00000000faba0000 000000000002d000 C:\Windows\system32\ntmarta.dll (6.1.7600.16385) (-exported- Symbols Loaded)
Linked PDB Filename : ntmarta.pdb
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 6.1.7600.16385

ModLoad: 00000000fe670000 0000000000052000 C:\Windows\system32\WLDAP32.dll (6.1.7601.17514) (-exported- Symbols Loaded)
Linked PDB Filename : wldap32.pdb
File Version : 6.1.7600.16385 (win7_rtm.090713-1255)
Company Name : Microsoft Corporation
Product Name : Microsoft® Windows® Operating System
Product Version : 6.1.7600.16385

ModLoad: 0000000072580000 000000000015e000 C:\Program Files\BOINC\dbghelp.dll (6.8.4.0) (-exported- Symbols Loaded)
Linked PDB Filename : dbghelp.pdb
File Version : 6.8.0004.0 (debuggers(dbg).070519-0745)
Company Name : Microsoft Corporation
Product Name : Debugging Tools for Windows(R)
Product Version : 6.8.0004.0

ModLoad: 00000000730a0000 000000000004e000 C:\Program Files\BOINC\symsrv.dll (6.8.4.0) (-exported- Symbols Loaded)
Linked PDB Filename : symsrv.pdb
File Version : 6.8.0004.0 (debuggers(dbg).070519-0745)
Company Name : Microsoft Corporation
Product Name : Debugging Tools for Windows(R)
Product Version : 6.8.0004.0

ModLoad: 0000000073060000 000000000003e000 C:\Program Files\BOINC\srcsrv.dll (6.8.4.0) (-exported- Symbols Loaded)
Linked PDB Filename : srcsrv.pdb
File Version : 6.8.0004.0 (debuggers(dbg).070519-0745)
Company Name : Microsoft Corporation
Product Name : Debugging Tools for Windows(R)
Product Version : 6.8.0004.0



*** Dump of the Process Statistics: ***

- I/O Operations Counters -
Read: 0, Write: 0, Other 0

- I/O Transfers Counters -
Read: 0, Write: 0, Other 0

- Paged Pool Usage -
QuotaPagedPoolUsage: 0, QuotaPeakPagedPoolUsage: 0
QuotaNonPagedPoolUsage: 0, QuotaPeakNonPagedPoolUsage: 0

- Virtual Memory Usage -
VirtualSize: 0, PeakVirtualSize: 0

- Pagefile Usage -
PagefileUsage: 0, PeakPagefileUsage: 0

- Working Set Size -
WorkingSetSize: 0, PeakWorkingSetSize: 0, PageFaultCount: 0

*** Dump of thread ID 7272 (state: Initialized): ***

- Information -
Status: Base Priority: Normal, Priority: Normal, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 0.000000

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x000007FEFD363CA2

- Registers -
rax=0000000000000000 rbx=0000000000000000 rcx=000000003fe18950 rdx=000000003fe18948 rsi=0000000000000000 rdi=0000000000000000
r8=000000000225f210 r9=00000000c24468d0 r10=000000003fe18940 r11=0000000000000000 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000 rip=00000000fd363ca2 rsp=000000000225f1d8 rbp=0000000000000000
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246

- Callstack -
ChildEBP RetAddr Args to Child
0225f1d0 3fd94cc3 3fe18950 3fe18948 0225f210 c24468d0 KERNELBASE!DebugBreak+0x0
0225f620 3fd93ec1 035950d4 00000000 00000000 00000000 wcgrid_mcm1_7!DeleteMulticlassModel+0x0
0225f880 3fd93db0 00000000 00000000 00000000 00000000 wcgrid_mcm1_7!DeleteMulticlassModel+0x0
0225f8b0 76fc652d 00000000 00000000 00000000 00000000 wcgrid_mcm1_7!DeleteMulticlassModel+0x0
0225f8e0 771fc541 00000000 00000000 00000000 00000000 kernel32!BaseThreadInitThunk+0x0
0225f930 00000000 00000000 00000000 00000000 00000000 ntdll!RtlUserThreadStart+0x0

*** Dump of thread ID 2008 (state: Initialized): ***

- Information -
Status: Base Priority: Normal, Priority: Normal, , Kernel Time: 0.000000, User Time: 0.000000, Wait Time: 0.000000

- Registers -
rax=00000000000000b7 rbx=0000000000002953 rcx=0000000000000003 rdx=0000000000000000 rsi=0000000001ca7d58 rdi=00000000000000b7
r8=00000000fffde000 r9=0000000000720006 r10=000000006c706572 r11=000000000027f170 r12=0000000000000c9c r13=0000000001ca7d58
r14=0000000000000006 r15=000000000000000b rip=0000000076fd34b0 rsp=000000000027f0b8 rbp=000000000228ac80
cs=0033 ss=002b ds=0000 es=0000 fs=0000 gs=0000 efl=00000206

- Callstack -
ChildEBP RetAddr Args to Child
0027f0b0 3fdac312 00000000 3fda24a3 00000002 00000008 kernel32!FlsGetValue+0x0
0027f0e0 3fdac383 00002953 3fda041b 00000008 0027f3a0 wcgrid_mcm1_7!DeleteMulticlassModel+0x0
0027f110 3fda1e95 0000000b 00000004 01ca7d90 00004c54 wcgrid_mcm1_7!DeleteMulticlassModel+0x0
0027f140 3fd0b771 01ca7d40 00000004 0228ac80 3fda24a3 wcgrid_mcm1_7!DeleteMulticlassModel+0x0
0027f250 3fd249b6 00000001 0027f2f8 0027f2b0 0027f2b4 wcgrid_mcm1_7!+0x0
0027f380 3fd2443e 01e58b00 0027f3e0 00000001 0000000b wcgrid_mcm1_7!boost::serialization::singleton<boost::archive::detail::iserializer<boost::archive::binary_iarchive,std::vector<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,std::allocator<std::basic_string<char,std::char_traits<char>,std::allocator<char> > > > > >::get_const_instance+0x0
0027f480 3fd24256 0000000f 01e55be0 00000005 00000005 wcgrid_mcm1_7!boost::serialization::singleton<boost::archive::detail::iserializer<boost::archive::binary_iarchive,std::vector<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,std::allocator<std::basic_string<char,std::char_traits<char>,std::allocator<char> > > > > >::get_const_instance+0x0
0027f540 3fd45124 3fe664a0 01e55be0 00000005 0000000c wcgrid_mcm1_7!boost::serialization::singleton<boost::archive::detail::iserializer<boost::archive::binary_iarchive,std::vector<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,std::allocator<std::basic_string<char,std::char_traits<char>,std::allocator<char> > > > > >::get_const_instance+0x0
0027f770 3fda27cb 00000000 00000000 00000000 00000000 wcgrid_mcm1_7!boost::serialization::singleton<boost::archive::detail::iserializer<boost::archive::binary_iarchive,std::vector<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,std::allocator<std::basic_string<char,std::char_traits<char>,std::allocator<char> > > > > >::get_const_instance+0x0
0027f7b0 76fc652d 00000000 00000000 00000000 00000000 wcgrid_mcm1_7!DeleteMulticlassModel+0x0
0027f7e0 771fc541 00000000 00000000 00000000 00000000 kernel32!BaseThreadInitThunk+0x0
0027f830 00000000 00000000 00000000 00000000 00000000 ntdll!RtlUserThreadStart+0x0


*** Debug Message Dump ****


*** Foreground Window Data ***
Window Name :
Window Class :
Window Process ID: 0
Window Thread ID : 0

Exiting...

</stderr_txt>
]]>
[Edit 1 times, last edit by yoro42 at Nov 11, 2013 2:45:11 AM]
[Nov 11, 2013 2:43:54 AM]
cowtipperbs
Advanced Cruncher
Joined: Aug 24, 2009
Post Count: 78
Status: Offline

Thanks for your response, Lawrence. The amount of memory should not be the problem; I am running 16 gig. The last time I looked, it was only using around 6 gig.
[Nov 11, 2013 7:18:10 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline

yoro42,

The properties screen shows the slot info [the Directory line below], a subdirectory of the BOINC data dir, whose path is printed in the BOINC client startup messages. This is where the stderr.txt file you are looking for lives. This file contains the Result Log up to that point in time and can tell you what's going on with a [suspect] task.
Name MCM1_0000006_1183_0

Application mcm1 7.24
Workunit name MCM1_0000006_1183
State Running
Received 11/8/2013 8:02:42 PM
Report deadline 11/18/2013 8:02:19 PM
Estimated app speed 2.56 GFLOPs/sec
Estimated task size 49,000 GFLOPs
CPU time at last checkpoint 00:00:00
CPU time 04:37:55
Elapsed time 05:00:53
Estimated time remaining 00:00:00
Fraction done 100.000%
Virtual memory size 47.43 MB
Working set size 49.45 MB
Directory slots/12
Process ID 2332


The 'aborted by user' log shows multiple initializations of the task. A task that ran cleanly without intervention would show only one. Viewing stderr.txt, combined with whatever the message/event log adds, will help you decide what to do after observation:

A) Do a "task suspend > wait a minute > resume" cycle with LAIM off (Leave application in memory when suspended).
B) Soft boot cycle (OS Restart)
C) Send it to never never land

If you know all this, consider it for the benefit of other readers who don't... which is what most of my posts are meant to do :D
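
The "multiple initializations" check can be sketched in a few lines of Python. This is only an illustration: it assumes each start of the science app writes exactly one `Initializing` line to stderr.txt, as in the Result Log posted earlier, and the name `count_starts` is made up for the example.

```python
def count_starts(stderr_text):
    """Return how many times the science app logged 'Initializing'."""
    return sum(1 for line in stderr_text.splitlines()
               if line.strip() == "Initializing")


# Shortened excerpt in the format of the Result Log (two starts).
sample = """\
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.24_windows_x86_64 -SettingsFile MCM1_0000006_1183.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Initializing
wcg_learn_limit = 500000
Running
Commandline = projects/www.worldcommunitygrid.org/wcgrid_mcm1_7.24_windows_x86_64 -SettingsFile MCM1_0000006_1183.txt -DatabaseFile dataset-17_72_SDG_v1.txt
Initializing
wcg_learn_limit = 500000
Running
"""

restarts = count_starts(sample) - 1  # one start is normal; anything more is a restart
print(restarts)
```

A count above one does not say why the task restarted; the message/event log is still needed for that.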
[Nov 11, 2013 8:41:48 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline

I have one resent Beta 7.24 task that has been running for a very long time. Posting it here because it should be the same version as the current MCM1 public application.

BETA_ BETA_ 9999961_ 0096_ 2-- - In Progress 11/9/13 13:29:41 11/11/13 13:29:41 0.00 0.0 / 0.0 <<moi
BETA_ BETA_ 9999961_ 0096_ 0-- - No Reply 11/7/13 13:29:28 11/9/13 13:29:28 0.00 0.0 / 0.0
BETA_ BETA_ 9999961_ 0096_ 1-- 724 Pending Validation 11/7/13 13:29:27 11/9/13 04:16:50 38.04 407.8 / 0.0

CPU time is now 21.45 h. Progress 73.520 %, last checkpoint 19 hours ago. I think this is also how long the progress bar hasn't moved. The CPU time is counting up, which was not the case in the 7.21 Beta Windows error. I'm going to let it run for now.

Specs: Linux Lubuntu 64-bit, 7.0.27 64-bit client, processor: 2 AuthenticAMD AMD Athlon(tm) 64 X2 Dual-Core Processor TK-53 [Family 15 Model 104 Stepping 1]. Other MCM1 Betas took 7-9 hours on this notebook.


Still nothing from this work unit.
(info from properties+BoincTasks)

est. computation size: 60506 GFLOPs
virt. mem. size: 85.60 MB
working set size: 43.61 MB
application: 7.24 Beta Test
name: BETA_BETA_9999961_0096_2
elapsed (cpu): 02d,00:38:39 (01d,22:41:13) (is still incrementing)
last checkpoint at: 02:07:00
time since last checkpoint: [1] 01d,20:34:12
cpu%: 95,98
progress%: 73,520 (not incrementing)
deadline: 17:11:43 11-11-2013 14:29 (which means overdue)
state: Running High P.

BETA_ BETA_ 9999961_ 0096_ 3-- - In Progress 11/11/13 13:29:48 11/13/13 13:29:48 0.00 0.0 / 0.0
BETA_ BETA_ 9999961_ 0096_ 2-- - No Reply 11/9/13 13:29:41 11/11/13 13:29:41 0.00 0.0 / 0.0
BETA_ BETA_ 9999961_ 0096_ 0-- - No Reply 11/7/13 13:29:28 11/9/13 13:29:28 0.00 0.0 / 0.0
BETA_ BETA_ 9999961_ 0096_ 1-- 724 Pending Validation 11/7/13 13:29:27 11/9/13 04:16:50 38.04 407.8 / 0.0

Can't copy/paste stderr.txt from here; it shows computing passes 0 and 1.
Still let it run like there's no tomorrow? In terms of deadline, there already is none.
[Nov 11, 2013 2:27:52 PM]
armstrdj
Former World Community Grid Tech
Joined: Oct 21, 2004
Post Count: 695
Status: Offline

We are looking into the issues reported in this thread and will let everyone know once we discover more. For the workunits that continue to run after showing 100% complete, I would recommend letting them run as long as their CPU time keeps increasing. It is likely just an issue with how we are calculating the percent complete. You can also check the result status on the website to see whether any wingmen have completed and how much CPU time they used. If you have concerns, or the tasks are not using CPU time, you can try a suspend/resume and, finally, an abort.

Thanks,
armstrdj
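
The advice above can be written down as a small decision helper. This is a sketch under stated assumptions, not a WCG tool: it takes two CPU-time readings in the HH:MM:SS form shown in the task properties, taken some minutes apart, and the names `to_seconds` and `next_step` are invented for the example.

```python
def to_seconds(hms):
    """Convert an 'HH:MM:SS' string, as shown in the task properties, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def next_step(cpu_before, cpu_after, tried_suspend_resume=False):
    """Suggest an action from two CPU-time readings taken some minutes apart."""
    if to_seconds(cpu_after) > to_seconds(cpu_before):
        return "let it run"          # CPU time is still increasing
    if not tried_suspend_resume:
        return "suspend/resume"      # first thing to try on a stalled task
    return "abort"                   # last resort


# CPU-time readings from the task properties posted earlier in the thread.
print(next_step("04:29:59", "04:37:19"))  # prints "let it run"
```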
[Nov 11, 2013 7:31:21 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline


Still nothing from this work unit.
(info from properties+BoincTasks)

est. computation size: 60506 GFLOPs
virt. mem. size: 85.60 MB
working set size: 43.61 MB
application: 7.24 Beta Test
name: BETA_BETA_9999961_0096_2
elapsed (cpu): 02d,00:38:39 (01d,22:41:13) (is still incrementing)
last checkpoint at: 02:07:00
time since last checkpoint: [1] 01d,20:34:12
cpu%: 95,98
progress%: 73,520 (not incrementing)
deadline: 17:11:43 11-11-2013 14:29 (which means overdue)
state: Running High P.


Update: After almost 53 hours of computation time and 2 days without any progress, the progress is now slowly moving forward from time to time. I expect this work unit to be finished within the next few hours. Nothing new in stderr.txt, and no checkpoints have been set since then.

Edit: Completed faster than expected:

BETA_ BETA_ 9999961_ 0096_ 3-- - In Progress 11/11/13 13:29:48 11/13/13 13:29:48 0.00 0.0 / 0.0
BETA_ BETA_ 9999961_ 0096_ 2-- 724 Valid 11/9/13 13:29:41 11/11/13 20:22:00 52.73 140.1 / 273.9
BETA_ BETA_ 9999961_ 0096_ 0-- - No Reply 11/7/13 13:29:28 11/9/13 13:29:28 0.00 0.0 / 0.0
BETA_ BETA_ 9999961_ 0096_ 1-- 724 Valid 11/7/13 13:29:27 11/9/13 04:16:50 38.04 407.8 / 273.9


Result Name: BETA_ BETA_ 9999961_ 0096_ 1--
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
Commandline = ../../projects/www.worldcommunitygrid.org/wcgrid_beta17_7.24_x86_64-pc-linux-gnu -SettingsFile BETA_9999961_0096.txt -DatabaseFile dataset-GDS2771-v1.txt
Initializing
wcg_learn_limit = 750000
Running
[15:46:04]: Computing pass 0
16:31:23 (6059): No heartbeat from client for 30 sec - exiting
16:31:23 (6059): timer handler: client dead, exiting
Commandline = ../../projects/www.worldcommunitygrid.org/wcgrid_beta17_7.24_x86_64-pc-linux-gnu -SettingsFile BETA_9999961_0096.txt -DatabaseFile dataset-GDS2771-v1.txt
Initializing
wcg_learn_limit = 750000
Running
[16:32:02]: Computing pass 0
INFO: WcgLearnLimit(750000) reached. 0.0017568562425581 0.0017568562425279
[17:33:10]: Computing pass 1
Result.out = 108486.000000
Run complete, CPU time: 136929.312595
06:32:50 (6118): called boinc_finish

</stderr_txt>
]]>

Result Name: BETA_ BETA_ 9999961_ 0096_ 2--
<core_client_version>7.0.27</core_client_version>
<![CDATA[
<stderr_txt>
Commandline = ../../projects/www.worldcommunitygrid.org/wcgrid_beta17_7.24_x86_64-pc-linux-gnu -SettingsFile BETA_9999961_0096.txt -DatabaseFile dataset-GDS2771-v1.txt
Initializing
wcg_learn_limit = 750000
Running
[14:30:50]: Computing pass 0
INFO: WcgLearnLimit(750000) reached. 0.0017568562425581 0.0017568562425279
[16:48:32]: Computing pass 1
Result.out = 108486.000000
Run complete, CPU time: 189817.930879
21:21:03 (23395): called boinc_finish

</stderr_txt>
]]>
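
For comparing the two logs, the "Run complete, CPU time" figure can be converted to hours. The sketch below assumes the figure is in seconds, which fits the CPU-hours column of the result list above (38.04 and 52.73); the function name `cpu_hours` is made up for the example.

```python
import re


def cpu_hours(stderr_text):
    """Return the CPU time from a 'Run complete, CPU time: <seconds>' line, in hours."""
    match = re.search(r"Run complete, CPU time: ([0-9.]+)", stderr_text)
    if match is None:
        return None  # task has not finished, or the line is missing
    return float(match.group(1)) / 3600.0


for name, line in [
    ("_1", "Run complete, CPU time: 136929.312595"),
    ("_2", "Run complete, CPU time: 189817.930879"),
]:
    print(name, round(cpu_hours(line), 2))  # _1 38.04, then _2 52.73
```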
[Edit 1 times, last edit by Former Member at Nov 11, 2013 8:25:49 PM]
[Nov 11, 2013 8:05:54 PM]
OldChap
Veteran Cruncher
UK
Joined: Jun 5, 2009
Post Count: 978
Status: Offline

For me this is day one of running "normal" priority work, and I am not reporting a problem, rather reiterating the fact that these are a disparate bunch of wu's, some short and some very long. I have one on target for 28 hours runtime here. That is on a 2.4 GHz E5 26xx on Linux 64.

Just sayin'.
[Nov 11, 2013 8:47:08 PM]
NixChix
Veteran Cruncher
United States
Joined: Apr 29, 2007
Post Count: 1187
Status: Offline

I am still seeing discrepancies between the reported elapsed time and the actual times from the logs, up to 10 hours, but not as large as I saw in the beta tests.

I am also still seeing multiple restarts logged, although this time I sometimes see a "No heartbeat from client for 30 sec" message along with them. I didn't see that in the beta.
[03:17:19]: Computing pass 0
03:28:00 (6056): No heartbeat from client for 30 sec - exiting
03:28:00 (6056): timer handler: client dead, exiting

Cheers
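
To compare the logged times with the reported elapsed time, the gap between two log timestamps can be computed as below. This is only a sketch: stderr.txt carries the time of day but no date, so it assumes less than 24 hours passed between the two lines, and `seconds_between` is a name made up for the example.

```python
def seconds_between(start, end):
    """Seconds from 'HH:MM:SS' start to end, assuming less than 24 h elapsed."""
    def to_seconds(hms):
        h, m, s = (int(part) for part in hms.split(":"))
        return h * 3600 + m * 60 + s

    # The modulo handles a run that crosses midnight once.
    return (to_seconds(end) - to_seconds(start)) % 86400


# From the excerpt above: pass 0 started 03:17:19, heartbeat lost 03:28:00.
print(seconds_between("03:17:19", "03:28:00"))  # 641 seconds, i.e. 10 min 41 s
```

Summing these gaps per start gives a wall-clock figure to set against the elapsed time, but only for runs shorter than a day per segment.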
[Nov 11, 2013 9:53:24 PM]
OldChap
Veteran Cruncher
UK
Joined: Jun 5, 2009
Post Count: 978
Status: Offline

I have received a number of re-sends (33) with a deadline of about 21 hrs 30 mins, all on one slowish 16-thread rig.

In view of the fact that a couple that are already running will take around 12 hrs to complete and there are 17 waiting to run, it seems that at the very least 1 of these will likely time out. Sure, others will finish in 5 hours, but this seems to be cutting things a bit fine.

http://www.lakecityquietpills.com/photo/multi.../62455213143194693020.png

Suggestion: next time send them to one of my bigger rigs :)

Joking aside, is the system that sends these out capable of doing the math, and is it aware of rig performance? (Maybe I am concerned when I needn't be.)

Notice that the single priority wu at the bottom of the pic is alone on the bigger rig.

EDIT: less than an hour to go and.... http://www.lakecityquietpills.com/photo/multi.../86472525818184331698.png no time to finish 2 wu's :(
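
"Doing the math" for this case can be sketched as follows. It is a rough model, not how the WCG scheduler works: it assumes all 16 threads are free for these tasks from the moment they arrive and that every task takes the same time, neither of which held here. The function names are made up for the example.

```python
import math


def tasks_finishable(threads, deadline_hours, task_hours):
    """Tasks that complete before the deadline if every task takes task_hours."""
    return threads * math.floor(deadline_hours / task_hours)


def max_uniform_runtime(tasks, threads, deadline_hours):
    """Longest per-task runtime that still lets all tasks finish in time."""
    rounds = math.ceil(tasks / threads)  # tasks some thread must run back to back
    return deadline_hours / rounds


print(tasks_finishable(16, 21.5, 12))               # 16, fewer than the 33 received
print(tasks_finishable(16, 21.5, 5))                # 64
print(round(max_uniform_runtime(33, 16, 21.5), 2))  # 7.17 hours
```

Under these assumptions the batch only fits if tasks average about 7 hours or less, so a mix of 5-hour and 12-hour tasks is indeed cutting it fine.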
[Edit 1 times, last edit by OldChap at Nov 12, 2013 5:00:42 PM]
[Nov 11, 2013 11:31:35 PM]