Thread Status: Active
Total posts in this thread: 9
This topic has been viewed 992 times and has 8 replies
smiley7804
Cruncher
Joined: Dec 8, 2012
Post Count: 5
Status: Offline
Project file crunching restarts after X%

What is wrong when the crunching restarts after X% done?
I never restart the PC or shut it down. As far as I know, I have everything set so that it works all the time, 24/7.
[Oct 24, 2013 10:54:43 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Project file crunching restarts after X%

Hello smiley7804,
Please give us some information to look at. Paste the first 50 or so lines from your Messages after a boot and tell us which project causes problems on your system. How long does it take before it drops to 0%? Do you have Leave Application In Memory set to ON, so that the project does not drop back to the last checkpoint if preempted by another program?

Lawrence
[Oct 25, 2013 1:43:15 AM]
smiley7804
Cruncher
Joined: Dec 8, 2012
Post Count: 5
Status: Offline
Re: Project file crunching restarts after X%

I get an error when I try to reply. Yes, I have "Leave application in memory" on.

2013-10-25 10:10:37 | World Community Grid | Computation for task FAHV_x4DQG_disulfide_B_0156657_0005_0 finished
2013-10-25 10:10:37 | World Community Grid | Resuming task E216450_722_I.40.C32H18N4O4.00080276.4.set1d06_0 using cep2 version 640 in slot 5
2013-10-25 10:10:40 | World Community Grid | Started upload of FAHV_x4DQG_disulfide_B_0156657_0005_0_0
2013-10-25 10:10:44 | World Community Grid | Finished upload of FAHV_x4DQG_disulfide_B_0156657_0005_0_0
2013-10-25 10:10:47 | World Community Grid | Sending scheduler request: To report completed tasks.
2013-10-25 10:10:47 | World Community Grid | Reporting 1 completed tasks
2013-10-25 10:10:47 | World Community Grid | Not requesting tasks: "no new tasks" requested via Manager
2013-10-25 10:11:02 | World Community Grid | Scheduler request completed
2013-10-25 10:12:17 | World Community Grid | task E216450_722_I.40.C32H18N4O4.00080276.4.set1d06_0 suspended by user
2013-10-25 10:12:42 | World Community Grid | task E216450_722_I.40.C32H18N4O4.00080276.4.set1d06_0 resumed by user
2013-10-25 10:12:43 | World Community Grid | Resuming task E216450_722_I.40.C32H18N4O4.00080276.4.set1d06_0 using cep2 version 640 in slot 5
2013-10-25 10:12:45 | World Community Grid | task E216469_928_I.43.C29F6H10N6OS.00414379.0.set1d06_0 suspended by user
2013-10-25 10:14:34 | World Community Grid | task E216456_452_I.43.C28F10H10N4O.00122543.4.set1d06_0 suspended by user
2013-10-25 10:17:24 | World Community Grid | task E216468_485_I.42.C32H12N8S2.00094642.2.set1d06_0 suspended by user
2013-10-25 10:21:27 | World Community Grid | task E216468_485_I.42.C32H12N8S2.00094642.2.set1d06_0 resumed by user
2013-10-25 10:21:27 | World Community Grid | Resuming task E216468_485_I.42.C32H12N8S2.00094642.2.set1d06_0 using cep2 version 640 in slot 2
2013-10-25 10:22:05 | World Community Grid | task E216456_452_I.43.C28F10H10N4O.00122543.4.set1d06_0 resumed by user
2013-10-25 10:22:07 | World Community Grid | Resuming task E216456_452_I.43.C28F10H10N4O.00122543.4.set1d06_0 using cep2 version 640 in slot 0
2013-10-25 10:23:06 | World Community Grid | task E216469_928_I.43.C29F6H10N6OS.00414379.0.set1d06_0 resumed by user
2013-10-25 10:23:07 | World Community Grid | Resuming task E216469_928_I.43.C29F6H10N6OS.00414379.0.set1d06_0 using cep2 version 640 in slot 7
2013-10-25 10:24:45 | World Community Grid | task E216469_928_I.43.C29F6H10N6OS.00414379.0.set1d06_0 suspended by user
2013-10-25 10:30:45 | World Community Grid | Computation for task E216456_452_I.43.C28F10H10N4O.00122543.4.set1d06_0 finished
2013-10-25 10:31:15 | World Community Grid | Started upload of E216456_452_I.43.C28F10H10N4O.00122543.4.set1d06_0_0
2013-10-25 10:31:15 | World Community Grid | Started upload of E216456_452_I.43.C28F10H10N4O.00122543.4.set1d06_0_1
2013-10-25 10:31:17 | World Community Grid | Finished upload of E216456_452_I.43.C28F10H10N4O.00122543.4.set1d06_0_0
2013-10-25 10:31:17 | World Community Grid | Started upload of E216456_452_I.43.C28F10H10N4O.00122543.4.set1d06_0_2
2013-10-25 10:31:21 | World Community Grid | Finished upload of E216456_452_I.43.C28F10H10N4O.00122543.4.set1d06_0_1
2013-10-25 10:31:21 | World Community Grid | Started upload of E216456_452_I.43.C28F10H10N4O.00122543.4.set1d06_0_3
2013-10-25 10:31:22 | World Community Grid | Finished upload of E216456_452_I.43.C28F10H10N4O.00122543.4.set1d06_0_2
2013-10-25 10:31:22 | World Community Grid | Finished upload of E216456_452_I.43.C28F10H10N4O.00122543.4.set1d06_0_3
2013-10-25 10:31:22 | World Community Grid | Started upload of E216456_452_I.43.C28F10H10N4O.00122543.4.set1d06_0_4
2013-10-25 10:34:21 | World Community Grid | task E216469_928_I.43.C29F6H10N6OS.00414379.0.set1d06_0 resumed by user
2013-10-25 10:34:21 | World Community Grid | Resuming task E216469_928_I.43.C29F6H10N6OS.00414379.0.set1d06_0 using cep2 version 640 in slot 7
2013-10-25 10:55:15 | World Community Grid | Task E216468_480_I.42.C28F4H12N4O4S2.00041217.0.set1d06_0 exited with zero status but no 'finished' file
2013-10-25 10:55:15 | World Community Grid | If this happens repeatedly you may need to reset the project.
2013-10-25 10:55:15 | World Community Grid | Task E216468_485_I.42.C32H12N8S2.00094642.2.set1d06_0 exited with zero status but no 'finished' file
2013-10-25 10:55:15 | World Community Grid | If this happens repeatedly you may need to reset the project.
2013-10-25 10:55:15 | World Community Grid | Task E216469_933_I.43.C30F4H10N8S.00349861.1.set1d06_0 exited with zero status but no 'finished' file
2013-10-25 10:55:15 | World Community Grid | If this happens repeatedly you may need to reset the project.
2013-10-25 10:55:15 | World Community Grid | Task E216469_928_I.43.C29F6H10N6OS.00414379.0.set1d06_0 exited with zero status but no 'finished' file
2013-10-25 10:55:15 | World Community Grid | If this happens repeatedly you may need to reset the project.
2013-10-25 10:55:15 | World Community Grid | Computation for task E216456_039_I.43.C26F6H10N6O5.00347137.2.set1d06_0 finished
2013-10-25 10:55:50 | World Community Grid | Task E216450_722_I.40.C32H18N4O4.00080276.4.set1d06_0 exited with zero status but no 'finished' file
2013-10-25 10:55:50 | World Community Grid | If this happens repeatedly you may need to reset the project.
2013-10-25 10:55:50 | World Community Grid | Restarting task E216450_722_I.40.C32H18N4O4.00080276.4.set1d06_0 using cep2 version 640 in slot 5
2013-10-25 10:55:50 | World Community Grid | Restarting task E216468_480_I.42.C28F4H12N4O4S2.00041217.0.set1d06_0 using cep2 version 640 in slot 6
2013-10-25 10:55:50 | World Community Grid | Restarting task E216468_485_I.42.C32H12N8S2.00094642.2.set1d06_0 using cep2 version 640 in slot 2
2013-10-25 10:55:50 | World Community Grid | Restarting task E216469_933_I.43.C30F4H10N8S.00349861.1.set1d06_0 using cep2 version 640 in slot 4
2013-10-25 10:55:50 | World Community Grid | Restarting task E216469_928_I.43.C29F6H10N6OS.00414379.0.set1d06_0 using cep2 version 640 in slot 7
2013-10-25 10:55:52 | World Community Grid | Started upload of E216456_039_I.43.C26F6H10N6O5.00347137.2.set1d06_0_0
2013-10-25 10:55:54 | World Community Grid | Finished upload of E216456_039_I.43.C26F6H10N6O5.00347137.2.set1d06_0_0
2013-10-25 10:55:54 | World Community Grid | Started upload of E216456_039_I.43.C26F6H10N6O5.00347137.2.set1d06_0_1
2013-10-25 10:55:59 | World Community Grid | Finished upload of E216456_039_I.43.C26F6H10N6O5.00347137.2.set1d06_0_1
2013-10-25 10:55:59 | World Community Grid | Started upload of E216456_039_I.43.C26F6H10N6O5.00347137.2.set1d06_0_2
2013-10-25 10:56:06 | World Community Grid | Finished upload of E216456_039_I.43.C26F6H10N6O5.00347137.2.set1d06_0_2
2013-10-25 10:56:06 | World Community Grid | Started upload of E216456_039_I.43.C26F6H10N6O5.00347137.2.set1d06_0_3
2013-10-25 10:56:07 | World Community Grid | Finished upload of E216456_039_I.43.C26F6H10N6O5.00347137.2.set1d06_0_3
2013-10-25 10:56:07 | World Community Grid | Started upload of E216456_039_I.43.C26F6H10N6O5.00347137.2.set1d06_0_4
2013-10-25 10:57:00 | World Community Grid | Finished upload of E216456_452_I.43.C28F10H10N4O.00122543.4.set1d06_0_4
2013-10-25 10:57:54 | World Community Grid | Sending scheduler request: To report completed tasks.
2013-10-25 10:57:54 | World Community Grid | Reporting 1 completed tasks
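A long message log like the one above is easier to read once the restart warnings are tallied per task. The following is only a minimal sketch, assuming the messages have been copied out as plain text lines in the `date time | project | message` layout shown here; the names `PATTERN`, `SAMPLE`, and `count_unfinished_exits` are made up for this illustration and are not part of BOINC.

```python
import re
from collections import Counter

# Matches message lines of the form seen in this thread:
# "<date> <time> | <project> | Task <name> exited with zero status but no 'finished' file"
PATTERN = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \| (?P<project>[^|]*) \| "
    r"Task (?P<task>\S+) exited with zero status but no 'finished' file"
)


def count_unfinished_exits(lines):
    """Return a Counter mapping task name -> number of 'no finished file' exits."""
    counts = Counter()
    for line in lines:
        match = PATTERN.match(line.strip())
        if match:
            counts[match.group("task")] += 1
    return counts


# A few lines copied from the log above (hypothetical variable, for illustration).
SAMPLE = [
    "2013-10-25 10:55:15 | World Community Grid | Task E216468_480_I.42.C28F4H12N4O4S2.00041217.0.set1d06_0 exited with zero status but no 'finished' file",
    "2013-10-25 10:55:15 | World Community Grid | If this happens repeatedly you may need to reset the project.",
    "2013-10-25 10:55:15 | World Community Grid | Task E216468_485_I.42.C32H12N8S2.00094642.2.set1d06_0 exited with zero status but no 'finished' file",
    "2013-10-25 10:55:50 | World Community Grid | Task E216450_722_I.40.C32H18N4O4.00080276.4.set1d06_0 exited with zero status but no 'finished' file",
]

if __name__ == "__main__":
    for task, n in count_unfinished_exits(SAMPLE).most_common():
        print(f"{n}  {task}")
```

Running it over a whole day of messages would show whether the same tasks keep restarting or whether every running CEP2 task is hit at once, as at 10:55:15 above.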
[Oct 25, 2013 10:02:27 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Project file crunching restarts after X%

We need the log from right at the beginning, please, not after it has been running for some time :D
[Oct 25, 2013 10:05:14 AM]
smiley7804
Cruncher
Joined: Dec 8, 2012
Post Count: 5
Status: Offline
Re: Project file crunching restarts after X%

2013-10-23 21:58:02 | | No config file found - using defaults
2013-10-23 21:58:02 | | Starting BOINC client version 7.0.64 for windows_intelx86
2013-10-23 21:58:02 | | log flags: file_xfer, sched_ops, task
2013-10-23 21:58:02 | | Libraries: libcurl/7.25.0 OpenSSL/1.0.1 zlib/1.2.6
2013-10-23 21:58:02 | | Data directory: H:\ProgramData\BOINC
2013-10-23 21:58:02 | | Running under account Mikael
2013-10-23 21:58:02 | | Processor: 8 GenuineIntel Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz [Family 6 Model 58 Stepping 9]
2013-10-23 21:58:02 | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes nx lm vmx tm2 pbe
2013-10-23 21:58:02 | | OS: Microsoft Windows 8: Professional x86 Edition, (06.02.9200.00)
2013-10-23 21:58:02 | | Memory: 3.23 GB physical, 4.98 GB virtual
2013-10-23 21:58:02 | | Disk: 232.88 GB total, 165.52 GB free
2013-10-23 21:58:02 | | Local time is UTC +2 hours
2013-10-23 21:58:02 | | No usable GPUs found
2013-10-23 21:58:02 | Docking | URL http://docking.cis.udel.edu/; Computer ID 129511; resource share 100
2013-10-23 21:58:02 | MindModeling@Beta | URL http://mindmodeling.org/; Computer ID 34086; resource share 100
2013-10-23 21:58:02 | World Community Grid | URL http://www.worldcommunitygrid.org/; Computer ID 2287124; resource share 100
2013-10-23 21:58:02 | World Community Grid | General prefs: from World Community Grid (last modified 27-Sep-2013 00:08:21)
2013-10-23 21:58:02 | World Community Grid | Host location: none
2013-10-23 21:58:02 | World Community Grid | General prefs: using your defaults
2013-10-23 21:58:02 | | Reading preferences override file
2013-10-23 21:58:02 | | Preferences:
2013-10-23 21:58:02 | | max memory usage when active: 1651.69MB
2013-10-23 21:58:02 | | max memory usage when idle: 2973.04MB
2013-10-23 21:58:02 | | max disk usage: 169.87GB
2013-10-23 21:58:02 | | max CPUs used: 6
2013-10-23 21:58:02 | | don't use GPU while active
2013-10-23 21:58:02 | | (to change preferences, visit a project web site or select Preferences in the Manager)
2013-10-23 21:58:02 | | Not using a proxy
2013-10-23 21:58:25 | World Community Grid | Restarting task E216421_758_I.39.C34H19N3OS.00311736.4.set1d06_1 using cep2 version 640 in slot 2
2013-10-23 21:58:25 | World Community Grid | Restarting task E216451_154_I.40.C33H18N2O5.00415570.2.set1d06_0 using cep2 version 640 in slot 3
2013-10-23 21:58:25 | World Community Grid | Restarting task E216449_441_I.40.C34F3H18NO2.00068602.4.set1d06_0 using cep2 version 640 in slot 6
2013-10-23 21:58:25 | World Community Grid | Restarting task E216450_498_I.40.C34H18N6.00096716.0.set1d06_0 using cep2 version 640 in slot 0
2013-10-23 21:58:25 | World Community Grid | Restarting task E216451_864_I.39.C36H21NO2.00079040.4.set1d06_0 using cep2 version 640 in slot 1
2013-10-23 21:58:25 | World Community Grid | Restarting task E216450_722_I.40.C32H18N4O4.00080276.4.set1d06_0 using cep2 version 640 in slot 5
2013-10-23 21:59:52 | World Community Grid | task E216421_758_I.39.C34H19N3OS.00311736.4.set1d06_1 suspended by user
2013-10-23 21:59:53 | World Community Grid | Starting task E216449_058_I.40.C36H18N2O2.00212874.1.set1d06_0 using cep2 version 640 in slot 4
2013-10-23 22:01:33 | World Community Grid | General prefs: from World Community Grid (last modified 27-Sep-2013 00:08:21)
2013-10-23 22:01:33 | World Community Grid | Host location: none
2013-10-23 22:01:33 | World Community Grid | General prefs: using your defaults
2013-10-23 22:01:33 | | Reading preferences override file
2013-10-23 22:01:33 | | Preferences:
2013-10-23 22:01:33 | | max memory usage when active: 1651.69MB
2013-10-23 22:01:33 | | max memory usage when idle: 2973.04MB
2013-10-23 22:01:33 | | max disk usage: 170.21GB
2013-10-23 22:01:33 | | max CPUs used: 6
2013-10-23 22:01:33 | | don't use GPU while active
2013-10-23 22:01:33 | | (to change preferences, visit a project web site or select Preferences in the Manager)
2013-10-23 22:02:42 | World Community Grid | Task E216451_154_I.40.C33H18N2O5.00415570.2.set1d06_0 exited with zero status but no 'finished' file
2013-10-23 22:02:42 | World Community Grid | If this happens repeatedly you may need to reset the project.
[Oct 25, 2013 10:08:37 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Project file crunching restarts after X%

In a nutshell, I think you've got too many CEP2 tasks running concurrently in too little memory. In this situation your system is likely to encounter heavier disk I/O and memory-to-disk swapping, which is when timeouts occur and the restarts show their ugly face.

Since you run with 7.0.64 you can use the app_config.xml feature's <max_concurrent>n</max_concurrent> tag [see the cc_config.xml wiki on how to configure it, or search our forums for examples]. Experiment, but I suggest you start by limiting it to 3, i.e.

<app_config>
  <app>
    <name>cep2</name>
    <max_concurrent>3</max_concurrent>
  </app>
</app_config>
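To put rough numbers on the memory point, here is a back-of-the-envelope sketch using only the figures from the startup log posted earlier in this thread ("max memory usage when active: 1651.69MB", "max CPUs used: 6"). It assumes the budget is split evenly between running tasks, which is a simplification; the thread does not state how much memory a CEP2 task actually needs, and the function name is made up for this illustration.

```python
# Figures taken from the startup log posted earlier in this thread.
MAX_MEMORY_ACTIVE_MB = 1651.69  # "max memory usage when active"
MAX_CPUS_USED = 6               # "max CPUs used"


def memory_per_task_mb(budget_mb, concurrent_tasks):
    """Evenly divide a memory budget across concurrently running tasks.

    Assumption: an even split. Real tasks will not share memory this neatly.
    """
    if concurrent_tasks < 1:
        raise ValueError("concurrent_tasks must be at least 1")
    return budget_mb / concurrent_tasks


if __name__ == "__main__":
    # Compare the current setting (6 tasks) with the suggested limit of 3.
    for n in (MAX_CPUS_USED, 3):
        share = memory_per_task_mb(MAX_MEMORY_ACTIVE_MB, n)
        print(f"{n} tasks -> {share:.0f} MB each")
```

With six tasks each gets roughly 275 MB of the active budget; with three, roughly 551 MB. Whether that is enough is something to find by experiment, as suggested above.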
[Oct 25, 2013 11:22:35 AM]
homeslice
Cruncher
USA
Joined: Apr 27, 2007
Post Count: 12
Status: Offline
Re: Project file crunching restarts after X%

I get those errors with CEP2 when I have hyper-threading on.
Turning it off eliminates the errors and my statistics for those machines go way up.
[Oct 25, 2013 2:32:41 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Project file crunching restarts after X%

Hmmm, well, using the example app_config with 4 as <max_concurrent>, or however many 'hardware' cores there are, is very likely going to achieve the same [50% of total is recommended somewhere as a CEP2 limit]. The OS and CPU collaborate to reallocate any spare cycles not used by idling [HT] threads to the CEP2 jobs, while retaining full OS capability when using the system. Or you could run something light on the side [that is, don't choose something integer-intense such as prime-seeking].
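As a small illustration of the "50% of total" rule of thumb, the sketch below derives a `<max_concurrent>` value from the logical CPU count and prints an app_config.xml in the same shape as the example earlier in the thread. It assumes hyper-threading is on, so that half the logical CPUs equals the number of hardware cores; the function names are hypothetical and the script does not write any file.

```python
import os


def half_of_logical_cpus(logical_cpus=None):
    """Return 50% of the logical CPU count, but never less than 1."""
    if logical_cpus is None:
        # os.cpu_count() reports logical CPUs and may return None.
        logical_cpus = os.cpu_count() or 1
    return max(1, logical_cpus // 2)


def render_app_config(app_name, max_concurrent):
    """Build app_config.xml text in the shape shown earlier in the thread."""
    return (
        "<app_config>\n"
        "  <app>\n"
        f"    <name>{app_name}</name>\n"
        f"    <max_concurrent>{max_concurrent}</max_concurrent>\n"
        "  </app>\n"
        "</app_config>\n"
    )


if __name__ == "__main__":
    # Print the config for this machine; review it before using it.
    print(render_app_config("cep2", half_of_logical_cpus()))
```

On an 8-thread i7 like the one in the startup log, this would print a limit of 4.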

ATM my octo is limited to 50% of processors and runs CEP2 only, which is the easiest way to quickly flip and test. The tasks now finish properly, completing all possible jobs, and no longer get cut off at the 12-hour mark. That mostly lifts the credit, particularly when matched with a wingman that does walk into the half-day wall.

Because the fail rate is higher, for instance from these jobs that exceed the disk-space limit, most jobs have a wingman [not very good for CEP2 project efficiency!]. I had one such failure the day before [see other thread], so at least the next 25-35 tasks will have a wingman, until the device has a run of at least 20 consecutive valid CEP2 results [many are now in Pending Verification]. Given the lag between returning results and the wingman reporting and validating, it could turn out that even more have to be computed with a wingman before the 20 consecutive valids are reached. Only by limiting the cache can you more or less minimize the number: the more you cache, the more tasks will have a pre-assigned wingman as the ferris wheel goes. I'm resigned to this and not letting this methodology frustrate me, though since I'm writing about it, I'm clearly not tranquil on the matter :|
[Oct 25, 2013 3:13:12 PM]
pramo
Veteran Cruncher
USA
Joined: Dec 14, 2005
Post Count: 704
Status: Offline
Re: Project file crunching restarts after X%

Haven't been paying close enough attention, but looking back at my totals I lost a bunch of efficiency...
I saw a bunch of restarts on boxes that until a week or so ago had been happily running 24 CEPs at a time (12 cores, HT). I've since set them to run 16 CEPs instead of unlimited and will see what happens.
Just so the CPUs aren't bored, I upped my cache from .1 to 1 day to get a bunch of HFAAH, then went back to .1, suspended the extra CEPs, and will resume them as appropriate to let them all finish. No restarts in the last 8 hours, so maybe this will work.

Edited to add: tried resetting the project, to no avail, so went with the above :)
[Edit 1 times, last edit by pramo at Oct 27, 2013 1:03:51 AM]
[Oct 27, 2013 12:55:19 AM]