| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 63
|
|
| Author |
|
|
Curly2
Cruncher Joined: Nov 24, 2008 Post Count: 10 Status: Offline Project Badges:
|
Question: Running at 100% CPU time? What OS? What Client version?
Core 2 Duo with 2 GB RAM running Win XP 32-bit; BOINC version is 6.10.18. The task with high memory usage was down to 0% CPU usage after ~2 min of runtime. Since restarting BOINC, everything has been running fine with a little babysitting (at times a load of ~2.2 GB RAM on the host). Now the task uses up to 600 MB RAM (2:20 hrs runtime). Edit: Running at 100% CPU time (set in the device profile). [Edit 1 times, last edit by Curly2 at Oct 31, 2011 1:50:06 PM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Have the scientists behind this project just released a second workunit type? The reason I'm asking is that when I look at the processes I currently have running, I seem to have 2 different DSFL workunit types. There seems to be the standard wcg_dsfl_6.19_indows_intelx86 workunit, as well as a wcg_dsfl_vina_6.19_windows_interx86 workunit which I can't remember having seen before. And it is the latter workunit that is hogging the system memory, currently 800 MB on the machine that I am typing this on. Mine is target 50 batch 08.
The science app has *always* been 2 parts, a stager or controller and a worker. One uses almost no CPU time, a few seconds during the job; the other does it all. CEP2 has the same structure with 3 or 4 parts, depending on platform. --//-- |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Question: Running at 100% CPU time? What OS? What Client version? Core 2 Duo with 2 GB RAM running Win XP 32-bit; BOINC version is 6.10.18. The task with high memory usage was down to 0% CPU usage after ~2 min of runtime. Since restarting BOINC, everything has been running fine with a little babysitting (at times a load of ~2.2 GB RAM on the host). Now the task uses up to 600 MB RAM (2:20 hrs runtime).
2 out of 3 answered. 1) Question: Is BOINC running at 100% CPU time (or some lower percentage set in the device profile or local prefs), or is *Run Always* selected in the activity menu of BOINC? A stuck WU (when BOINC is not set to 100%) is an issue we know about and is different from high memory use. A task could also have been paused by BOINC if it used more memory than the limit specified for idle or in-use operation. --//-- [Edit 2 times, last edit by Former Member at Oct 31, 2011 1:17:48 PM] |
||
|
|
BSD
Senior Cruncher Joined: Apr 27, 2011 Post Count: 224 Status: Offline |
OS: Win 7 x64
RAM: 4 GB DDR3
CPU: Intel(R) Core(TM) i7-2720QM CPU @ 2.20GHz - HT on
Projects selected: CEP2 max 3 WU, DSFL
Client: 6.12.34 for windows_x86_64

This device only picked up DSFL batch 50 this morning and started running slowly, very unresponsive. These WUs restart with typical memory size, then start to go to ~1 GB, device physical memory goes to 100%, then the WUs restart over again. I think I'll suspend about half of the WUs until they get processed; I've deselected this project in the meantime. Here's the last part of my log:

10/31/2011 9:05:26 AM | World Community Grid | Task DSFL_00000050_0000024_0260_1 exited with zero status but no 'finished' file
10/31/2011 9:05:26 AM | World Community Grid | If this happens repeatedly you may need to reset the project.
10/31/2011 9:05:26 AM | World Community Grid | Task DSFL_00000050_0000024_0503_0 exited with zero status but no 'finished' file
10/31/2011 9:05:26 AM | World Community Grid | If this happens repeatedly you may need to reset the project.
10/31/2011 9:05:26 AM | World Community Grid | Task DSFL_00000050_0000024_0792_0 exited with zero status but no 'finished' file
10/31/2011 9:05:26 AM | World Community Grid | If this happens repeatedly you may need to reset the project.
10/31/2011 9:05:26 AM | World Community Grid | Task DSFL_00000050_0000024_0977_0 exited with zero status but no 'finished' file
10/31/2011 9:05:26 AM | World Community Grid | If this happens repeatedly you may need to reset the project.
10/31/2011 9:05:26 AM | World Community Grid | Task DSFL_00000050_0000024_1070_0 exited with zero status but no 'finished' file
10/31/2011 9:05:26 AM | World Community Grid | If this happens repeatedly you may need to reset the project.
10/31/2011 9:05:26 AM | World Community Grid | Task DSFL_00000050_0000024_0274_0 exited with zero status but no 'finished' file
10/31/2011 9:05:26 AM | World Community Grid | If this happens repeatedly you may need to reset the project.
10/31/2011 9:05:26 AM | World Community Grid | Task DSFL_00000050_0000024_1088_0 exited with zero status but no 'finished' file
10/31/2011 9:05:26 AM | World Community Grid | If this happens repeatedly you may need to reset the project.
10/31/2011 9:05:26 AM | World Community Grid | Task DSFL_00000050_0000024_0819_1 exited with zero status but no 'finished' file
10/31/2011 9:05:26 AM | World Community Grid | If this happens repeatedly you may need to reset the project.
10/31/2011 9:05:26 AM | World Community Grid | Restarting task DSFL_00000050_0000024_0260_1 using dsfl version 619
10/31/2011 9:05:26 AM | World Community Grid | Restarting task DSFL_00000050_0000024_0503_0 using dsfl version 619
10/31/2011 9:05:38 AM | World Community Grid | Restarting task DSFL_00000050_0000024_0792_0 using dsfl version 619
10/31/2011 9:05:38 AM | World Community Grid | Restarting task DSFL_00000050_0000024_0977_0 using dsfl version 619
10/31/2011 9:05:38 AM | World Community Grid | Restarting task DSFL_00000050_0000024_1070_0 using dsfl version 619
10/31/2011 9:05:38 AM | World Community Grid | Restarting task DSFL_00000050_0000024_0274_0 using dsfl version 619
10/31/2011 9:05:38 AM | World Community Grid | Restarting task DSFL_00000050_0000024_1088_0 using dsfl version 619
10/31/2011 9:05:38 AM | World Community Grid | Restarting task DSFL_00000050_0000024_0819_1 using dsfl version 619
10/31/2011 9:08:12 AM | World Community Grid | Task DSFL_00000050_0000024_0260_1 exited with zero status but no 'finished' file
10/31/2011 9:08:12 AM | World Community Grid | If this happens repeatedly you may need to reset the project.
10/31/2011 9:08:12 AM | World Community Grid | Task DSFL_00000050_0000024_0503_0 exited with zero status but no 'finished' file
10/31/2011 9:08:12 AM | World Community Grid | If this happens repeatedly you may need to reset the project.
10/31/2011 9:08:12 AM | World Community Grid | Task DSFL_00000050_0000024_0792_0 exited with zero status but no 'finished' file
10/31/2011 9:08:12 AM | World Community Grid | If this happens repeatedly you may need to reset the project.
10/31/2011 9:08:12 AM | World Community Grid | Task DSFL_00000050_0000024_0977_0 exited with zero status but no 'finished' file
10/31/2011 9:08:12 AM | World Community Grid | If this happens repeatedly you may need to reset the project.
10/31/2011 9:08:12 AM | World Community Grid | Task DSFL_00000050_0000024_1070_0 exited with zero status but no 'finished' file
10/31/2011 9:08:12 AM | World Community Grid | If this happens repeatedly you may need to reset the project.
10/31/2011 9:08:12 AM | World Community Grid | Task DSFL_00000050_0000024_0274_0 exited with zero status but no 'finished' file
10/31/2011 9:08:12 AM | World Community Grid | If this happens repeatedly you may need to reset the project.
10/31/2011 9:08:12 AM | World Community Grid | Task DSFL_00000050_0000024_1088_0 exited with zero status but no 'finished' file
10/31/2011 9:08:12 AM | World Community Grid | If this happens repeatedly you may need to reset the project.
10/31/2011 9:08:12 AM | World Community Grid | Task DSFL_00000050_0000024_0819_1 exited with zero status but no 'finished' file
10/31/2011 9:08:12 AM | World Community Grid | If this happens repeatedly you may need to reset the project.
10/31/2011 9:08:12 AM | World Community Grid | Restarting task DSFL_00000050_0000024_0260_1 using dsfl version 619
10/31/2011 9:08:25 AM | World Community Grid | Restarting task DSFL_00000050_0000024_0503_0 using dsfl version 619
10/31/2011 9:08:25 AM | World Community Grid | Restarting task DSFL_00000050_0000024_0792_0 using dsfl version 619
10/31/2011 9:08:25 AM | World Community Grid | Restarting task DSFL_00000050_0000024_0977_0 using dsfl version 619
10/31/2011 9:08:25 AM | World Community Grid | Restarting task DSFL_00000050_0000024_1070_0 using dsfl version 619
10/31/2011 9:08:25 AM | World Community Grid | Restarting task DSFL_00000050_0000024_0274_0 using dsfl version 619
10/31/2011 9:08:25 AM | World Community Grid | Restarting task DSFL_00000050_0000024_1088_0 using dsfl version 619
10/31/2011 9:08:25 AM | World Community Grid | Restarting task DSFL_00000050_0000024_0819_1 using dsfl version 619

EDIT: Added client version and spelling. [Edit 2 times, last edit by BSD at Oct 31, 2011 1:26:25 PM] |
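For anyone wanting to check how often each task is being restarted, here is a minimal Python sketch that counts the "Restarting task" messages in a pasted event log. It assumes the message wording shown in the log above; the function name `count_restarts` and the sample text are made up for illustration and are not part of BOINC.

```python
import re
from collections import Counter

# Matches "Restarting task <name>" as it appears in the event log above.
RESTART_RE = re.compile(r"Restarting task (\S+)")


def count_restarts(log_text: str) -> Counter:
    """Count 'Restarting task' messages per task name in a pasted log."""
    return Counter(RESTART_RE.findall(log_text))


# Two restart messages for the same task, copied from the log above.
sample = (
    "10/31/2011 9:05:26 AM | World Community Grid | "
    "Restarting task DSFL_00000050_0000024_0260_1 using dsfl version 619\n"
    "10/31/2011 9:08:12 AM | World Community Grid | "
    "Restarting task DSFL_00000050_0000024_0260_1 using dsfl version 619\n"
)

for task, restarts in count_restarts(sample).items():
    print(task, restarts)
```

A task that shows up again every few minutes, as in the log above, is being restarted repeatedly rather than making progress.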
||
|
|
BSD
Senior Cruncher Joined: Apr 27, 2011 Post Count: 224 Status: Offline |
I suspended 4 of the 8 I have. 3 show status "Running" at ~1GB in memory size (no restarts in Event log, Progress % continuing), the remaining one is status "Waiting to run" at ~ <1GB memory.
[Edit 1 times, last edit by BSD at Oct 31, 2011 1:32:37 PM] |
||
|
|
GreatWelshOptimist
Cruncher Joined: Jan 20, 2009 Post Count: 3 Status: Offline Project Badges:
|
This issue does not seem to be isolated to batch 50 as I have a couple of machines running batch 49 and they are showing the same high memory usage problem.
From personal experience, it seems that you need sufficient memory (about 1 GB) for each of the DSFL tasks that you have running, and whether they all run also depends on your memory usage settings. I'm currently using the default values of 50% (in use) and 90% (idle) on all my systems, and as long as the system is not in use, all of my DSFL tasks will run. When the system is in use, the limit drops to 50% and there is no longer enough memory allocated for all of them. Hopefully this will help a bit, but it doesn't explain why the memory usage has shot through the roof since batch 49. |
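As a rough illustration of the rule of thumb above, this Python sketch estimates how many ~1 GB tasks fit under a given memory limit. It is a simplification: the 1 GB per task figure is the estimate from this post, `tasks_that_fit` is a made-up helper name, and BOINC's real scheduling looks at measured working set sizes rather than a fixed figure.

```python
def tasks_that_fit(total_ram_mb: int, percent_allowed: int,
                   per_task_mb: int = 1024) -> int:
    """Estimate how many tasks fit under BOINC's memory percentage limit.

    Assumes every task needs the same fixed amount of memory (about 1 GB,
    the figure reported in this thread) and ignores everything else
    running on the machine.
    """
    allowed_mb = total_ram_mb * percent_allowed // 100
    return allowed_mb // per_task_mb


# A 4 GB machine with the default 50% (in use) and 90% (idle) limits.
print(tasks_that_fit(4096, 50))
print(tasks_that_fit(4096, 90))
```

On a 2 GB machine the same estimate leaves room for only one such task under either limit, which matches reports of a second core being left idle.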
||
|
|
Jim1348
Veteran Cruncher USA Joined: Jul 13, 2009 Post Count: 1066 Status: Offline Project Badges:
|
I just noticed the high memory usage also (800 MB to 1 GB per DSFL work unit). For my quad-core with 8 GB DRAM, I don't see any problems yet; the first one is 85% complete, and there are two more at 80% and 34%. But you probably don't want it to run in virtual memory, so maybe that BOINC setting should be changed to prevent it.
|
||
|
|
ca05065
Senior Cruncher Joined: Dec 4, 2007 Post Count: 328 Status: Offline Project Badges:
|
OS - Vista
RAM - 2 GB
CPU - Core 2 Duo
BOINC - 6.10.58
The working set size was 232 MB for batch 49 but 874 MB for batch 50. The batch 50 is a repair unit and started immediately as high priority. BOINC then tried 3 other tasks in the queue but has left them as 'waiting to run, suspended' so is running only one work unit with the second core left idle. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I'm currently running 2 batch 50 WUs on my duo and had noticed large (890MB) memory usage but no real problems. Just now I saw BOINC saying that one of the tasks was in status "waiting for memory" (I've an empty cache at the moment so nothing else can start). I now see one of the tasks taking 900MB and the other over 1000MB. I checked the stderr.txt files and saw a series of "Quit requested: Exiting" messages. I bumped the amount of memory that BOINC could use while the machine was in use up a bit and now all seems to be running smoothly again. (I have a 3GB machine.) In all cases VM size seems to be only 2MB less than memory (or peak memory) usage. I would not like to be running these WUs on a smaller machine!
I hope the techs can find a way to reduce the memory usage, or we might be in a situation like CEP2 where we need a means to identify machines capable of running these, and a way to control how many run at the same time... |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Just noticed that there are a series of files in the slots directory with names like "dsfl.target_00000050__1581432322_log.txt", all of which contain messages which start with:
WARNING: The search space volume > 27000 Angstrom^3 (See FAQ)
Could this explain the issues? (I haven't found the referred-to FAQ entry yet, so if anyone can help...) |
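For context on the number in that warning: 27000 Angstrom^3 is the volume of a 30 x 30 x 30 Angstrom box. The Python sketch below just does that arithmetic for a search box given by its three side lengths. It assumes the search space is a rectangular box specified the way AutoDock Vina's `size_x`, `size_y` and `size_z` options are; whether the DSFL application is configured that way is a guess, and the helper names are made up for illustration.

```python
# Threshold quoted in the warning message, in cubic Angstroms (30 * 30 * 30).
VOLUME_WARNING_THRESHOLD = 27000.0


def search_space_volume(size_x: float, size_y: float, size_z: float) -> float:
    """Volume of a rectangular search box, side lengths in Angstroms."""
    return size_x * size_y * size_z


def exceeds_warning_threshold(size_x: float, size_y: float, size_z: float) -> bool:
    """True if the box is larger than the volume named in the warning."""
    return search_space_volume(size_x, size_y, size_z) > VOLUME_WARNING_THRESHOLD


# A 30 x 30 x 30 box sits exactly at the threshold; a 40 x 30 x 30 box exceeds it.
print(exceeds_warning_threshold(30.0, 30.0, 30.0))
print(exceeds_warning_threshold(40.0, 30.0, 30.0))
```

This only shows where the threshold comes from; it does not establish that a large search box is the cause of the memory usage reported in this thread.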
||
|
|
|