World Community Grid Forums
Thread Status: Active
Total posts in this thread: 11
Aperture_Science_Innovators
Advanced Cruncher United States Joined: Jul 6, 2009 Post Count: 139 Status: Offline Project Badges:
|
I noticed an issue with one of my crunchers a few weeks back where it was spitting out basically nothing but errors. After some work, I determined that all of the WUs in progress threw errors when I was running "too many" CEP2 WUs on the system at once (more than about ten or twelve).
I know that running this many WUs isn't optimal for performance (due to IO bottlenecks), so that issue has already been addressed. However, now that I'm confident that it was an issue arising from CEP2, I'm curious why it was happening. Do you guys have any insights?

BTW, the system in question is:
- 4x AMD Opteron 8350 (quad core, basically a Phenom I X4)
- 4GB (8x512MB) RAM
- Tyan Quad Socket F board (I forget the exact model)
- Hitachi 250GB HDD
- Antec 550w PSU
- Radeon X1300

All running Windows Server 2008 R2 Enterprise.
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The memory requirement for CEP is about 1 GB per WU; I suspect you are running into a low-memory problem.
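For a rough sense of what that figure would imply, here is a minimal Python sketch. It takes the 1 GB per WU estimate at face value (the thread goes on to dispute it) and uses the 4 GB of RAM from the first post. The function names and the `reserved_gb` parameter are made up for illustration and are not part of BOINC or WCG.

```python
import math


def max_concurrent_wus(total_ram_gb, per_wu_gb, reserved_gb=0.0):
    """Return how many work units fit in RAM under a fixed per-WU estimate.

    reserved_gb is a hypothetical allowance for the OS and other programs.
    """
    if per_wu_gb <= 0:
        raise ValueError("per_wu_gb must be positive")
    usable = max(total_ram_gb - reserved_gb, 0.0)
    return math.floor(usable / per_wu_gb)


def ram_shortfall_gb(n_wus, total_ram_gb, per_wu_gb):
    """Return how much memory (in GB) would have to spill to the page file."""
    return max(n_wus * per_wu_gb - total_ram_gb, 0.0)


if __name__ == "__main__":
    # 4 GB box, 1 GB per WU (the estimate quoted in this post)
    print(max_concurrent_wus(4, 1.0))
    # 12 WUs at once on the same box
    print(ram_shortfall_gb(12, 4, 1.0))
```

Under those assumptions only about four WUs would fit in physical RAM, and twelve would push roughly 8 GB out to the page file. If the real per-WU footprint is much smaller, the numbers change accordingly.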
|
||
|
|
Aperture_Science_Innovators
Advanced Cruncher United States Joined: Jul 6, 2009 Post Count: 139 Status: Offline Project Badges:
|
"Memory requirement for CEP is about 1 GB PER wu, suspect you are running into a low memory problem"

A quick check on my laptop would indicate otherwise: here I'm seeing just about a hundred megabytes/WU (and yes, I realize that I have quite a few of them running at once, but this system has an SSD, so I expect the IO performance to be much better).

Memory usage on the AMD 4P seems pretty much constant at 2-2.5GB regardless of what WCG WUs are running.
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hi Aperture_Science_Innovators,
Speaking of memory, how large is your virtual memory? Task Manager just shows the working set, which is usually a tiny portion of the total virtual memory consumed.
Lawrence
||
|
|
Aperture_Science_Innovators
Advanced Cruncher United States Joined: Jul 6, 2009 Post Count: 139 Status: Offline Project Badges:
|
"Hi Aperture_Science_Innovators, Speaking of memory, how large is your virtual memory? Task Manager just shows the working set, which is usually a tiny portion of the total virtual memory consumed. Lawrence"

I just checked--on the i7 laptop, 6GB, on the AMD 4P, 12GB.
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hi Aperture_Science_Innovators,
Another good idea shot down! 12 GB virtual memory is about right for the load that Task Manager showed. I suppose that you have Windows set to automatically increase the VM Cache size if required, which is how you should have it set. Right now I am out of suggestions.
Lawrence
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I'm running 4 GFAM/DSFL and the Performance tab of the TM says that 3.54GB of physical RAM is committed [overall]. I see Firefox gobbling 654MB; what else shows up if you sort the processes by size? What is confusing to me is that the screenshot implies there are 6 CEP2 jobs running. That's 'testing', squeezing this into 4GB [WCG specs 1GB per job], meaning very high I/O from RAM to VM [on disk], plus storage of the checkpoints too.
|
||
|
|
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline Project Badges:
|
I think you have enough physical memory to run more than 12 CEP2 jobs, though if you're running WCG projects on all 16 cores there won't be much to spare. But I think your problem is something other than amount of memory:
1. When a CEP2 WU starts, it immediately unzips several thousand (!) small data files (refrains from comments re FORTRAN programmers and their mastery of sound computer science principles). This causes a lot of overhead for the O/S, as well as the storage device hardware. If a number of CEP2 WUs try to start up simultaneously, the system can freeze for many seconds, possibly because file creation is a single-threaded process and creating all those files just overwhelms it. In that situation, other BOINC tasks can experience timeouts, causing them to exit. "No heartbeat from client" is a common error message. I haven't looked at how much O/S overhead there is in accessing all of these files once the main part of the CEP2 calculations is underway, but with so many files it must be considerable.
2. Page faults: I suggest you have a look at this parameter in your Task Manager (View >> Select columns). CEP2 produces heaps of these, and I suspect that they too are single-threaded in the O/S.

Hope that helps. BTW, the CEP2 science team are working on upgrades to the program. No details are known, but fingers crossed they may be addressing some of these problems.
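One rough way to get a feel for the per-file overhead described in point 1 is to time the creation of a few thousand tiny files on the drive in question. The Python sketch below does only that. It is not CEP2's actual unpacking code, and the file count, file size, and function names are assumptions picked for illustration.

```python
import os
import tempfile
import time


def create_small_files(directory, count, size_bytes=256):
    """Write `count` small files into `directory` and return how many were written."""
    payload = b"x" * size_bytes
    for i in range(count):
        path = os.path.join(directory, f"part_{i:05d}.dat")
        with open(path, "wb") as fh:
            fh.write(payload)
    return count


def time_small_file_creation(count=2000, size_bytes=256):
    """Create the files in a throwaway temp directory and time the whole run.

    Returns (files_requested, files_found_on_disk, elapsed_seconds).
    """
    with tempfile.TemporaryDirectory() as tmp:
        start = time.perf_counter()
        created = create_small_files(tmp, count, size_bytes)
        elapsed = time.perf_counter() - start
        on_disk = len(os.listdir(tmp))
    return created, on_disk, elapsed


if __name__ == "__main__":
    created, on_disk, elapsed = time_small_file_creation()
    print(f"created {created} files ({on_disk} on disk) in {elapsed:.3f} s")
```

Running it on a slow spinning drive and on an SSD, or running several copies at once, gives a crude comparison of how the storage copes with many simultaneous small-file writes. The temp directory lives wherever the OS puts it, so the drive being measured is the one hosting that location.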
||
|
|
Aperture_Science_Innovators
Advanced Cruncher United States Joined: Jul 6, 2009 Post Count: 139 Status: Offline Project Badges:
|
"I'm running 4 GFAM/DSFL and the performance tab of the TM says that 3.54GB of physical RAM is committed [overall]. See Firefox gobbling 654MB, what else if you sort the processes to size? What is confusing, to me is, that the screenshot implies there are 6 CEP2 jobs running. That's 'testing', squeezing this into 4GB [WCG specs 1GB per job], meaning very high I/O from RAM to VM [on disk] and storage of the checkpoints too."

I just checked again on my laptop--Firefox is currently using 1.1GB, Adobe Flash Player and Reader are both at about 120MB, and there are eight WCG tasks each using 75-100MB or so. But there are also about 90 other processes running, so it does add up. The laptop is running on an SSD, so I assume that the additional disk throughput is sufficient that 6 CEP2 WUs at once are OK? CEP2 is the project I'm most interested in, so I try to run it as much as I think my hardware is capable of.

"Hi Aperture_Science_Innovators, Another good idea shot down! 12 GB virtual memory is about right for the load that Task Manager showed. I suppose that you have Windows set to automatically increase the VM Cache size if required. Which is how you should have it set. Right now I am out of suggestions. Lawrence"

I would have thought that 4GB + 12GB should be sufficient. I actually just checked now, and the AMD 4P is running on a 5400RPM laptop drive (I forgot that I had set it up like that). Is it possible that the speed of the disk was negatively impacting things?

Thanks guys!
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Refer to what Rickjb and Lawrence already wrote... yes, the IO to a 5400 RPM spinning drive will