| World Community Grid Forums
Thread Status: Active Total posts in this thread: 17
|
|
| Author |
|
|
KLiK
Master Cruncher Croatia Joined: Nov 13, 2006 Post Count: 3108 Status: Offline Project Badges:
|
Maybe an automated app_config should be loaded from the CEP2 science...so that it runs max 2 apps concurrently...or is that too much for the WCG / CEP2 team? ;)
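For reference, the BOINC client already lets a user set this limit by hand: it reads an app_config.xml placed in the project's folder inside the BOINC data directory, and a `max_concurrent` entry caps how many tasks of one app run at once. Below is a minimal sketch that generates such a file. The app name `cep2` and the helper name `build_app_config` are assumptions for illustration; check the real app name in your client_state.xml before using it.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, max_concurrent: int) -> str:
    """Return the text of an app_config.xml limiting one app's concurrent tasks.

    app_name is an assumption here; BOINC expects the app's short name
    as listed in client_state.xml.
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    ET.indent(root)  # pretty-print; needs Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the file content; save it as app_config.xml in the project folder.
    print(build_app_config("cep2", 2))
```

After saving the file, the client has to re-read its config files (or be restarted) before the limit takes effect.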
---------------------------------------- |
||
|
|
ravenigma
Cruncher USA Joined: Oct 3, 2012 Post Count: 47 Status: Offline Project Badges:
|
Hey everyone. Thanks for the responses. Here's some additional info:
I doubt overheating is an issue. I've got a pretty good cooling setup and temps are nearly always below 50C. Currently running 6 UGM tasks with 40C reported by AI Suite 3 and 52C Package Temp reported by RealTemp. These are pretty much the same temps I see with CEP.

I have 16GB RAM. Currently I'm running at 1.184V at 4400MHz (i7-4790k), at the settings my mobo (Asus Sabertooth Z97) automatically applied when I built the computer.

I have another computer running W7-64 Ult which has no problem whatsoever with anything I throw at it. It has an i7-3770k OC'd to 4.2GHz with 16GB RAM. It's been crunching for years without ever having an issue. I usually keep 5 - 6 CEP tasks going at a time on there, plus GPUGrid on the GPU.

Thanks again. I'm seeing a lot of good ideas to consider.
||
|
|
Jim1348
Veteran Cruncher USA Joined: Jul 13, 2009 Post Count: 1066 Status: Offline Project Badges:
|
"I have another computer running W7-64 Ult which has no problem whatsoever with anything I throw at it."

How do the disk drives compare on the two computers? With the very high write-rates of CEP2 (don't know about FAAH), it is quite possible that you have write-contention when accessing the drive. Some SSDs, for example, are rather sensitive to this and can't take high write loads. It depends on how much cache they have, etc.

On all the machines where I run CEP2, I use either a ramdisk (Primo Ramdisk or Dataram RAMdisk), or else a write cache (PrimoCache), and basically don't get errors.

For using a ramdisk, you first create one of large enough size, which depends on how many CEP2 work units you want to run at once (about 1.5 GB per work unit is about right), and then install BOINC so as to place the BOINC Data Folder on the ramdisk. Then all the reads and writes are to main memory, which is much faster than any drive.

Setting up a write-cache is somewhat simpler, since you don't have to change the BOINC data location, just create the cache; a couple of gigabytes should do it, though I use more. Then the writes all go into main memory, and the old data is flushed out to the disk drive when the cache fills up. It will solve any problems due to high write rates.

[Edit 1 times, last edit by Jim1348 at Apr 22, 2015 1:28:09 PM]
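The ramdisk sizing rule above is simple arithmetic; a minimal sketch follows. The function name `ramdisk_size_gb` is hypothetical, and the 1.5 GB default is only the rule of thumb quoted in the post, not a measured requirement.

```python
def ramdisk_size_gb(work_units: int, gb_per_unit: float = 1.5) -> float:
    """Estimate the ramdisk size for a number of concurrent CEP2 work units.

    gb_per_unit defaults to the post's rule of thumb (about 1.5 GB each).
    """
    if work_units < 0:
        raise ValueError("work_units must not be negative")
    return work_units * gb_per_unit


if __name__ == "__main__":
    # Six concurrent work units, as on the 3770k machine mentioned earlier.
    print(ramdisk_size_gb(6))  # 9.0
```

Any headroom for the rest of the BOINC data folder would come on top of this figure.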
||
|
|
flynryan
Senior Cruncher United States Joined: Aug 15, 2006 Post Count: 235 Status: Offline Project Badges:
|
Try to up the volts by a few .01's, just for giggles, and see if it stops the BSODs. Is the RAM overclocked? If so, try running at stock speeds or underclocking just to see if the BSOD goes away.

Are you on an SSD by the way? Just curious. I have run 32 CEP2 on a single SSD before, but some brands/models have problems and others not a single issue. FWIW the Intel, Sandisk, and Crucial SSDs I've used have never had a single BSOD that I can recall.

Edit: Just saw that you have a dedicated HDD, which is fine; it shouldn't be the cause of the problems, but who knows. HDDs won't be as efficient due to the large I/O of the CEP2 project, but they shouldn't cause BSODs either.

[Edit 1 times, last edit by FlynRyan at Apr 22, 2015 1:25:56 PM]
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I don't know if this counts as the same problem, but I too have Windows 7 Pro, and when I simultaneously run 12 threads of CEP2 and 12 threads of UGM, some threads, usually CEP2, will reset, sometimes all the way down to zero. It's been like that since I bought these used computers, and I'm losing on average 5-10% of computing time. Many kWh down the drain, every month.
For a while I used to manually start the wus after each involuntary reset, one by one, with one minute in between them, but now I'm tired of doing that. Now I'm just hoping that some of the wus will have different crunching times, so that their finishing times end up evenly spread out.

I suspect this problem exists because the server sometimes doesn't allow immediate upload, so some type of memory instability occurs in my computers when wus are waiting to be uploaded. I have 24GB registered ECC RAM and an HDD. Or maybe it's because the chipset is running at 67C? I've ordered a second 40mm fan for each of them.

I could probably mitigate the problem by crunching fewer CEP2 wus, but I don't want that. CEP2 is the main reason I'm on WCG.

Maybe the above-mentioned write-cache could solve the problem? Thanks Jim1348. I have enough free GBs of RAM for a write-cache but not for a RAM-disk, at least not while hyper-threading is turned on, since CEP2 requires 2GB per wu, according to WCG. Hyper-threading stays turned on, but as it stands now, the 10-15% performance gain derived from it is almost entirely lost to the involuntary resetting.

A write-cache, such as Supercache Express or PrimoCache, could be the answer. The former supports NUMA, which could perhaps be a better solution in a dual board.

Each of my computers is writing about 13-15MB/s, which amounts to about 440TB/year. I need 12GB RAM for CEP2 and another 1.2GB for UGM, which comes to 13GB. Another 4GB for the OS, and I'll have 7GB left over for the write-cache. It takes a 7GB write-cache 500 seconds, or a little over 8 minutes, to fill. Will this be the magic bullet? Only if the HDD can write all the data without the cache filling up.

Edit: After installing a test version of Supercache, I can now say that it doesn't work. BOINC is pretty much ignoring Supercache and continues to write to the HDD at the same rate as before. Almost nothing is being read from the cache. So I'll uninstall it. : (

[Edit 1 times, last edit by Former Member at Apr 26, 2015 11:22:22 AM]
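The write-cache figures in this post can be checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: 14 MB/s is taken as the midpoint of the quoted 13-15 MB/s, decimal units are used (1 GB = 1000 MB), and the `drain_mb_s` parameter (how fast the HDD empties the cache) is a hypothetical addition to illustrate the "only if the HDD can keep up" condition.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def yearly_writes_tb(rate_mb_s: float) -> float:
    """Total data written in a year at a sustained rate, in TB (decimal)."""
    return rate_mb_s * SECONDS_PER_YEAR / 1_000_000


def cache_fill_seconds(cache_gb: float, rate_mb_s: float,
                       drain_mb_s: float = 0.0) -> float:
    """Seconds until a write cache fills.

    drain_mb_s is an assumed rate at which the disk flushes the cache;
    if the disk keeps up with the incoming writes, the cache never fills.
    """
    net_rate = rate_mb_s - drain_mb_s
    if net_rate <= 0:
        return float("inf")
    return cache_gb * 1000 / net_rate


if __name__ == "__main__":
    print(yearly_writes_tb(14))        # roughly 441.5 TB per year
    print(cache_fill_seconds(7, 14))   # 500.0 seconds with no flushing
```

With no flushing at all, the 7GB cache fills in 500 seconds, which matches the post; any sustained drain rate at or above the write rate means it never fills.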
||
|
|
KLiK
Master Cruncher Croatia Joined: Nov 13, 2006 Post Count: 3108 Status: Offline Project Badges:
|
@TBMS
In your place I would:
1. run only 10 CEP2...to have stable returns!
2. buy a USB drive (e.g. Toshiba with a lifetime warranty) with 16-32GB & dedicate it as a Boost drive...that would take down the R/W from the HDD!

keep us informed... ;)
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I've already lowered the number of CEP2 threads, but you're probably right. I'll lower them to 10, but if I could, I would run CEP2 on all 24 threads.