Thread Status: Active
Total posts in this thread: 17
KLiK
Master Cruncher
Croatia
Joined: Nov 13, 2006
Post Count: 3108
Status: Offline
Re: Computer Stability

Maybe the CEP2 science application should come with an automatically loaded app_config, so that it runs at most 2 tasks concurrently... or is that too much to ask of the WCG / CEP2 team? ;)
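
Until something like that is shipped automatically, a file along these lines can be made by hand. This is only a rough sketch: it assumes the BOINC short name for the application is "cep2" and that the client picks up a file named app_config.xml from the project's folder inside the BOINC data directory, so check both against your own setup. The function name build_app_config is made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, max_concurrent: int) -> str:
    """Return the text of a minimal app_config.xml limiting one application.

    app_name is the BOINC short name of the application ("cep2" here is an
    assumption, not something stated in this thread).
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the XML; save it as app_config.xml in the project folder.
    print(build_app_config("cep2", 2))
```

After saving the file, the client has to re-read its config files (or be restarted) before the limit takes effect.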
----------------------------------------
oldies:UDgrid.org & PS3 Life@home


non-profit org. Play4Life in Zagreb, Croatia
[Apr 22, 2015 9:23:02 AM]
ravenigma
Cruncher
USA
Joined: Oct 3, 2012
Post Count: 47
Status: Offline
Re: Computer Stability

Hey everyone. Thanks for the responses. Here's some additional info:

I doubt overheating is an issue. I've got a pretty good cooling setup and temps are nearly always below 50C. Currently running 6 UGM tasks with 40C reported by AI Suite 3 and a 52C package temp reported by RealTemp. These are pretty much the same temps I see with CEP.

I have 16GB RAM.

Currently I'm running at 1.184V at 4400MHz (i7-4790k). I'm running at the settings my mobo (Asus Sabertooth Z97) automatically applied when I built the computer.

I have another computer running W7-64 Ult which has no problem whatsoever with anything I throw at it. It has i7-3770k OC'd to 4.2GHz with 16GB RAM. It's been crunching for years without ever having an issue. I usually keep 5 - 6 CEP tasks going at a time on there, plus GPUGrid on the GPU.

Thanks again. I'm seeing a lot of good ideas to consider.
----------------------------------------

[Apr 22, 2015 12:08:44 PM]
Jim1348
Veteran Cruncher
USA
Joined: Jul 13, 2009
Post Count: 1066
Status: Offline
Re: Computer Stability

Quoting ravenigma: "I have another computer running W7-64 Ult which has no problem whatsoever with anything I throw at it."

How do the disk drives compare on the two computers? With the very high write rates of CEP2 (I don't know about FAAH), it is quite possible that you have write contention when accessing the drive. Some SSDs, for example, are rather sensitive to this and can't take high write loads. It depends on how much cache they have, etc.

On all the machines where I run CEP2, I use either a ramdisk (Primo Ramdisk or Dataram RAMdisk), or else a write cache (PrimoCache), and basically don't get errors.

To use a ramdisk, you first create one that is large enough, which depends on how many CEP2 work units you want to run at once; about 1.5 GB per work unit is right. Then you install BOINC so that the BOINC Data Folder is placed on the ramdisk. All the reads and writes then go to main memory, which is much faster than any drive. Setting up a write cache is somewhat simpler, since you don't have to change the BOINC data location; you just create the cache. A couple of gigabytes should do it, though I use more. The writes then all go into main memory, and the old data is flushed out to the disk drive when the cache fills up. It will solve any problems due to high write rates.
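
For anyone sizing a ramdisk, the rule of thumb above works out roughly as follows. This is just a sketch of the arithmetic: the 1.5 GB per work unit figure is the one quoted above, while the function name and the optional headroom_gb parameter are made up for illustration.

```python
GB_PER_CEP2_WU = 1.5  # rule-of-thumb figure from the post above


def ramdisk_size_gb(concurrent_wus: int,
                    gb_per_wu: float = GB_PER_CEP2_WU,
                    headroom_gb: float = 0.0) -> float:
    """Estimate the ramdisk size for a number of concurrent CEP2 work units.

    headroom_gb is a hypothetical safety margin; it is not part of the
    original rule of thumb.
    """
    if concurrent_wus < 0:
        raise ValueError("concurrent_wus cannot be negative")
    return concurrent_wus * gb_per_wu + headroom_gb


if __name__ == "__main__":
    # Print the estimate for 1 to 8 concurrent work units.
    for wus in range(1, 9):
        print(f"{wus} work unit(s): about {ramdisk_size_gb(wus):.1f} GB")
```

The ramdisk comes out of the same RAM that the tasks and the OS need, so the result has to be checked against what is actually free.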
----------------------------------------
[Edit 1 times, last edit by Jim1348 at Apr 22, 2015 1:28:09 PM]
[Apr 22, 2015 1:21:55 PM]
flynryan
Senior Cruncher
United States
Joined: Aug 15, 2006
Post Count: 235
Status: Offline
Re: Computer Stability

Try upping the voltage by a few hundredths of a volt, just for giggles, and see if it stops the BSODs. Is the RAM overclocked? If so, try running it at stock speeds or underclocking it, just to see if the BSODs go away.

Are you on an SSD, by the way? Just curious. I have run 32 CEP2 tasks on a single SSD before, but some brands/models have problems and others not a single issue. FWIW, I can't recall a single BSOD with the Intel, Sandisk, and Crucial SSDs I've used.

Edit: Just saw that you have a dedicated HDD, which is fine. It shouldn't be the cause of the problems, but who knows. HDDs won't be as efficient because of the heavy I/O of the CEP2 project, but they shouldn't cause BSODs either.
----------------------------------------
[Edit 1 times, last edit by FlynRyan at Apr 22, 2015 1:25:56 PM]
[Apr 22, 2015 1:23:40 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Computer Stability

I don't know if this counts as the same problem, but I too have Windows 7 Pro, and when I simultaneously run 12 threads of CEP2 and 12 threads of UGM, some threads, usually CEP2, will reset, sometimes all the way down to zero. It's been like that since I bought these used computers, and I'm losing on average 5-10% of computing time. Many kWh down the drain, every month.

For a while I manually restarted the wus after each involuntary reset, one by one, with one minute between them, but now I'm tired of doing that. Now I'm just hoping that some of the wus will have different crunching times, so that their finishing times end up evenly spread out.

I suspect this problem exists because the server sometimes doesn't allow immediate upload, so some type of memory instability occurs in my computers when wus are waiting to be uploaded. I have 24GB registered ECC RAM and an HDD.

Or maybe it's because the chipset is running at 67C? I've ordered a second 40mm fan for each of them.

I could probably mitigate the problem by crunching fewer CEP2 wus, but I don't want that. CEP2 is the main reason I'm on WCG. Maybe the above-mentioned write cache could solve the problem? Thanks, Jim1348. I have enough free GBs of RAM for a write cache but not for a RAM disk, at least not while hyper-threading is turned on, since CEP2 requires 2GB per wu, according to WCG. Hyper-threading stays turned on, but as it stands now, the 10-15% performance gain from it is almost entirely lost to the involuntary resets. A write cache, such as Supercache Express or PrimoCache, could be the answer. The former supports NUMA, which could perhaps be a better solution in a dual board.

Each of my computers is writing about 13-15MB/s, which amounts to about 440TB/year. I need 12GB RAM for CEP2 and another 1.2GB for UGM, which comes to about 13GB. Another 4GB for the OS, and I'll have about 7GB left over for the write cache. At that write rate, a 7GB write cache takes 500 seconds, or a little over 8 minutes, to fill. Will this be the magic bullet? Only if the HDD can write all the data without the cache filling up.
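
Those figures can be checked with a quick sketch like the one below. The assumptions are mine: 14MB/s is taken as the midpoint of the 13-15MB/s range, decimal units are used (1GB = 1000MB, 1TB = 1,000,000MB), and the drain rate to the HDD is a made-up parameter, since nobody has measured it here.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def yearly_writes_tb(rate_mb_s: float) -> float:
    """Total data written in a year at a constant rate, in decimal TB."""
    return rate_mb_s * SECONDS_PER_YEAR / 1_000_000


def free_ram_gb(total_gb: float, *used_gb: float) -> float:
    """RAM left over after subtracting each listed use."""
    return total_gb - sum(used_gb)


def cache_fill_seconds(cache_gb: float, write_mb_s: float,
                       drain_mb_s: float = 0.0) -> float:
    """Seconds until the cache is full.

    drain_mb_s is the (hypothetical) rate at which the cache is flushed to
    the HDD. If the drive keeps up with the writes, the cache never fills.
    """
    net_mb_s = write_mb_s - drain_mb_s
    if net_mb_s <= 0:
        return float("inf")
    return cache_gb * 1000 / net_mb_s


if __name__ == "__main__":
    print(f"Yearly writes at 14MB/s: {yearly_writes_tb(14):.1f} TB")
    print(f"RAM left for cache: {free_ram_gb(24, 12, 1.2, 4):.1f} GB")
    print(f"7GB cache, no draining: {cache_fill_seconds(7, 14):.0f} s")
```

With no draining at all, the 7GB cache fills in 500 seconds at 14MB/s, which matches the estimate above. The cache only stays below full if the drive can flush data at least as fast as it arrives.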


Edit:

After installing a test version of Supercache, I can now say that it doesn't work. BOINC is pretty much ignoring Supercache and continues to write to the HDD at the same rate as before. Almost nothing is being read from the cache. So I'll uninstall it.

: (
----------------------------------------
[Edit 1 times, last edit by Former Member at Apr 26, 2015 11:22:22 AM]
[Apr 25, 2015 5:12:09 AM]
KLiK
Master Cruncher
Croatia
Joined: Nov 13, 2006
Post Count: 3108
Status: Offline
Re: Computer Stability

@TBMS
In your place, I would:
1. run only 10 CEP2 tasks... to have stable returns!
2. buy a USB drive (e.g. a Toshiba with a lifetime warranty) with 16-32GB & dedicate it as a Boost drive... that would take some of the R/W load off the HDD!
Keep us informed...
;)
----------------------------------------
oldies:UDgrid.org & PS3 Life@home


non-profit org. Play4Life in Zagreb, Croatia
[Apr 27, 2015 6:20:30 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Computer Stability

I've already lowered the number of CEP2 threads, but you're probably right. I'll lower them to 10, but if I could, I would run CEP2 on all 24 threads.
[Apr 28, 2015 2:45:37 AM]