| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 36
|
|
| Author |
|
|
Wolf Fivousix
Cruncher Joined: Mar 22, 2013 Post Count: 10 Status: Offline Project Badges:
|
Hello community, I am in desperate need of some help. I have been donating my computing power for 2 years now, and ever since I started I have been afflicted by a new kind of computer problem: my computer just freezes. The display freezes, no inputs work, and the sound stops. At first I thought it was my old computer, so I bought a new one (it was time anyway); after six months the problem reappeared. By the time it became unmanageable I had identified my motherboard (Gigabyte 990FXA-UD3) as defective (a transistor had burned out), so I bought a new one (ASUS Sabertooth 990FX R2.0) last month.
For this month everything seemed just perfect, until a couple of days ago, when the problem suddenly came back to haunt me. Based on previous troubleshooting, and since BOINC is very CPU intensive and I always run it at 100% (no overclock whatsoever), I decided to change some of the settings. I lowered it to 50% and the problem started happening more often. I dropped it to 10% and it still froze. Without BOINC running there are no problems, even when playing the most demanding games on the market today at max settings for hours (which was not possible with the Gigabyte motherboard once it became defective).
Is there anyone with a similar problem? Could some kind of conflict between so many different hardware components cause this? Any help on this issue is much appreciated, as I would not like to stop donating idle computing power to humanitarian causes, but at this point I see no alternative. Thank you very much!
My computer specifications:
OS: Win 7 64-bit
MB: ASUS Sabertooth 990FX R2.0
CPU: AMD Vishera 4.7GHz FX-9590
Memory: 2x G.SKILL Ripjaws 8GB SDRAM 1600
GC: EVGA GeForce GTX760 4GB 04G-2768-KR
PS: Fractal Design Newton R3 1000W
SSD: Kingston 120GB SV300S37A120G (OS installed here)
HD: Western Digital 1TB WD10EZEX-00KUWA0 (BOINC installed here) |
||
|
|
MrKermit
Advanced Cruncher Joined: Jun 13, 2009 Post Count: 95 Status: Offline Project Badges:
|
I cannot say for sure, obviously, but the first place I would check is defective RAM. Are you re-using the same DIMMs and CPU each time?
The DIMMs can cause all manner of havoc and are simple enough to swap out for a test. At a minimum, try re-seating them in the slots, which is effectively what happens each time you swap the mainboard under this hypothesis. Assuming you have more than one, also put them in different slots from the ones you pulled them out of. You could also download Memtest as an ISO and run it for a while to see if it detects any problems.
Another possibility is that the CPU fan isn't keeping up anymore. Fans can get plugged with lint etc., so try blowing the box out with compressed air as well.
My guess is that BOINC is using more RAM, or using it differently, which is why other workloads don't produce the lockups as easily.
Cheers! MrKermit |
||
|
|
Wolf Fivousix
Cruncher Joined: Mar 22, 2013 Post Count: 10 Status: Offline Project Badges:
|
Hi Mr. Kermit, thank you for the insight.
I am indeed using the same CPU and RAM from the previous motherboard (not from the previous computer). I ran Memtest for 48h when I first bought the new computer over a year ago but have not rerun it since; I will give it a try.
I wouldn't say the CPU fan is an issue: I use a double-radiator water cooler to keep heat to a minimum, and it has never gone over 65ºC, even on the hottest days. (I also cleaned everything when I changed the motherboard, and I am using a good Arctic Silver thermal paste applied with the "drop" method.)
Swapping the DRAM did not produce any change, which is why I never considered it to be defective, but the hypothesis that BOINC uses the memory in a different way had never crossed my mind. Any idea how I could test it? Maybe there is a BOINC parameter for RAM use that I could fiddle with?
Thank you for the help. |
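On the question of how memory could be exercised from inside Windows: the following is only a rough user-space sketch, not a replacement for Memtest. It fills a buffer with fixed byte patterns and counts any bytes that read back differently. The function name `pattern_check` and the buffer size are made up for illustration, and because the operating system decides which physical pages back the buffer, a clean result does not prove the DIMMs are healthy.

```python
def pattern_check(size_bytes: int, patterns=(0x00, 0xFF, 0x55, 0xAA)) -> int:
    """Fill a buffer with each byte pattern, read it back, and count mismatched bytes."""
    mismatches = 0
    buf = bytearray(size_bytes)
    for pattern in patterns:
        # Write phase: overwrite the whole buffer with the current pattern.
        buf[:] = bytes([pattern]) * size_bytes
        # Read-back phase: every byte that is not the pattern counts as a mismatch.
        mismatches += size_bytes - buf.count(pattern)
    return mismatches


if __name__ == "__main__":
    size = 64 * 1024 * 1024  # 64 MiB, an arbitrary illustrative size
    print(f"mismatched bytes: {pattern_check(size)}")
```

Any non-zero count would be a strong hint to go back to a bootable Memtest run, which tests memory outside the operating system.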
||
|
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7844 Status: Offline Project Badges:
|
Your CPU is an AMD Vishera 4.7GHz FX-9590. This is an 8-core chip running at 4.7 GHz. That has to throw a lot of heat. I would go with Mr. Kermit's suggestion of making sure heat dissipation is not the problem. The other thing you could try is to reduce your core count: go to only one core in use for BOINC and see if the freeze-ups continue. If one runs OK, continue to increment by one until you see where the problem recurs.
Here is an excerpt from the HardwareCanucks review of this chip from July 15, 2013:
"As one might expect, actually getting a lower clocked architecture to hit such high levels requires some heavy-duty muscle alongside stringent binning. In this case, a massive amount of voltage -1.5V- has been applied and this has a secondary, nasty side effect: a substantial increase in heat production and power requirements. While the FX-8350’s TDP of 125W was deemed inefficient, the FX-9590 brings things to a whole new level with an estimated thermal output of 200W-220W. That’s an important number to remember when choosing a cooler since only the very best air-based solutions will be able to keep temperatures under control."
Hope this is helpful. Cheers
Sgt. Joe
*Minnesota Crunchers* |
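The one-core-at-a-time test suggested above can be planned with a small sketch. It assumes the core limit is applied through BOINC's "use at most N% of the processors" computing preference and that the machine exposes 8 cores; the function names are hypothetical, and how BOINC rounds a percentage to a whole number of cores is not modelled here.

```python
def cpu_percent_for_cores(cores_in_use: int, total_cores: int = 8) -> float:
    """Percentage of the processors that corresponds to `cores_in_use` cores."""
    if not 1 <= cores_in_use <= total_cores:
        raise ValueError("cores_in_use must be between 1 and total_cores")
    return round(100.0 * cores_in_use / total_cores, 1)


def stepping_plan(total_cores: int = 8) -> list[tuple[int, float]]:
    """(cores, percent) pairs for ramping up one core at a time."""
    return [(n, cpu_percent_for_cores(n, total_cores))
            for n in range(1, total_cores + 1)]


if __name__ == "__main__":
    for cores, pct in stepping_plan():
        print(f"{cores} core(s): set the processor limit to {pct}%")
```

Running each step long enough for a freeze to show up before moving on keeps the result meaningful; the first step at which the freeze returns narrows down the load level involved.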
||
|
|
MrKermit
Advanced Cruncher Joined: Jun 13, 2009 Post Count: 95 Status: Offline Project Badges:
|
I am suggesting that the RAM has possibly gone bad over time and should be re-tested. My servers all have ECC RAM, so we can push a lot harder for a lot longer without crashes. It would be worth re-running Memtest.
The water cooler is great. It could theoretically get choked with dust too, but that is much less likely, especially if you are watching the CPU temps proactively.
Also, since you're running Windows, could the screen saver graphics be related? I run headless with no graphics at all, so I can't really guess which drivers in Windows would get into trouble, but consider disabling the graphics and/or running BOINC from the command line to test whether there's some issue that happens only when the system is drawing pictures. You might try opening a command prompt window, changing (CD) to the BOINC directory, and running boinc.exe by hand if you can.
Reseating the RAM and retesting it is still my favorite hypothesis. I'll let you know if I think of more ideas.
You could set your BOINC preferences to use projects that use less RAM or less HD space: http://www.worldcommunitygrid.org/help/viewTopic.do?shortName=minimumreq
Also, a scan of your local hard drive is worth considering as a remote possibility, in case you are out of space, are getting a lot of errors, or your swap file is getting fragmented.
Just brainstorming, but I hope it helps! Cheers! Greg |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
So why is this in GPU support when it has nothing to do with the GPU?
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
'the display freezes' ;>)
(We need a tech forum, which could be more active than chat :O)
Back on topic... RAM... crunching 8 jobs can hit the high regions [of temp too]. Nothing to add to the advice already given, except to propose TThrottle for Windows, with which a temperature ceiling can be set above which BOINC slows down or pauses until the temperature (C/F) drops below the threshold again. Pity the sub-second throttle control in BOINC was removed again after testing [some projects elsewhere could not handle it]. That really gave a smooth 'foot-off-the-metal' crunch experience. |
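The temperature-ceiling idea can be illustrated with a small sketch. This is not TThrottle's actual algorithm, only a generic pause-and-resume rule under stated assumptions: `next_state`, `simulate`, the ceiling value, and the optional `resume_margin_c` (a gap below the ceiling that avoids rapid toggling) are all hypothetical names and numbers.

```python
def next_state(running: bool, temp_c: float, ceiling_c: float,
               resume_margin_c: float = 0.0) -> bool:
    """Return whether crunching should run after seeing one temperature reading."""
    if running and temp_c > ceiling_c:
        return False  # above the ceiling: pause
    if not running and temp_c < ceiling_c - resume_margin_c:
        return True   # cooled down far enough: resume
    return running    # otherwise keep the current state


def simulate(temps_c, ceiling_c: float, resume_margin_c: float = 0.0) -> list[bool]:
    """Apply next_state to a sequence of readings, starting in the running state."""
    running = True
    states = []
    for temp in temps_c:
        running = next_state(running, temp, ceiling_c, resume_margin_c)
        states.append(running)
    return states


if __name__ == "__main__":
    readings = [60, 66, 64, 62, 61]  # made-up readings in degrees Celsius
    print(simulate(readings, ceiling_c=65.0, resume_margin_c=3.0))
```

With a margin of zero the rule matches the description above exactly: work resumes as soon as the reading is back below the ceiling.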
||
|
|
KLiK
Master Cruncher Croatia Joined: Nov 13, 2006 Post Count: 3108 Status: Offline Project Badges:
|
I can confirm that RAM is the usual suspect when the computer hangs/freezes... Usually that means you can move the mouse, but nothing else: no selection, no clicking, etc.
If the CPU had a heat problem, the motherboard would restart the PC or shut it down...
If it was the GPU, then there would be no picture...
If it was GPU memory, then the picture would be garbled... |
||
|
|
katoda
Senior Cruncher Poland Joined: Apr 28, 2007 Post Count: 172 Status: Offline Project Badges:
|
Another hint: check your power supply. It may be defective; abnormal voltages could have killed your old MB and may now be slowly killing the current one. Having BOINC working at 100% means that the CPU draws a lot of power. Combine this with a PSU which can deliver, let's say, 50% of its rated power and the result can be like what is described in the first post. |
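The power-budget reasoning above can be put into numbers with a rough sketch. The 1000 W rating and the 50% figure come from this thread, and the 220 W CPU figure is the upper end of the review estimate quoted earlier; the "everything_else" load is a placeholder, not a measurement, and the function name `psu_headroom_w` is hypothetical.

```python
def psu_headroom_w(rated_w: float, deliverable_fraction: float,
                   loads_w: dict[str, float]) -> float:
    """Watts left after subtracting the summed loads from what the PSU can still deliver."""
    if not 0.0 <= deliverable_fraction <= 1.0:
        raise ValueError("deliverable_fraction must be between 0 and 1")
    return rated_w * deliverable_fraction - sum(loads_w.values())


if __name__ == "__main__":
    loads = {
        "cpu": 220.0,              # upper end of the quoted 200W-220W estimate
        "everything_else": 150.0,  # placeholder for GPU, drives, fans, board
    }
    print(f"headroom: {psu_headroom_w(1000.0, 0.5, loads)} W")
```

A negative result would mean the load exceeds what the degraded supply can deliver. A positive result does not rule out the PSU, since unstable voltages are not modelled here.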
||
|
|
Coleslaw
Veteran Cruncher USA Joined: Mar 29, 2007 Post Count: 1343 Status: Offline Project Badges:
|
Well... a failing HDD or even a "weak" one can have similar symptoms. From your badge, I can only guess that CEP2 is running on your system... Perhaps you could give more insight into which projects and work units you are running... |
||
|
|
|