| World Community Grid Forums
Thread Status: Active Total posts in this thread: 9
KerSamson
Master Cruncher Switzerland Joined: Jan 29, 2007 Post Count: 1684 Status: Offline Project Badges:
|
Hi,
Since January 2019, I have experienced several crashes (about once a week during the last 2 months) of a Windows 7 Pro x64 machine (i7 4770K, 16 GB RAM, no OC) computing MCM1 only. Over the last 12 years, I have very rarely experienced real trouble caused by a WCG project on the machine where it runs. This time, the trouble recurs fairly regularly, and there is no real reason other than MCM1 to explain the crashes. It does not annoy me much, since the machine concerned is not a "productive" machine. However, I can imagine that I am not the only member affected by such trouble. Cheers, Yves
||
|
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7846 Status: Offline Project Badges:
|
Since this has only started happening rather recently, I suspect there is some hardware problem. There may be some piece of your machine which is only malfunctioning periodically. I also suspect that over time the crashes will get more frequent as the affected item deteriorates further. As to what it could be, I can only speculate. Perhaps the power supply is malfunctioning and not giving a smooth supply of electricity, i.e. small surges which may be affecting other components. Another, more remote possibility could be some minor corruption in the OS which may cause some pipelining lock. I think the more likely cause is on the hardware side.
I have one machine with Windows 7 Ultimate 64 which has been running MCM exclusively for a long time, and it has not given me any problems so far.
----------------------------------------
Cheers
Sgt. Joe
*Minnesota Crunchers* |
||
|
|
Jim1348
Veteran Cruncher USA Joined: Jul 13, 2009 Post Count: 1066 Status: Offline Project Badges:
|
I second that. Crashes are almost always hardware, if you mean the machine either spontaneously rebooted or else froze up.
Assuming you are not overclocking anything (memory, CPU, PCIe bus, etc.), then it is likely either bad memory, a bad disk drive, a bad video card or even a bad motherboard. I have had both an old Intel Z75 board and a new AMD X370 board go out on me in the last year or so. SSDs can easily go bad too and cause crashes, even if they don't hold the OS but only hold data. I have also had to ditch at least two video cards in the last year or so.
I have run MCM on both Windows and Ubuntu machines, both Intel and AMD, for many years in various combinations, and have never seen it cause a crash. I think you need to work at isolating a hardware cause.
EDIT 1: Check the temps of your CPU and video cards also. If they are too hot, they will crash.
EDIT 2: SATA cables to the disk drive can go bad and cause crashes. Just replace them to be safe.
[Edit 2 times, last edit by Jim1348 at Apr 19, 2019 1:45:46 AM]
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I've been running MCM exclusively on a Windows 10 i7-6700K machine and on Linux, and it never "crashes". Do you mean a blue screen, or does the machine freeze or reboot? I always like checking the event log first for clues when I'm having system problems. I doubt MCM is the cause. Old power supplies are notorious for making systems unstable. Also, it's good to look at the electrolytic capacitors on the motherboard for signs of bulging or leaking. You can also download a memory checker image and boot it from a stick or CD-ROM. They often come bundled with Linux distributions. Note that a failed memory test can point to the memory, the power supply, or the motherboard.
|
||
|
|
KerSamson
Master Cruncher Switzerland Joined: Jan 29, 2007 Post Count: 1684 Status: Offline Project Badges:
|
Hi,
Thank you for the advice. The machine is clean and the hardware is OK so far. It is not a blue screen but a kind of freeze, with a total or partial stop of various services including BOINC. After a reboot, everything is OK again. I have also seen some WUs end in error, which gives me the feeling that some WUs are "buggy". I just wanted to collect some feedback to investigate further on my side. BTW: the only active screensaver is the blank screen (no WCG/BOINC screensaver). Cheers, Yves
||
|
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7846 Status: Offline Project Badges:
|
On my Win 7 machine which runs MCM exclusively, I have not had a work unit come back with an error or invalid for as long as I can remember, well over a couple of years. I would highly discount the theory that the work units were buggy, although it is possible. I am back to the "hardware problem" theory. Bonne chance.
----------------------------------------
Cheers
Sgt. Joe
*Minnesota Crunchers* |
||
|
|
iksnah
Cruncher Joined: Apr 26, 2007 Post Count: 17 Status: Offline Project Badges:
|
A heat problem?
----------------------------------------
<><
|
||
|
|
Jim1348
Veteran Cruncher USA Joined: Jul 13, 2009 Post Count: 1066 Status: Offline Project Badges:
|
Do you have enough memory? Check Task Manager to see how much you have left.
|
||
|
|
KerSamson
Master Cruncher Switzerland Joined: Jan 29, 2007 Post Count: 1684 Status: Offline Project Badges:
|
I think there is some software problem on my machine.
Again this morning, the user interface froze; however, the WUs were still running. It was not possible to start any application, and I was only able to initiate a reboot from the command line. After the reboot, the CPU efficiency of the WU concerned, as displayed by BoincStats, was completely out of range, between 200% and 350%. The latter is reproducible after such a "freeze". It is not a memory problem (16 GB), and not a heat problem. It seems that some Windows services die over time; however, I only experience this particular situation on this machine with MCM1. Again, thank you for your input. It will probably remain a mystery. Cheers, Yves
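For context on the 200% to 350% figure: CPU efficiency for a work unit is commonly taken as CPU time divided by elapsed (wall-clock) time, so a single-threaded task would normally stay at or below 100%. Here is a minimal Python sketch under that assumption. The function name and the sample numbers are made up for illustration; they are not taken from BoincStats or from the machine discussed in this thread.

```python
def cpu_efficiency(cpu_seconds: float, elapsed_seconds: float) -> float:
    """Return CPU time as a percentage of elapsed (wall-clock) time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return 100.0 * cpu_seconds / elapsed_seconds


# Hypothetical values: a task that used 14000 s of CPU time.
# Case 1: the elapsed time was recorded in full (4 hours).
healthy = cpu_efficiency(cpu_seconds=14000, elapsed_seconds=14400)
# Case 2: only part of the elapsed time was recorded (assumed effect of a freeze).
suspect = cpu_efficiency(cpu_seconds=14000, elapsed_seconds=5000)

print(f"healthy: {healthy:.1f}%")
print(f"suspect: {suspect:.1f}%")
```

Under this definition, a value far above 100% for a single-threaded task would suggest that the elapsed time was under-recorded around the freeze, rather than that the task really used more than one core. That is only one possible reading, not a confirmed diagnosis.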