World Community Grid Forums
Thread Status: Active Total posts in this thread: 26
Bearcat
Master Cruncher USA Joined: Jan 6, 2007 Post Count: 2803 Status: Offline
Woke up this morning to find 54 GPU WUs, with BOINC stuck on one of them. I waited until it hit 10 minutes, then aborted it. BOINC went to the next one, the screen froze, and a driver recovery message popped up. It then stuck on that one too without crunching (verified by watching CCC). I aborted all of them and deselected the HCC project until I can find out what happened.

I thought it was posted (couldn't find it) that there was a limit per machine for GPU WUs. My other cruncher with a 5670 is still crunching fine, though it received about the same number of them. So far, the GPU works fine when it isn't crunching. I even had the fan kicked up to 50% to keep it cool, so I know it didn't overheat. At a loss right now as to why, but I hope to figure out the issue. Anyone else with AMD cards having issues like this?
----------------------------------------
Crunching for humanity since 2007!

mikey
Veteran Cruncher Joined: May 10, 2009 Post Count: 824 Status: Offline
REBOOT THE PC. Windows (I am assuming you are using Windows) has NO WAY to reset the GPU without a full restart of the PC. Linux CAN reset the GPU, but Windows can NOT!! This often fixes the problem, but if not, check for GPU driver updates, and IF you have very recently updated and are already on the most recent version, try rolling back to an older one. GPU drivers are first and foremost for gamers, and that is NOT always beneficial to us crunchers.

Bearcat
Master Cruncher USA Joined: Jan 6, 2007 Post Count: 2803 Status: Offline
I did reboot. Same issues. It hangs on a WU, I abort, it starts another, a message pops up about driver recovery, and it hangs again. I have never had an issue with this card before on any GPU project. When I get time, I will uninstall and reinstall the card drivers in the hope that this fixes it. Until then, I am out of GPU crunching.

ryan222h
Senior Cruncher Joined: Sep 4, 2006 Post Count: 425 Status: Offline
Uninstall all AMD drivers, then reboot and download and install the most recent one. How are the GPU temps? You can download GPU-Z to find out. Even if the fans are running, that doesn't mean they are cooling the right place, for example if a heatsink came loose or something. Just guessing here.

nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline
I've had 1 get stuck so far. I just suspended all tasks, including those in cache, waited about 10 seconds, and then restarted them. The stuck one started over but did finish. Don't suspend from the projects tab when doing this. Use the tasks tab and manually select them all. I see this all the time on POEM. The same solution works there too.
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.
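
For anyone who would rather script the suspend, wait, resume routine described above than click through the Tasks tab, here is a rough sketch. It assumes the `boinccmd` command-line tool that ships with BOINC is on the PATH and uses its `--task URL task_name suspend|resume` form. The function names, the project URL, and the task names are placeholders made up for illustration, and nobody in this thread has tested it against HCC GPU tasks.

```python
import subprocess
import time


def build_task_commands(project_url, task_names, op):
    """Return one boinccmd argument list per task for the given operation."""
    if op not in ("suspend", "resume"):
        raise ValueError(f"unsupported operation: {op}")
    return [["boinccmd", "--task", project_url, name, op] for name in task_names]


def bounce_tasks(project_url, task_names, pause_seconds=10, run=subprocess.run):
    """Suspend every listed task, wait, then resume them in the same order.

    `run` is injectable so the sequence can be checked without BOINC installed.
    """
    for cmd in build_task_commands(project_url, task_names, "suspend"):
        run(cmd, check=True)
    time.sleep(pause_seconds)  # the post above suggests roughly 10 seconds
    for cmd in build_task_commands(project_url, task_names, "resume"):
        run(cmd, check=True)
```

To use it, pass the project's master URL and the task names exactly as the client reports them (for example from `boinccmd --get_tasks`). Like the manual method, this works per task rather than suspending the whole project.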

mmstick
Senior Cruncher Joined: Aug 19, 2010 Post Count: 151 Status: Offline
Possible causes:
- Corrupt driver install
- Unstable overclock/voltage
- Unstable system memory
- Overheating
- Intermittent power supply issues not supplying the GPU with enough power

Also mikey159b, Windows can restart the GPU driver easily. The message of the driver being recovered is an example of that. In the past Windows would BSOD instead of restarting the driver, but it now restarts the driver. Also, GPU drivers are not first and foremost
[Edit 1 times, last edit by mmstick at Oct 13, 2012 6:52:01 PM]

Bearcat
Master Cruncher USA Joined: Jan 6, 2007 Post Count: 2803 Status: Offline
With the fan manually set at 50%, temps stayed around 70C. Yesterday I was testing different settings in CCC. I did not max out the settings but came close. Didn't like the temps, so I backed the Overdrive settings in CCC down to half. Watched it for a while and things seemed good. I have the fans on manual at 50% while crunching GPU.

When I get a chance, I will uninstall the drivers, reboot, then install them again to see if I get the same problem. Thanks for all the suggestions. Nano, will try your suggestion if it does it again. Don't think it will help though. When I saw the stuck WU, I aborted it, BOINC went to the next GPU WU, and the same thing happened again.

nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline
I found more stuck today, and it's because the driver was crashing. The driver would restart, but the task didn't unless I suspended and restarted it. Tried the newest beta driver (12.9) and the problem was worse. 12.8 was the driver that was crashing, so I rolled back to 12.6. So far so good.

mmstick
Senior Cruncher Joined: Aug 19, 2010 Post Count: 151 Status: Offline
I've been using the 12.8 driver without problems on my 6850s and 7950. It sounds like the OpenCL component of your installation was broken. Be aware that Windows tends not to upgrade the OpenCL part correctly when installing a new driver. It is good practice to manually uninstall the OpenCL package when upgrading to the next driver version.

Bearcat
Master Cruncher USA Joined: Jan 6, 2007 Post Count: 2803 Status: Offline
I downloaded 12.4, but after the install it was actually 11.2, even though the file showed 12.4. Weird. Went ahead and downloaded 12.8 and installed it. Waiting on BOINC to download some GPU WUs to see if it's fixed. Damn, CCC was hell to uninstall. It took 2 uninstalls and reboots to finally get rid of all of it. First time I've had this issue. Wish I had left in the 12.4 I had, as I never had issues with betas before. Have a feeling 12.8 is the culprit. Nano, are you using CCC? If so, are your Overdrive settings maxed out?
[Edit 1 times, last edit by Bearcat at Oct 14, 2012 12:48:01 AM]