| World Community Grid Forums
Thread Status: Active Total posts in this thread: 161
|
|
| Author |
|
|
twilyth
Master Cruncher US Joined: Mar 30, 2007 Post Count: 2130 Status: Offline Project Badges:
|
I have a similar problem with one rig that is running a PCI 3 card in a PCI 2 slot. If the screen saver comes on, I get an error saying that the GPU is no longer working or is missing. I also get the problem if I log in remotely and minimize the screen.

Finally, what I did was start BOINC on that machine as follows (the quotes are needed because the path contains a space):

"c:\program files\boinc\boinc.exe" --allow_remote_gui_rpc

Then I went to the ProgramData directory and found the gui_rpc_auth.cfg file. The string in there is the BOINC password. You can change this to something else, I think. I just copied it. Then, back on the remote host, I logged in from BOINC Manager with that password and everything is working fine. I do have to run 2 BMs to see everything, but that's fine.

I just had to set the power options on the target machine so that there was no screensaver and the monitor never powers down. That's OK since it's on a KVM anyway, and switching back and forth on that doesn't seem to matter.

[Edit 1 times, last edit by twilyth at Nov 25, 2012 11:52:31 PM] |
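For anyone who would rather script the remote check than open a second BOINC Manager, here is a minimal Python sketch of the same idea. It reads the password from gui_rpc_auth.cfg and builds a boinccmd command line for the remote client. The host name and the file location in the comments are assumptions for illustration only, and the helper names are made up for this example; adjust them to your own setup.

```python
from pathlib import Path


def load_rpc_password(auth_file):
    """Return the GUI RPC password stored in a gui_rpc_auth.cfg file."""
    # The file holds the password as a single string; strip the newline.
    return Path(auth_file).read_text().strip()


def boinccmd_args(host, password, command="--get_tasks"):
    """Build the argument list for one boinccmd call against a remote client."""
    return ["boinccmd", "--host", host, "--passwd", password, command]


# Example use (hypothetical host name and path, not taken from the thread):
#   import subprocess
#   pw = load_rpc_password(r"C:\ProgramData\BOINC\gui_rpc_auth.cfg")
#   subprocess.run(boinccmd_args("gpu-rig", pw), check=True)
```

The target machine still has to be started with --allow_remote_gui_rpc, as described above, or the connection will be refused.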
||
|
|
pfm3136
Cruncher Joined: Apr 11, 2010 Post Count: 13 Status: Offline Project Badges:
|
"Uplinger, 2 machines running GPU tasks, both had the reg fix applied and rebooted as per your instructions. The registry fix hasn't solved the problem unfortunately. The machines have multiple GPUs configured with app_info.xml and stuck units aren't limited to any particular GPU. stderr doesn't appear to show anything out of the ordinary and if the units are suspended then resumed, they complete within a few minutes (compared to the hours they have been sitting idle). BOINCTasks sometimes shows high CPU activity which I presume means that image one is completing. Is there anything else I can check? Also, do you have a link for the Microsoft fix to see if there are any more pointers? Thanks again for your time."

I had the same problem. I don't know if it applies to you, but disable CrossFire/SLI and it should be OK. That reg fix for the watchdog is for another problem (you would see the driver crashing and recovering, which would "kill" all the work units on the affected GPU). Unfortunately, it doesn't work on Windows 8. |
||
|
|
coolstream
Senior Cruncher SCOTLAND Joined: Nov 8, 2005 Post Count: 475 Status: Offline Project Badges:
|
Thanks for all the input, guys. A lot of the suggestions were not relevant because the suggested power management settings were the same ones I was already using. However, I have been testing them to see if any one in particular might make a noticeable difference. I have now set both machines to use non-Aero and will check again later to see if I get any more stuck units.

@ pfm3136, although I have multiple GPUs in each machine, I'm not using CrossFire, so that one can be easily ruled out.

@ twilyth, interesting observations about remote connections. I had to stop using remote desktop on the GPU machines for this very reason. That was in the time of single image units, but I never got a situation like this until the advent of dual image units, so I'm almost sure that it's not the culprit here.

Re gui_rpc_auth.cfg and monitoring: instead of using two BMs you might want to consider BoincTasks, which lets you monitor multiple machines in one interface. It is highly configurable and can make life so much easier. http://www.efmer.eu/boinc/boinc_tasks/

Thanks again to you all. It's comforting to know that I'm not the only one, even though it's so annoying because I never had these issues before the WUs changed from single units.

Crunching in memory of my Mum PEGGY, cousin ROPPA and Aunt AUDREY. |
||
|
|
rilian
Veteran Cruncher Ukraine - we rule! Joined: Jun 17, 2007 Post Count: 1460 Status: Offline Project Badges:
|
Hi - I see HCC now has 111 days before the end -- http://i137.photobucket.com/albums/q210/Sekerob/WCGYearsPi1Project.png

Is there any official announcement that more tasks are being added to the GRID? Thanks! |
||
|
|
Zigfried
Senior Cruncher Brazil Joined: Dec 12, 2005 Post Count: 368 Status: Offline Project Badges:
|
This project ran out of WUs for 2 or 3 days. That chart shows the time based on the average number of work units per day, and those days without work made a little mess of it. In a few days it will be OK. But I don't know why it is shown as PAUSED. |
||
|
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1404 Status: Offline Project Badges:
|
"This project ran out of WUs for 2 or 3 days. That chart shows the time based on the average number of work units per day, and those days without work made a little mess of it. In a few days it will be OK. But I don't know why it is shown as PAUSED."

Only a little mess! If I recall Sekerob's explanation correctly: the highest and lowest values of the last 4 weeks/30 days are discarded to calculate the estimated lifetime. So one of the 'empty' days doesn't count. But it will surely be less than 111 days. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
It's a mess, because the double image switch got the double whammy effect from the few days of no new work shortly after. The "trimmed mean" works on a shorter period, so now it's 124 days ;P and "Paused" is now "Normal" [the old intertube cache view] :D It takes 10 days to get back to a "clean" average again. At the moment HCC is overweight [in total runtime] http://bit.ly/WCGQLK , so it might get a downward share adjustment [measured as a share of the total results processed daily for WCG].

Trimmed Mean in MS Office works by specifying the percentage of high/low values you want to exclude from the average. If the object is to remove only the single highest and lowest of the last 30 days, the percent factor to dismiss is 2/30th, i.e. 0.067. Enter 0.134 and it will dismiss the two highest and two lowest, et cetera et cetera. At any rate, I'm not going to touch the algorithms, with the exception that from an arbitrary point I've assumed all results to be double image [to get back to the total completed images as a fraction of the overall estimated images that were guessed to be there at the last word of "so much added to..."].

Anyway, whenever Viktors has word on new estimates [he's the man doing the project duration planning], we'll surely hear. I still think this project will go intermittent at some point. There are always new sets of crystals being generated somewhere on the globe that can ** make use of the HCC system.

** edit: and have been making use of the HCC gateway to the grid's processing power.

[Edit 1 times, last edit by Former Member at Nov 26, 2012 4:08:44 PM] |
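To make the trimmed-mean arithmetic above concrete, here is a small Python sketch that mimics the spreadsheet behaviour described in the post: a percent factor decides how many values are dropped, split evenly between the high and low ends. This is only an illustration and not the actual chart calculation. The function names, the sample daily figures and the "remaining results" number are all made up for the example.

```python
import math


def trimmed_mean(values, percent):
    """Average of values after dropping `percent` of them, half from each end."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= percent < 1:
        raise ValueError("percent must be in [0, 1)")
    n = len(values)
    # Same count is dropped from each end, rounded down.
    drop_each_end = math.floor(n * percent / 2)
    kept = sorted(values)[drop_each_end:n - drop_each_end]
    return sum(kept) / len(kept)


def estimated_days_left(remaining_results, daily_results, percent=0.067):
    """Hypothetical lifetime estimate: remaining work / trimmed daily average."""
    return remaining_results / trimmed_mean(daily_results, percent)


# 30 invented days: one empty day and one spike among otherwise steady output.
daily = [1000] * 28 + [0, 5000]
print(trimmed_mean(daily, 0.067))           # drops the 0 and the 5000
print(estimated_days_left(124000, daily))   # invented remaining total
```

With 30 values, 0.067 drops one value from each end and 0.134 drops two from each end, which matches the 2/30th reasoning above.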
||
|
|
Hypernova
Master Cruncher Audaces Fortuna Juvat ! Vaud - Switzerland Joined: Dec 16, 2008 Post Count: 1908 Status: Offline Project Badges:
|
"Hi - I see HCC now has 111 days before the end -- http://i137.photobucket.com/albums/q210/Sekerob/WCGYearsPi1Project.png Is there any official announcement that more tasks are being added to the GRID? Thanks!"

Probably less than that if everybody starts to add more and more GPUs. |
||
|
|
twilyth
Master Cruncher US Joined: Mar 30, 2007 Post Count: 2130 Status: Offline Project Badges:
|
Coolstream: Thanks for the info on BoincTasks, but I don't like to use anything that isn't BOINC supported. Lots of nice utilities come and go. What happens is you become dependent upon them, and then one day they don't work any more and no one wants to step up and support them. |
||
|
|
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline Project Badges:
|
"Coolstream: Thanks for the info on BoincTasks, but I don't like to use anything that isn't BOINC supported. Lots of nice utilities come and go. What happens is you become dependent upon them, and then one day they don't work any more and no one wants to step up and support them."

I doubt BoincTasks is going anywhere for a long time. The developer is always looking for feedback to make it better. What can you lose by giving it a try? It's a must if you run multiple machines, IMHO.

In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date. |
||
|
|
|