World Community Grid Forums
Category: Beta Testing Forum: Beta Test Support Forum Thread: Beta Test for Help Conquer Cancer - GPU v6.51 |
Thread Status: Active Total posts in this thread: 63
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline Project Badges: |
@[HWU]Melkor:"What should I have to paste here?"
Generally, only post full details of an error if nobody has already described the same sort of error. In that case, describe what went wrong, and you might copy and paste the abnormal parts of the Results Status log. If your type of error has already been reported, just state the type of error, e.g. "hung at 99.415%".

Reading armstrdj's post again, in this thread we were asked to paste in the last few lines of the stderr.txt file in the error WU's slot directory. Sorry I misled you by suggesting you look at the Results Status log. The slot directory is cleared when the WU ends, so your stderr.txt is lost. Only the log in your WCG Results Status remains.

You report having the default 50% throttle setting, so I think that your error happened because BOINC currently can't restart GPU WUs after they have paused during throttling. Your BOINC version etc. is useful info because, by examining a number of these situations, the techs may discover another factor that needs to be present to trigger the error. You have already given enough details, I think. |
||
|
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline Project Badges: |
Another issue with these beta tasks has been reported in the GPU Support Forum :
"BOINC gone crazy after completing BETA tasks"

After running GPU tasks, BOINC can go into a mode where it seems to ignore your work cache setting and downloads new tasks until the server estimates that your machine can't complete them by the 10-day limit. Someone might be working on a fix.

[Edit]: Afterthought - this may be a phenomenon rather than a bug. Let's imagine that the WUs came out with their CPU time estimates the same as for normal CPU HCC tasks - say 1 hr - and they ran in, say, 5 min, i.e. the estimates were out by a factor of 12. Your BOINC client would think your machine was on steroids, and if your work cache was set to 1 day and no more GPU tasks were available it would try to download 12 days' CPU work. But there are lots of "if"s I don't know about.

[Edit 2 times, last edit by Rickjb at Sep 24, 2012 4:05:22 AM] |
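The arithmetic in that thought experiment can be written out as a minimal sketch. It assumes, as the post does, that the client simply scales its work request by how much faster tasks ran than estimated; this is not BOINC's actual work-fetch logic, and the function name is hypothetical.

```python
def work_fetched_days(cache_days, estimated_s, actual_s):
    """Days of real CPU work queued under the post's simplified assumption.

    Assumption: the client believes the machine is (estimated / actual)
    times faster than it really is and scales its request accordingly.
    """
    apparent_speedup = estimated_s / actual_s
    return cache_days * apparent_speedup


# Numbers from the post: 1 hr estimate, 5 min actual run time, 1-day cache.
factor = 3600 / 300
days = work_fetched_days(cache_days=1, estimated_s=3600, actual_s=300)
print(f"estimates off by a factor of {factor:g}, {days:g} days of work fetched")
```

With a 10-day deadline, 12 days of queued work would already be more than the machine could finish on time, which is consistent with downloads only stopping at the deadline limit.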
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Quote:
Let's imagine that the WUs came out with their CPU time estimates the same as for normal CPU HCC tasks - say 1 hr - and they ran in, say, 5 min, i.e. the estimates were out by a factor of 12. Your BOINC client would think your machine was on steroids, and if your work cache was set to 1 day and no more GPU tasks were available it would try to download 12 days' CPU work.

Looks like there needs to be a separate WU cache for GPU-based WUs so that the cache-size calculations are independent of one another. That is: CPU-based WUs interact only with the cache for CPU-based WUs, while GPU-based WUs interact only with the cache for GPU-based WUs. |
||
|
RetiredTech
Advanced Cruncher Canada Joined: Feb 2, 2012 Post Count: 91 Status: Offline Project Badges: |
I recently received a WU for this BETA and it hung at 99.415%. Suspending and rebooting did nothing. I moved NVIDIA PhysX to CPU and rebooted. This time the WU completed.
My settings are "use at most 80% CPU" and "use always". |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Quote:
Another issue with these beta tasks has been reported in the GPU Support Forum: "BOINC gone crazy after completing BETA tasks". After running GPU tasks, BOINC can go into a mode where it seems to ignore your work cache setting and downloads new tasks until the server estimates that your machine can't complete them by the 10-day limit. Someone might be working on a fix. [Edit]: Afterthought - this may be a phenomenon rather than a bug. Let's imagine that the WUs came out with their CPU time estimates the same as for normal CPU HCC tasks - say 1 hr - and they ran in, say, 5 min, i.e. the estimates were out by a factor of 12. Your BOINC client would think your machine was on steroids, and if your work cache was set to 1 day and no more GPU tasks were available it would try to download 12 days' CPU work. But there are lots of "if"s I don't know about.

Though there is as yet no separate buffer control for CPU and GPU [no one has made a compelling case to the developers, AFAICR, for such user-configurable settings], GPU and CPU have separate schedulers. Knreed posted that he'd fixed some bug, working with Berkeley to stop this occurring, but seemingly...

At any rate, my latest 7 test client still has a single DCF per project [i.e., WCG is a single project to BOINC], and the <dont_use_dcf/> entry is there but not operational. The choice a number of months ago *was* not to use that feature [disabling the DCF, in effect]. They may or may not be related.

As long as the run time estimates are set right for GPU Beta, this should not affect the CPU side. Moreover, it's still Beta, so there is no cross-feeding of run-time averages between GPU and CPU; I don't expect there to be, since it is a separate feed. This all has to be learned on the go, as this is a first for WCG. |
||
|
Ingleside
Veteran Cruncher Norway Joined: Nov 19, 2005 Post Count: 974 Status: Offline Project Badges: |
Quote:
At any rate, my latest 7 test client still has a single DCF per project [i.e., WCG is a single project to BOINC], and the <dont_use_dcf/> entry is there but not operational.

Well, funny how DCF for WCG has been stuck at 1.000000 for some time now if <dont_use_dcf/> isn't working...

Just looking at the tasks that are "ready to report", with CPU times between 14165.25 and 22219.69 seconds, it's strange that DCF hasn't been changed by any of these when all of them had an fpops_est of 54028909110941.

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might." |
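To see why an unchanged DCF looks odd here, a small sketch using only the figures quoted above. It assumes that tasks with an identical fpops_est all receive the same uncorrected run-time estimate on a given host; the function names are hypothetical and this is not BOINC code.

```python
FPOPS_EST = 54028909110941           # identical for every task, per the post
CPU_TIMES_S = (14165.25, 22219.69)   # shortest and longest reported CPU times


def implied_flops(fpops_est, cpu_time_s):
    """Speed the host would need for the estimate to match this run exactly."""
    return fpops_est / cpu_time_s


def ratio_spread(cpu_times_s):
    """How far apart the longest and shortest runs are, as a ratio."""
    return max(cpu_times_s) / min(cpu_times_s)


for t in CPU_TIMES_S:
    print(f"{t:>10.2f} s -> {implied_flops(FPOPS_EST, t) / 1e9:.2f} GFLOPS implied")
print(f"spread between longest and shortest run: {ratio_spread(CPU_TIMES_S):.2f}x")
```

Since the runs differ by more than a factor of 1.5 while sharing one estimate, at most one of them could have matched that estimate, so a correction factor that was still being updated would be expected to move away from exactly 1.000000.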
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
SekeRob, you mentioned in your other posts that the DCF is now on the server. Now, is that a DCF for CPU-based WUs? We may need a separate DCF for GPU-based WUs, operating on a cache for GPU-based WUs. I can imagine further difficulties if GPUs are not given their own space, so to speak, and are forced to be "just another CPU". Already, the over-downloading of CPU-based WUs is apparently caused by the servers misreading a reported, completed GPU-based WU as reflective of a client's CPU performance. This may have triggered a server command to fill the client's cache based on a DCF that is supposedly for CPU-based WUs only, but which the performance of GPU-based WUs may have modified. |
||
|
Ingleside
Veteran Cruncher Norway Joined: Nov 19, 2005 Post Count: 974 Status: Offline Project Badges: |
Quote:
SekeRob, you mentioned in your other posts that the DCF is now on the server. Now, is that a DCF for CPU-based WUs?

The server keeps track of various parameters for each individual combination of computer, app version and plan class. These parameters are used to give "correct" run-time estimates, but they're not used before there are 10 valid tasks for a particular combination of computer, app version and plan class.

Meaning, even if GPU tasks are originally estimated to take 1 hour but take only 5 minutes, this factor-of-12 discrepancy will not influence the estimates for CPU tasks for v7.0.28 and later clients. v6 clients, on the other hand, can be completely overcommitted with CPU work due to the incorrect GPU estimates.

Well, the GPU will still have some effect in v7: if, for example, a quad-core has a 3-day cache and then gets a string of GPU work, it will "lose" one CPU core, which means the CPU work will now take 4 days.

There have also been various improvements and bug fixes to the BOINC client, among them one concerning estimates:
• client: fix error in runtime estimation for active tasks.

This change is in v7.0.34 and later, so while it's unclear whether it could cause excess CPU cache, it's still a good idea to upgrade to v7.0.36.

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might." |
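A toy model of the two points in that reply, as a rough sketch only. It assumes statistics are kept per (computer, app version, plan class) key and ignored until 10 valid tasks exist for that key, as described above; the class, function names, host name and plan-class strings are all hypothetical and do not reflect the real server code.

```python
MIN_VALID_TASKS = 10  # threshold quoted in the post


class EstimateStats:
    """Run-time statistics keyed by (host, app_version, plan_class)."""

    def __init__(self):
        self._runtimes = {}

    def record_valid(self, host, app_version, plan_class, runtime_s):
        key = (host, app_version, plan_class)
        self._runtimes.setdefault(key, []).append(runtime_s)

    def estimate(self, host, app_version, plan_class, default_s):
        runs = self._runtimes.get((host, app_version, plan_class), [])
        if len(runs) < MIN_VALID_TASKS:
            return default_s            # too few valid tasks: keep default
        return sum(runs) / len(runs)    # otherwise use the observed average


def days_to_finish(cached_days, total_cores, cores_left_for_cpu_work):
    """Wall-clock days needed once some cores are tied up feeding the GPU."""
    return cached_days * total_cores / cores_left_for_cpu_work


stats = EstimateStats()
for _ in range(10):
    stats.record_valid("host1", "6.51", "gpu_example", 300)  # 5-minute GPU runs

print(stats.estimate("host1", "6.51", "gpu_example", default_s=3600))  # GPU key
print(stats.estimate("host1", "6.51", "cpu_example", default_s=3600))  # CPU key
print(days_to_finish(cached_days=3, total_cores=4, cores_left_for_cpu_work=3))
```

Because the GPU runs land under their own key, the CPU estimate stays at its default, while the quad-core example still stretches a 3-day cache to 4 days once one core is lost to GPU work.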
||
|
katoda
Senior Cruncher Poland Joined: Apr 28, 2007 Post Count: 170 Status: Offline Project Badges: |
Looks like this wave of beta tasks is finished, so just my two cents: I got 4 beta tasks on one of my machines and the first of them was stuck at 99.415%. The machine is a notebook with Windows 7 64-bit and an NVIDIA 540M GPU, and to avoid overheating, throttle is on - use max 50% of CPU time.

I turned throttle off, restarted BOINC and all beta tasks finished successfully. I observed that during computation, both the CPU core and the GPU were used at 100%, despite the throttle settings. I hope that this issue will be solved quickly - I think that a lot of people use this setting because of heat and noise, and using 100% of a notebook CPU is not an option for me.

EDIT: just to avoid confusion, "issue" means hung workunits, not 100% CPU utilisation :)

[Edit 1 times, last edit by katoda at Sep 27, 2012 8:21:50 AM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The 4 betas I had were either completed with error or were stuck at 99.415% complete (so I aborted them). Hopefully the next betas will run successfully. I did not go to the task bar to see the percentage of CPU/GPU usage. Also, on that machine I am running Vista.
|
||
|
|