| World Community Grid Forums
Thread Status: Active Total posts in this thread: 8
|
|
| Author |
|
|
Country Bumkin
Cruncher Australia Joined: May 14, 2008 Post Count: 14 Status: Offline Project Badges:
|
I have been finding that the CPU benchmark disrupts GPU processing. When the benchmark runs (whether triggered automatically or manually), all jobs are suspended. When the benchmark has completed, all jobs appear to resume progress, but when the GPU job gets to 49.707% or 99.707% the normal short pause continues indefinitely. If the system is rebooted and the frozen job was at 99.707%, it completes as soon as BOINC starts and the result files are uploaded. If I manually suspend and resume the job, it falls back to the last checkpoint and then processes to a normal completion.
If I use app_config to run more GPU tasks, all of them are permanently paused by the CPU benchmark. I have not found any log or error messages that suggest a problem. CPU jobs are all FAAH.
Operating system: Linux Mint Maya 13, 64-bit
Mon 31 Dec 2012 13:35:48 EST||Starting BOINC client version 7.0.42 for x86_64-pc-linux-gnu
app_config is.... <app_config>
Regards C Bumkin
----------------------------------------[Edit 1 times, last edit by Country Bumkin at Dec 31, 2012 5:29:32 AM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello Country Bumkin,
Interesting. I have reported this to the staff. Lawrence |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
About two hours ago, I watched a CPU benchmark initiated by BOINC v7.0.42 unfold before my eyes. Two HCC v7.08 GPU WUs were running. When the benchmark completed, the WU that was at (or got caught at) 99.707% showed nothing in the BOINC remaining-time column and appeared frozen, while the one that was below 99.707% crunched along seemingly unaffected and completed successfully. The frozen WU's elapsed time kept incrementing by the second. I gave that WU some time, and when it reached 30 minutes of elapsed time I opted to suspend and then resume it. On resumption, the frozen WU grabbed back the thread it had released at suspension from another HCC v7.08 GPU WU, which was then marked 'Waiting to run'. The resumed WU then ran back to the start and eventually completed its run.
This is another instance calling attention to the need for a mechanism to handle interrupted GPU WUs. One method is for interrupting processes to wait until all running GPU WUs have completed, while temporarily holding off all Ready-to-Start GPU WUs from running. This method would also protect GPU WUs from high-priority GPU WUs.
andzgridPost#782 |
||
|
|
stoneageman
Advanced Cruncher UK Joined: Nov 21, 2005 Post Count: 104 Status: Offline Project Badges:
|
I've been using the BoincTasks rules feature to monitor for overrunning WUs and then perform a snooze for a few seconds, which restarts them all properly. I had not realised it was the benchmark causing this problem.
---------------------------------------- so to CB |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
For those that actually read manuals to see if there's a working solution: Insert in the <options> section of the cc_config.xml the following line:
<skip_cpu_benchmarks>1</skip_cpu_benchmarks>
This will forever postpone any periodic retesting of performance; in fact, even the manual benchmark test will not execute (MS would call that possible bug a feature). Benchmarks are a waste of time: WCG and any other project that has adopted server 700 and up ignores [CPU] benchmark-based claims anyhow. Claims are computed from returned-work throughput statistics, then adjusted a second time, e.g. for a quorum-2 science, by whatever standard or legacy rules WCG has put in place.
Benchmarks used to run every 120 hours. As of some 6.xx client version, they run only when upgrading or restarting, and then only if the last one was more than 120 hours ago, grosso modo. I am not tracking the rules that force them since, as said, it is to me a redundant legacy function only of use to projects pre-server 700, and on top of that many of those don't give a dang anyhow... they make up their own credit awards. |
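For anyone unsure where that line goes, here is a minimal sketch of a complete cc_config.xml, assuming the file has no other options set (if yours already has an <options> section, just add the one line to it rather than replacing the file):

```xml
<cc_config>
  <options>
    <!-- Assumption: no other options are in use; keep any you already have. -->
    <!-- 1 = never run the CPU benchmarks -->
    <skip_cpu_benchmarks>1</skip_cpu_benchmarks>
  </options>
</cc_config>
```

The file lives in the BOINC data directory. Restart the client, or have it re-read its config files, for the change to take effect.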
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"For those that actually read manuals to see if there's a working solution: ..."
That's a nice way to introduce your ideas... Or is it?
andzgridPost#791 |
||
|
|
armstrdj
Former World Community Grid Tech Joined: Oct 21, 2004 Post Count: 695 Status: Offline Project Badges:
|
I will look into this. For anyone experiencing this issue, is it on both Windows and Linux?
Thanks, armstrdj |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"For those that actually read manuals to see if there's a working solution: ..."
"That's a nice way to introduce your ideas... Or is it?" (andzgridPost#791)
It's a working function of BOINC, for those who understand and are interested in what's already on offer, *not* an introduction of my ideas to go over-head. |
||
|
|
|