| World Community Grid Forums
Thread Status: Active Total posts in this thread: 38
sk..
Master Cruncher Joined: Mar 22, 2007 Post Count: 2324 Status: Offline
|
These run-time-calculation/initialization problems may have been fixed in 6.12.26. I thought they might have been limited to GPU tasks, too?
Straying a bit off topic, but when posting problems it's important to include the operating system, system spec, BOINC version (especially when using a beta), and the tasks being run (both here and at other projects). It also helps to post the install type, which antivirus package you use, and any intensive programs you run. Most of this can be found on the first page of BOINC's Event Log and can easily be copied and pasted into your post. Some recent (6.12.33 or later) cc_config.xml log flag additions can also be of use in resolving or narrowing down problems, especially when testing new BOINC versions and beta-testing apps,
<http_transfer_timeout_bps> |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello KerSamson
Reference: KerSamson [Jun 13, 2011 9:19:09 AM] post
Greetings. For my ASUS M4A88T-M LE motherboard, I had BIOS_v0306, which is posted on the ASUS website with a publication (or availability) date of 2011.03.18. There is a newer version, BIOS_v0307, which I have only just seen (2011.07.05Tu.0915.UTC), with a posted publication (or availability) date of 2011.06.21. Where BIOS_v0306 is described on that website as "Improve system stability", the newer BIOS_v0307 is described as "Improve Memory compatibility".
I'll download BIOS_v0307 first, then install it, and from there I'll try out some CMD2 WUs and take the setup for a spin. I'll post the results. Thanks for the tip. Good day |
||
|
|
KerSamson
Master Cruncher Switzerland Joined: Jan 29, 2007 Post Count: 1684 Status: Offline
|
You're welcome!
---------------------------------------- Yves |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
So, I flushed the old BIOS_v0306 and then installed the latestVersion, BIOS_v0307. I then downloaded some CMD2 WUs and started crunching them. As before, I'm doing exclusiveProject per machine; in this case, all-CMD2 crunching on a 6-core machine. (I figured that exclusiveCrunching would sit well with DCF calculations.)
Some validation results are now in (asOf_2011.07.07Th.1545):
a] pendingValidation (WingWU=inProgress): 17
b] valid (WingWU=valid): 6
c] inconclusive: 1 (1stWingWU=inconclusive; 2ndWingWU=inProgress)
The above data shows only 1 inconclusive out of 7 evaluated, so the worstCase failRate at the moment should be no greater than 1/7, or about 14.29%. Acceptable? Not bad, I'd say, although I still don't want inconclusives for any of my crunched WUs.
Now, of the six (6) valid CMD2 WUs, the wingWU almost always gets higher points, regardless of whether the wingWU took a shorter or longer timeToComplete than my WU. Also, my WU almost always overclaims (by up to 2.6x) relative to the wingWU. All these things shake out overall so that my WU almost always ends up getting the short end of the quorum. Is it just the luck of the draw? WCG, please look into this. |
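For reference, the worst-case fail-rate arithmetic in the post can be checked with a short Python sketch. The function name and its parameters are hypothetical; the only data used are the counts quoted above (6 valid, 1 inconclusive, with the 17 pending results left out because they have not been evaluated yet), and the sketch assumes every inconclusive result eventually fails.

```python
from fractions import Fraction


def worst_case_fail_rate(inconclusive: int, valid: int, invalid: int = 0) -> Fraction:
    """Upper bound on the fail rate: count every inconclusive result as a failure."""
    evaluated = inconclusive + valid + invalid
    if evaluated == 0:
        raise ValueError("no evaluated results yet")
    return Fraction(inconclusive + invalid, evaluated)


# Counts from the post: 6 valid, 1 inconclusive (pending results are excluded).
rate = worst_case_fail_rate(inconclusive=1, valid=6)
print(f"{rate} = {float(rate):.2%}")  # prints "1/7 = 14.29%"
```

With only 7 evaluated results the bound is very coarse; each newly validated WU can move it by several percentage points.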
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Yes, sir, most obediently! (Soldat Schweig).
1. DCF for HCMD2 can be all over the place because of the size of the task (parent, child, grandchild, etc.), so running it as exclusive will give little improvement.
2. HCMD2 points are awarded based on the actual positions completed by each of the 2 wingmen in a quorum when either or both hit the 6 or 12 hour stop sign (that is the rule in a nutshell).
--//-- |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
David_L6:
"If the machine(s) throwing errors are overclocked, try backing off on the overclock a little or return all settings back to default for a while and see if that helps."
Good advice. One thing I have learned is that the CMD2_v6.40 WUs that I'm getting for my AMD-Ubuntu machine do not seem to do as well on overclock compared to how HFCC_v6.40, HCC_v6.40 or C4CW_v6.40 did. It is BOINC that first exhibits weirdness (flickering fonts, lines, GUI elements) before the CMD2 WU does (CMD2 progressPercentage swinging wildly, for example, from 10% to 180% every second). By the time I detect the weird BOINC display, the damage may have already been done: the inProgress CMD2 WU may already be doomed to fail the coming validation, first as an inconclusive WU and then as the fatal INVALID at the end of the WCG validation. My Intel-Vista machine seems better at handling overclocked situations, if the fewer CMD2 faults on that machine are any indication.
knreed:
"I cannot tell you why your computer is computing nan's instead of a proper value"
Beats me as well; too high an overclock is looking to be the culprit. In any case, how feasible is the idea of inserting a routine that stops all calculations if a NaN (for any reason, not just from tooHighOverclock) is encountered?
robertmiles:
"I've read that some of the 6.12.* versions of BOINC fail to initialize one of the parameters many BOINC projects use in calculating the maximum run time, and therefore often give wildly incorrect limits for the maximum run time. Therefore, anyone seeing a maximum runtime exceeded problem may want to mention which version of BOINC they saw the problem under."
Hmm... I do get the weird BOINC display almost as an alarm that a CMD2 WU is about to fail (if it has not in fact already failed), but the CMD2 calculations seem to continue instead of exiting. A NaN escapeHatch routine would have come in handy and provided a means for an automatic and elegant program exit?
"A few other BOINC projects have people reporting similar problems under some of the 6.10.* versions of BOINC also; no clear reason why identified yet, though." |
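A minimal sketch of the kind of NaN escape hatch asked about above, written in Python rather than in whatever language the CMD2 application actually uses. Everything here is hypothetical: the `NaNEncountered` exception, the `checked` guard, and the running-sum `accumulate` loop standing in for a long calculation are illustrations only and are not taken from the WCG or BOINC code.

```python
import math
import sys


class NaNEncountered(RuntimeError):
    """Raised when an intermediate result is NaN (hypothetical error type)."""


def checked(value: float, step: int) -> float:
    """Return value unchanged, or abort the computation if it is NaN."""
    if math.isnan(value):
        raise NaNEncountered(f"NaN produced at step {step}")
    return value


def accumulate(terms):
    """Hypothetical stand-in for a long-running calculation: a running sum."""
    total = 0.0
    for step, term in enumerate(terms):
        total = checked(total + term, step)
    return total


def main(terms) -> int:
    """Run the calculation and return a process-style exit status."""
    try:
        print(accumulate(terms))
        return 0
    except NaNEncountered as err:
        # Stop with a non-zero status instead of continuing to compute on garbage.
        print(f"aborting: {err}", file=sys.stderr)
        return 1
```

Calling `main([1.0, float("nan"), 2.0])` stops at the second term and returns 1, while `main([1.0, 2.0])` prints the sum and returns 0. The guard costs one extra check per step, and any such check is itself new code that would need testing.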
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Snipped pieces:
knreed:
"I cannot tell you why your computer is computing nan's instead of a proper value"
"Beats me as well; too high an overclock is looking to be the culprit. In any case, how feasible is the idea of inserting a routine that stops all calculations if a NaN (for any reason, not just from tooHighOverclock) is encountered? ... A NaN escapeHatch routine would have come in handy and provided a means for an automatic and elegant program exit?"
Going by the knreed post about the present "true" error rate for HCMD2, which I work out to be 0.2%, with the bulk of that being the zero status/no heartbeat, AND his comment that he's not planning any action, I'd say a 0.0 chance on any scale. It's a first [unique] observation, so it would not be tested, and adding code to capture such a state, in initial development or now, would only introduce a new vector of program failure. Also, we're just 6 months away from finishing this multi-year project, so I'm not expecting the horse to get a hoof check so close to the finish line. Overclocking is not often mentioned, but sciences are tested and developed on stock-configured devices. That's all there's time for, I'd think.
--//-- |
||
|
|
Mysteron347
Senior Cruncher Australia Joined: Apr 28, 2007 Post Count: 179 Status: Offline
|
"A NaN escapeHatch routine would have come in handy and provided a means for an automatic and elegant program exit?"
You'd not be alone in looking for an escape hatch from the Nanny state... |
||
|
|
|