World Community Grid Forums
Thread Status: Locked Total posts in this thread: 53
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
As many of you are aware, there are a few "optimised" versions of BOINC available. You may wonder why we don't all rush out and use the fastest possible client.
The truth is, the science isn't done by BOINC. Reputable optimised clients come with optimised science applications (Truxoft, for example). Others are just designed to inflate the benchmarks and increase points. Even the reputable clients can be used in this way if they are misconfigured. Those that care about keeping score consider this to be cheating. Let's face it: it is cheating.
The good news is, WCG and BOINC have safety features built in. For example, the high and low claims in a quorum are thrown away (Rosetta@Home doesn't do this, since they can't use the quorum system). This means that the high claim (from the cheater) is discarded. The cheater gets the average, median credit just like the other people in the quorum.
So, can anyone list a few optimised clients and explain how they can be configured to produce a fair benchmark for WCG projects? I think this will avoid ruffling the feathers of those few who feel strongly about a fair credit system, and help inform those who mistakenly believe that trying to game the system is a good thing.
Also, should anyone have any new ideas about a fair credit system, please share. Remember, though, that we have had this conversation before.
And a word to the overclockers: you deserve credit for every extra work unit you squeeze out of your boxen. You do not deserve extra credit merely for having a fancy cooling system and go-faster stripes. :-p
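To make the quorum safeguard described above concrete, here is a minimal sketch. It assumes the granted credit is the average of whatever claims remain once the highest and lowest are dropped, which for a quorum of three is simply the middle claim. The function name `granted_credit` and the sample numbers are illustrative only and are not taken from WCG or BOINC code.

```python
from statistics import mean


def granted_credit(claims):
    """Credit granted to every member of a quorum.

    Assumption: the highest and lowest claims are thrown away and the
    remaining claims are averaged. With three claims only the middle
    (median) claim survives.
    """
    if len(claims) < 3:
        raise ValueError("need at least three claims to drop high and low")
    trimmed = sorted(claims)[1:-1]  # drop the lowest and the highest claim
    return mean(trimmed)


# Two honest claims plus one inflated claim from a misconfigured client.
quorum = [34.0, 36.5, 2000.0]
print(granted_credit(quorum))
```

In this sketch the inflated claim never reaches the average, so the inflating client is paid the same as everyone else in the quorum.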
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"The good news is, WCG and BOINC have safety features built in. For example, the high and low claims in a quorum are thrown away (Rosetta@Home doesn't do this, since they can't use the quorum system). This means that the high claim (from the cheater) is discarded. The cheater gets the average, median credit just like the other people in the quorum."
I feel the quorum of 3 is a very effective measure against cheating. It does not stop every cheater but it's good enough. However, I think if a quorum of 2 is all that is necessary to validate results with an accuracy acceptable to the research team then it is a dreadful waste of CPU power to use a quorum of 3 just to stop cheating. There is a better way to stop cheaters.
"So, can anyone list a few optimised clients and explain how they can be configured to produce a fair benchmark for WCG projects? I think this will avoid ruffling the feathers of those few who feel strongly about a fair credit system, and help inform those who mistakenly believe that trying to game the system is a good thing."
Yes, I can list a few optimised clients as well as calibrating clients and tell you how they can be configured to be fair. I can also tell you how to configure them to cheat. But I won't say anything because I want the current benchmark based system to be thrown out as soon as possible. Recommending and promoting optimised clients (or even calibrating clients for that matter) is an influence that tends to delay the death of that stupid, useless, ill conceived and poorly implemented dinosaur called the benchmark based credit system.
"Also, should anyone have any new ideas about a fair credit system, please share."
It is evident that someone (either the WCG techs or the research team) can make fairly accurate estimates of the FLOPs required to crunch the WUs. Let them categorise the WUs into 2 or 3 or 4 different sizes (e.g. small, medium and large) and award a fixed number of credits for each size.
Obviously this would not be 100% accurate but neither is any other proposed system. All proposed credit systems, including the current benchmark based system, rely on the notion that inaccuracies will tend to average out over a large number of WUs. The system I am suggesting will average out the inaccuracy of FLOP estimation as well as any other system. And it is OS independent.
"Remember, though, that we have had this conversation before."
Indeed we have, and I believe one of the facts that came out of that discussion was that client-side FLOP counting, though it seems like a good idea, would take a huge amount of work to implement, might turn out to be very inaccurate and may introduce errors into the results. I would also add that client-side FLOP counting, if it is implemented, will surely be cracked and then the cheating will start all over again. Furthermore, it is OS dependent. So let's forget about that idea once and for all.
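The fixed-credit proposal above could be sketched roughly as follows. Everything numeric here is an assumption made up for illustration: the size boundaries in `SIZE_BOUNDS`, the class names and the credit values in `FIXED_CREDIT` come from neither WCG nor any other project. The point is only that the award depends on the server-side size estimate and on nothing the client reports.

```python
import bisect

# Hypothetical size classes, defined by server-side FLOP estimates.
# small < 1e12 <= medium < 5e12 <= large (illustrative numbers only)
SIZE_BOUNDS = [1e12, 5e12]
SIZE_NAMES = ["small", "medium", "large"]

# Hypothetical fixed award per class.
FIXED_CREDIT = {"small": 20.0, "medium": 60.0, "large": 150.0}


def size_class(estimated_flops):
    """Map a server-side FLOP estimate for a WU onto a size class."""
    return SIZE_NAMES[bisect.bisect_right(SIZE_BOUNDS, estimated_flops)]


def fixed_credit(estimated_flops):
    """Credit for a WU; client benchmarks and the client OS play no part."""
    return FIXED_CREDIT[size_class(estimated_flops)]


for flops in (5e11, 3e12, 9e12):
    print(size_class(flops), fixed_credit(flops))
```

Individual WUs near a boundary will be over- or under-paid, which is the inaccuracy the post expects to average out over many WUs.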
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Didactylos,
Rosetta has discarded the method and reverted to a BY-WU predetermined credit. TANPAKU (Alpha) is about to discard the 'first come sets claim height for rest' method, which leaves few or no projects that allow inflation when there is no quorum >1 rule. I think the ideas were already voiced in other threads like the one you linked to, so I refer to those. ciao
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 2 times, last edit by Sekerob at Sep 6, 2006 7:12:53 AM]
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Rosetta discarded the system they recently adopted? And what is a BY-WU predetermined credit?
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Dagorath, ......visit Rosetta and grab a decaf.
[Edit 1 times, last edit by Sekerob at Sep 6, 2006 7:41:50 AM]
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
FLOP counting was my suggestion, and I think I gave the cons as well as the pros. Despite the disadvantages, I see it as the only totally fair method. But for it to work, you still need a quorum. This allows you to include the FLOP count in the validation process, and discard any fake claims. This will actually give ZERO to cheaters. While there may be a few ways they could game the system to get a rare massive score (if all three quorum members are in on the game), the frequent "nil points" would discourage that.
Adding a FLOP estimation is difficult, but not impossible - although it is going to depend entirely on the project. It's not a task for code monkeys, but fortunately WCG employ nothing but code poets. I'm convinced it is possible. Not by counting every instruction, you understand, but by counting discrete chunks or function points.
For example, suppose Rosetta has (I'm completely making this up) a test for total folding energy that it runs in 10 different positions for each flex of the protein... and each arm is flexed until it meets a threshold or until a maximum is reached. There will be a counter already, to prevent it from going over the maximum, so you just multiply them all together to get an estimate. This is textbook stuff.
Once you have an estimate, you can validate it by correlating the estimate with the actual CPU time on a known configuration. Finally, you can convert it into Cobblestones (actually a completely precise conversion, since the Cobblestone is based on a theoretical ideal reference computer performing at 1000 MIPS).
And on the quorum theme: the quorum is guaranteed at 3 by WCG for the project scientists' benefit. This is purely about scientific rigour, not about "acceptable accuracy".
Without a way of validating the FLOP claim, I see no hope for quorum-less projects. A fixed credit eliminates credit-hounds, but not cheats. Cheaters can send in any old garbage in exchange for points. Sad but true.
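A rough sketch of the chunk-counting idea in this post, under stated assumptions. The counters `flexes_done` and `positions_per_flex` and the per-test cost `flops_per_energy_test` are hypothetical stand-ins for whatever a real science application already tracks. The conversion constants (a reference machine at 1e9 operations per second earning 100 credits per day) are assumptions kept as parameters so they can be replaced with the project's real figures. The 5% tolerance in `award` is likewise invented for illustration.

```python
from statistics import median

SECONDS_PER_DAY = 86_400


def estimate_flops(flexes_done, positions_per_flex, flops_per_energy_test):
    """Estimate work from loop counters the app already keeps,
    rather than by counting every instruction."""
    return flexes_done * positions_per_flex * flops_per_energy_test


def flops_to_credit(flops, reference_ops_per_sec=1e9,
                    credit_per_reference_day=100.0):
    """Convert a FLOP estimate to credit via an assumed reference computer."""
    reference_days = flops / (reference_ops_per_sec * SECONDS_PER_DAY)
    return reference_days * credit_per_reference_day


def award(flop_claims, tolerance=0.05):
    """Validate FLOP claims within a quorum.

    A claim further than `tolerance` (relative) from the quorum median
    earns zero; every other member is paid for the median estimate.
    """
    agreed = median(flop_claims)
    paid = flops_to_credit(agreed)
    return [paid if abs(claim - agreed) <= tolerance * agreed else 0.0
            for claim in flop_claims]


honest = estimate_flops(flexes_done=500, positions_per_flex=10,
                        flops_per_energy_test=2e6)
print(award([honest, honest, honest * 500]))
```

If all three quorum members collude, the median itself is fake and this check passes, which is the rare loophole the post acknowledges.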
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Didactylos,
Perhaps I misinterpreted your post(s) in that previous thread regarding client-side FLOP counting. I gathered you were not in favor of it. Thank you for opening the discussion again. I've never been thrilled with fixed credits but I have supported it just because it seemed better than anything else including client-side FLOP counting. Now your explanation of how FLOP counting could work speaks to me in terms I can understand and evaluate. I think it will work!! If the code poets at WCG can find the time to incorporate FLOP counting into the science app(s) (or would it fold into BOINC core?) then I'll be more than happy to help test it. If it works it will be far better than fixed credits. |
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Okay, I took the bull by the horns, hooked up and processed a few at R@H. The end result was that my machine, always stock-benchmarked on stock WOS/hardware and now on 5.6.0, claimed only fractionally less than the credit awarded. It's next to exact money for work performed, however fast done... with a small tip.
----------------------------------------
I researched the new R@H BY-WU credit / sliding averaging method. The algorithm they've come up with is close to non plus ultra, and better than the lowest/highest-out method of WCG. Why is it that virtually always when crunching HDC I get equal or more, sometimes significantly more (we know the various explanations), and less on FAAH? I think the latter quorum is often determined by lesser-spec'd machines, which are much more easily impacted during benchmarking when other activity is going on at the same time.
So before, I said I had nothing to add. Well, why reinvent the wheel if a solution is on offer? Talk to Dr. Baker... Mr PF. This is barring, of course, the case where the FAAH crunches (particularly as a longer-term project) or HDC's are fundamentally different in build; then FLOP counting is maybe the better thing.
An important ingredient for those with Linux, whose Whetstones are, for reasons unknown to me, drastically lower on benchmark: the R@H system does not favour any OS and, from testimony over there, the Torvalds fans are happier than before.
"But the overall general results I have seen is that (with the new credit system) the performance per core per clock-frequency is similar enough to not say that Windows or Linux is significantly different. As tralala pointed out (and I have in another post) pointed out that Linux benchmarks are quite different from the Windows ones, but the code in Rosetta is pretty similar between Linux and Windows, so the performance difference will be small." -- Mats, Message 26380 - Posted 8 Sep 2006 17:01:01 UTC
ciao
PS, the R@H progress profile is not much different from what we see at WCG: checkpoints and saves at material distances!
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The new R@H system is a vast improvement over what they had. But when you consider they had nothing, any improvement can seem like a big improvement. The new R@H system has solved only the Linux-Windows disparity. It provides no deterrent to cheating and it still rewards cheating.
The mistake they made is including a claim in the new running average and then awarding the new running average to the claimer. For example, there are claims of 25, 42 and 35. The running average is now 34. Now I claim 2000. The new running average is 525.5, so I receive 525.5. See, it doesn't stop the cheating. There is no deterrent and there is still a reward.
A better way is to give the new claimer the current running average. Then use his claim to calculate the new running average and give the new running average to the next claimer. That way if I cheat I will benefit the next claimer but not myself. If my goal is to win the points race then cheating in that type of system hurts me because it helps my opponents. That's a deterrent and it will stop the cheaters.
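The arithmetic in this post can be checked with a short sketch that compares the two orderings. Both functions model the schemes only as the post describes them, not as any project's actual implementation, and the function names are made up for illustration.

```python
def credit_including_own_claim(previous_claims, new_claim):
    """Scheme criticised in the post: the new claim is folded into the
    running average before the claimer is paid."""
    claims = previous_claims + [new_claim]
    return sum(claims) / len(claims)


def credit_excluding_own_claim(previous_claims, new_claim):
    """Scheme proposed in the post: pay the current running average,
    then let the new claim influence only later claimers."""
    granted = sum(previous_claims) / len(previous_claims)
    updated_claims = previous_claims + [new_claim]
    return granted, updated_claims


history = [25, 42, 35]          # running average is 34
inflated_claim = 2000

print(credit_including_own_claim(history, inflated_claim))

granted, history_after = credit_excluding_own_claim(history, inflated_claim)
print(granted, sum(history_after) / len(history_after))
```

Under the first ordering the inflated claim pays its own author 525.5. Under the second the author receives 34 and the 525.5 goes to whoever claims next.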
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
It does not work quite like that, as there is a 'decoy' (target, I suppose) reference for the varying WUs.
[Edit 1 times, last edit by Sekerob at Sep 9, 2006 12:11:26 PM]