World Community Grid Forums
Thread Status: Active | Total posts in this thread: 18
ericinboston
Senior Cruncher | Joined: Jan 12, 2010 | Post Count: 265 | Status: Offline
2 Questions:

I'm wondering if projects create larger/harder WUs over time to compensate for CPUs getting faster. For example, if a Cancer WU is 50 MB in 2013, does the Cancer project make the WU larger/harder (say, 100 MB) in 2018 to get better or more detailed results?

A side question: are all of a particular project's WUs the same size/complexity for every volunteer, or do volunteers with crusty old Pentium chips get easier WUs to crunch than someone with the latest and fastest i7? If the i7 volunteer takes 3 hours to crunch a WU, I would imagine the Pentium owner would take days or weeks and hence would not complete the WU in time.

Thanks in advance for the answers!
SekeRob
Master Cruncher | Joined: Jan 7, 2013 | Post Count: 2741 | Status: Offline
There's no custom sizing according to hardware power, only an effort to hit a duration target so that all platforms can participate. For example, because Android gets to work on the AutoDock Vina driven sciences, that platform puts a constraint on how big tasks can be sized (based on runtime, not on MB). Yes, dynamic sizing to hardware power would be on the tech wishlist, if only they knew up front the FPOPS needed for any given molecule or protein target. They don't, and time and again they find complete chaos.

And no, as far as observations go, there's generally no increase in runtime as a project progresses. CEP2 may have been the exception, but each time a new experiment started, it would reset again. For the 'making it up as they go' department: WCG performance is completely unhinged from Moore's law. Average computing power here barely increases 5-10% per annum, nothing close to a doubling every 2 years.

FYI for anyone interested: a new config tag will be added which allows clients to compute estimated runtimes purely from the benchmark. IMNSHO, if WCG keeps smoothing the running FPOPS averages and plugging them into new work, the estimated runtimes will still be off. The tag is only of value if there were a truly accurate FPOPS estimate on a per-task basis. So far, just a dream:

    scheduler: add <rte_no_stats> config flag to estimate job runtime without stats
    The scheduler estimates job runtime based on statistics of past jobs for this
    (host, app version). This doesn't work well if the distribution of runtimes is
    very wide, as may be the case for universal apps. If this flag is set, runtime
    estimation is based solely on CPU/GPU peak FLOPS and the job FLOPs estimate.

Not sure how this is all going to work, considering that WCG also runs with the <dont_use_dcf/> control, which prevents a client from adjusting estimated runtime based on its true performance.
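To make the two approaches concrete, here is a minimal sketch in Python (not actual BOINC code; the function names and the benchmark/FPOPS figures are illustrative assumptions). The benchmark-only route divides the task's FPOPS estimate by the host's peak FLOPS, while the stats route divides by the FLOPS rate observed on past results:

    # Sketch of the two runtime-estimation styles discussed above.
    # All names and numbers are illustrative, not BOINC internals.

    def runtime_from_benchmark(rsc_fpops_est, peak_flops):
        """Benchmark-only estimate, as the proposed <rte_no_stats> flag would do."""
        return rsc_fpops_est / peak_flops

    def runtime_from_stats(rsc_fpops_est, observed_flops_avg):
        """Stats-based estimate: divide by the smoothed rate seen on past results."""
        return rsc_fpops_est / observed_flops_avg

    rsc_fpops_est = 4.3e13       # FPOPS total stamped in the task header (assumed)
    peak_flops = 4.0e9           # host benchmark, roughly 4 GFLOPS (assumed)
    observed_flops_avg = 1.5e9   # realised rate from earlier tasks (assumed)

    print(runtime_from_benchmark(rsc_fpops_est, peak_flops) / 3600)      # ~3 hours
    print(runtime_from_stats(rsc_fpops_est, observed_flops_avg) / 3600)  # ~8 hours

Either estimate is only as good as the FPOPS figure in the header, which is the point made above.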
wolfman1360
Senior Cruncher | Canada | Joined: Jan 17, 2016 | Post Count: 176 | Status: Offline
I've been curious about this, actually.
How exactly does BOINC estimate initial progress when you first install? Is it based on benchmarks of processors similar to yours? Does it then slowly (maybe over the course of 10 WUs) get better at figuring out exact runtimes, though never really exact if runtimes vary widely, as said earlier? I can imagine FAH2 is going to be similar, since there will be WUs that take 15 hours and some that take less than 3.
Crunching for the betterment of human kind and the canines who will always be our best friends.
AWOU!
SekeRob
Master Cruncher | Joined: Jan 7, 2013 | Post Count: 2741 | Status: Offline
BOINC runs an initial 30-second benchmark, split into a float and an integer test, which are then combined. The benchmark is repeated at every client restart, but not sooner than 4 days after the previous run (IIRC also after each client upgrade/downgrade). The re-benchmarking can be disabled (read the config manual), which I've done, as I consider it a waste of time for how I operate. The client is supposed to learn, but given that the DCF (Duration Correction Factor) control has been hobbled, locked to a value of 1, that learning is limited. Beyond that, each task has a header with an estimated FPOPS total, which together with the benchmark is used to compute an estimated runtime (TTC, or Time To Complete). As noted, WCG takes the actual runtimes of many returned results and sticks the resulting average FPOPS into the headers of new work.

Since there's a time delay between the FPOPS estimate a task is issued with and the actual FPOPS in the reported results (plus the feeder build pipe, which can be several days deep, not to forget the on-client buffer size), the FPOPS stuck in new work headers lags reality. So we get work with an estimated runtime of 3 hours that then runs 13 hours, or vice versa. Server side there are some 'learning' rules relating to compounds and protein targets, but those have to be developed by the techs... nothing remotely like what one could consider AI. In short, the estimate will hardly ever be right, and is subject to change without notice, except for projects such as MCM which produce pretty stable runtimes.
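Here is a toy model of that lag, purely to illustrate the mechanism (the smoothing factor, pipeline depth and workload numbers are invented, and this is not WCG's actual server code):

    # Toy model of the estimate lag described above. The server keeps a smoothed
    # average of observed per-task FPOPS, but the header a client receives today
    # was built from results that are several 'days' old.

    from collections import deque

    def smooth(avg, observed, alpha=0.1):
        """Exponential moving average, standing in for the server-side smoothing."""
        return (1 - alpha) * avg + alpha * observed

    pipeline = deque([4.0e13] * 5)   # headers built ~5 days ago (assumed depth)
    server_avg = 4.0e13              # smoothed FPOPS per task
    benchmark = 4.0e9                # host benchmark, ~4 GFLOPS (assumed)

    for day, actual_fpops in enumerate([4e13, 6e13, 9e13, 15e13, 15e13, 15e13]):
        stamped = pipeline.popleft()              # header the client receives today
        estimated_h = stamped / benchmark / 3600
        actual_h = actual_fpops / benchmark / 3600
        print(f"day {day}: estimated {estimated_h:4.1f} h, actual {actual_h:4.1f} h")
        server_avg = smooth(server_avg, actual_fpops)
        pipeline.append(server_avg)               # lands in work issued days later

By day 3 the client is told to expect roughly 3 hours while the task actually runs over 10, which is the kind of mismatch described above.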
KLiK
Master Cruncher | Croatia | Joined: Nov 13, 2006 | Post Count: 3108 | Status: Offline
SekeRob wrote:
> There's no custom sizing according to hardware power, only an effort to hit a duration target so that all platforms can participate. [...] WCG performance is completely unhinged from Moore's law. Average computing power here barely increases 5-10% per annum, nothing close to a doubling every 2 years.

As I recall, some AutoDock and Vina WUs were expanded, but that was several years after the science had started. It was simply overwhelming for the WCG servers to get all those results back so quickly!

And quoting Moore's law in the context of grid computing, which is bound by the law of averages, only shows a misunderstanding of Moore's law.
SekeRob
Master Cruncher | Joined: Jan 7, 2013 | Post Count: 2741 | Status: Offline
QED
ROFL
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
If you had one device crunching one project only, would the amount of work accomplished tend to level out over time or be highly variable?
wolfman1360
Senior Cruncher | Canada | Joined: Jan 17, 2016 | Post Count: 176 | Status: Offline
SekeRob wrote:
> BOINC runs an initial 30-second benchmark, split into a float and an integer test, which are then combined. [...] In short, the estimate will hardly ever be right, and is subject to change without notice, except for projects such as MCM which produce pretty stable runtimes.

Thank you for that excellent explanation. I notice that SCC has wildly varying runtimes for different WUs. I guess different batches/science account for the shorter vs. longer ones?

Also, what exactly do quorums do here? I take it that if I am quorum number 1, I'm the first machine with this task, and my result won't be verified or validated until number 2 gets it and processes it? Similarly if I'm number 2 (or however many; I'm guessing there are only two?). And finally, is this other quorum member also called the wingman? I hear this terminology and am just trying to understand a little more. Thanks!
Crunching for the betterment of human kind and the canines who will always be our best friends.
AWOU!
SekeRob
Master Cruncher | Joined: Jan 7, 2013 | Post Count: 2741 | Status: Offline
Former Member wrote:
> If you had one device crunching one project only, would the amount of work accomplished tend to level out over time or be highly variable?

It would not really matter whether you crunch 1 science or a mix at WCG; over time the daily AVERAGE will flatten. Suppose your 8-threaded machine runs 24/24 and is 95% efficient (5% lost to other processes, you browsing the web, posting questions to the forums, etc.). Ignoring validation delays by wingmen when quorum 2 is required, you would see an approximate 8 * 24 * 0.95 = 182.4 hours daily AVERAGE after probably a week. The size of the individual WUs is really irrelevant to the runtime that ends up showing in your stats.
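That back-of-the-envelope sum in a few lines of Python, using the example figures from this post (the thread count and efficiency are just the assumptions above):

    # Expected daily runtime credited to stats, per the example above.

    def expected_daily_runtime_hours(threads, hours_per_day=24, efficiency=0.95):
        """Threads running nearly around the clock, minus overhead."""
        return threads * hours_per_day * efficiency

    print(expected_daily_runtime_hours(8))   # 182.4 hours per day, on average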
SekeRob
Master Cruncher | Joined: Jan 7, 2013 | Post Count: 2741 | Status: Offline
wolfman1360 wrote:
> What exactly do quorums do here? I take it that if I am quorum number 1, I'm the first machine with this task, and my result won't be verified or validated until number 2 gets it and processes it? [...]

Quorum is the number of copies of a task that need to be cross-validated, the copies carrying the suffixes _0, _1, etc. 'Quorum 1' is really a silly term; better to call it zero redundancy. Since the two copies of a quorum-2 task go to random clients, the time needed for both to come back varies. Per very old stats, it takes about 2 days to get 95% validated, another 2 days to hit the 99%+ mark, and the remainder can take 7 to 14 days before a match is found. Sometimes a third copy (suffix _2) is needed to determine whether copy _0 or _1 is valid, if the two did not agree.

We are mostly dealing here with non-deterministic computing. The target complexity, the molecule size/shape, and the energy needed to perform a dock very much influence how long a calculation takes. Typically in bio, the lowest-energy dock is of highest interest, at which point the calculation moves on to the next step, or ends the task or step if a 'wanted minimum lowest energy' is not achieved. Based on the data returned during the beta and initial project phase, the techs determine how much work can be packed into a task without blowing the patience fuses in your head.
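A minimal sketch of how a quorum-2 flow like the one described could look, just to illustrate the _0/_1/_2 sequence (this is not WCG's actual validator; the result structure and the agreement test are assumptions):

    # Illustrative quorum-2 validation flow: two copies must agree; a third
    # copy (_2) is only issued as a tie-breaker when the first two disagree.

    from typing import Optional

    def results_agree(a: dict, b: dict, tolerance: float = 1e-6) -> bool:
        """Stand-in comparison: here, docking energies within a tolerance."""
        return abs(a["energy"] - b["energy"]) <= tolerance

    def validate_workunit(results: list) -> Optional[str]:
        """'valid' once two returned copies agree, 'need_copy_2' if the first
        two disagree, None while still waiting on the wingman."""
        if len(results) < 2:
            return None                      # copy _0 waits for its wingman (_1)
        if results_agree(results[0], results[1]):
            return "valid"
        if len(results) == 2:
            return "need_copy_2"             # send out the tie-breaker copy (_2)
        # with three copies back, accept whichever pair agrees
        for i in range(len(results)):
            for j in range(i + 1, len(results)):
                if results_agree(results[i], results[j]):
                    return "valid"
        return "error"

    print(validate_workunit([{"energy": -7.2}]))                       # None
    print(validate_workunit([{"energy": -7.2}, {"energy": -7.2}]))     # valid
    print(validate_workunit([{"energy": -7.2}, {"energy": -6.8}]))     # need_copy_2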