World Community Grid Forums
Thread Status: Active Total posts in this thread: 7
keithhenry
Ace Cruncher Senile old farts of the world ....uh.....uh..... nevermind Joined: Nov 18, 2004 Post Count: 18667 Status: Offline Project Badges:
|
I understand that this manages what the BOINC Wiki refers to as the work buffer. For example, if you set this to 1.5, the BOINC Manager will keep enough workunits on hand to take at least 18 hours to process. Also, once you complete a workunit, your result has to be validated. As I understand it, I won't get points for a workunit until at least 3 or 4 other users (a quorum) return the same result.

Prior posts on this have said not to set this preference higher than 2. However, the Wiki says this under Work Buffer: "If you are only running only one BOINC Powered Project on a computer we suggest a maximum value of 1/2 of the Deadline of that project." Since we have a deadline of 21 days of clock time, the implication is that this preference could be set as high as 10.5.

I would think that a given workunit is sent out to more than 3-4 users simply because not all users run 24/7, so it could be several days or a week before a user returns the WU; the user may never return the WU, for any number of reasons; or the user may return an invalid result. On the other hand, it would seem that the more users you send a workunit out to, the longer the overall project is going to take, so there's a balancing act involved. It may be that a given workunit gets sent to some initial number of users and is sent to additional users only if results don't come in within some timeframe?

I've seen references to "hogs", folks that pull down days' worth of workunits. It may be that a value beyond 2 for this preference doesn't actually "break" something per se but does adversely impact the running of the project. I can see legitimate situations where a user may not have network access for several days but wants to ensure that they have something to crunch on during that time, which they'll return once they have network access again.

I've looked at how points are calculated/awarded. Can someone clarify this? Is there a good way for me to balance not having network access for a number of days with trying to avoid idle time not crunching?
feet1st
Cruncher Joined: Feb 10, 2006 Post Count: 2 Status: Offline Project Badges:
|
Boy you've really summed it all up right there! Let me see if I can clarify some points...
Correction: if you set it to 1.5 days, BOINC will try to keep 36 hours of work on hand, not 18 hours.

Credit hogs: Some projects run low on work units, so people who want to participate cannot when the project runs out of work to send. The main objection is to people who download a huge pile of work and then are unable to process it. It doesn't sound like you fit the profile.

Setting frequency to connect: Yes, this setting is intended to give you control over the size of your backlog of work. If you had full-time network access and were running several projects, it wouldn't matter if one project went down for a few hours; you'd still be happily crunching away on work from the other projects, and so you could keep your machine fully utilized. If you only do work for one project, and that project has an outage or runs out of work, you may be sitting idle for a time unless you've got work in your backlog. That's why the recommendation changes a bit when you only do work for a single project. If you have 2 days of work on hand, then the project could be down for a day and you would still be fully utilized.

Timeframe of work units: Yes, each work unit comes with a due date (the "report deadline" shown in the Work tab of BOINC Manager). If that piece of work went to 3 users, and 2 reported back finished results and the third has not reported by the due date, then it will be sent to another user to bring in that third completed result. But that resets the clock on when you can expect your credit for that work unit. And if the new user keeps a huge backlog, it might take their machine several days to begin working on it, thus holding up credits for the 2 that already completed it.

Limited network access (dial-up): Yes, if you are a dial-up user, you'd probably want to maintain a larger backlog (i.e. define a larger number of days on the connect to network every... setting). That would give you some slack on how often you need to connect to keep your machine busy and report your results in time.

Changing the "connect to network about every N days" setting: The MAIN LESSON is to make changes to this value very gradually. When I started, I was a dial-up user and set that baby straight to 10 days. I got a pile of work for SETI, and then added Climate too, thus splitting my computer time in a way that BOINC had no way to see coming when it requested all that SETI work. So, if you change the project "resource shares" or the number of days between connections, keep in mind that BOINC is making estimates and predictions based on the information available at the moment. If WCG is your only project and you say to get 10 days of work... and it does, and then you add another project and cut the resource share in half... now you've got 20 days of work! Whoops! Also keep in mind that these settings only take effect once your client updates with the project.

As you say, setting the preference too high doesn't "break" anything. The systems are all set up to be flexible. It just increases the likelihood of you missing some deadlines, in which case the crunch time on that work unit was essentially wasted. It also may delay the issuance of credits to others who are working on the same work unit (in BOINC's terms, a work unit is what gets replicated into the redundant "results"), and this is why the people who covet the credits get upset. Someone with a .1 number of days is basically immediately working on the one work unit they download, and will report results by the end of the day. But they need to understand that not everyone wants to, or is able to, run their machine that way.

Yes, the project has a balancing act between getting more work units in progress and getting all of the validation they need. But this balancing act is all based around the deadlines. They basically assume everyone will return results before the deadline, and if/when someone doesn't, they reassign the task to another participant.

Ideally (my opinion), you would set things up in this order: 1) keep your number of days to connect at the .1 default, 2) attach to all of the projects you intend to, 3) get your resource shares balanced the way you'd like, 4) and THEN start tinkering with the number of days between network connections. By doing it in this order, you avoid pulling down more work than you will be capable of completing before the deadline.

Hope that helps your understanding of how the system works, and your role in it.
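To put rough numbers on the buffer and resource-share points in this post, here is a small Python sketch. It is a toy model only, not how the BOINC client actually schedules work: the function names are made up for illustration, it assumes the machine crunches 24/7, and the 21-day deadline is the WCG figure quoted earlier in the thread.

```python
# Toy model only: this is NOT the BOINC client's real scheduling logic.
HOURS_PER_DAY = 24


def buffer_hours(connect_every_days: float) -> float:
    """Hours of work the client tries to keep on hand for a given setting."""
    return connect_every_days * HOURS_PER_DAY


def days_to_clear(backlog_days: float, resource_share: float) -> float:
    """Wall-clock days to finish one project's backlog when that project
    only gets `resource_share` (a fraction in (0, 1]) of the machine's time."""
    if not 0 < resource_share <= 1:
        raise ValueError("resource_share must be in (0, 1]")
    return backlog_days / resource_share


def fits_deadline(backlog_days: float, resource_share: float,
                  deadline_days: float) -> bool:
    """True if the whole backlog can be finished before the deadline."""
    return days_to_clear(backlog_days, resource_share) <= deadline_days


print(buffer_hours(1.5))             # 1.5 days of buffer is 36.0 hours
print(days_to_clear(10, 1.0))        # one project, full share: 10.0 days
print(days_to_clear(10, 0.5))        # share cut in half: 20.0 days
print(fits_deadline(10, 0.25, 21))   # a quarter share misses a 21-day deadline
```

The point of the sketch is only that the backlog is fixed once it is downloaded, so any later cut in resource share stretches the time needed to clear it by the same factor.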
keithhenry
Ace Cruncher Senile old farts of the world ....uh.....uh..... nevermind Joined: Nov 18, 2004 Post Count: 18667 Status: Offline Project Badges:
|
WOW! Thanks for the information. It makes sense that if you are working on multiple projects and you want to have x days of work on hand to crunch at any one time, you will want to set this preference for each project at the same proportional share as your resource share. For three days of work, a project that gets half of your resource share should get 1.5 days for the work buffer setting.

However, prior posts in the forums here have said not to set this beyond 2 days. Given this discussion and what is in the BOINC Wiki, can one of the WCG techs or admins expand on the reasons why they say not to go beyond 2 days?
Viktors
Former World Community Grid Tech Joined: Sep 20, 2004 Post Count: 653 Status: Offline Project Badges:
|
I would think the best strategy is to just get the minimum amount of work you would need to span any period of time during which you are not connected to the Internet. If you are continuously connected, then just one or two work units are plenty.
This is because the grid cannot tell if you are still working on a given work unit or have uninstalled the agent. As a work unit ages, if results are not coming back, more agents will eventually be assigned the same work unit so we can get a consensus on the answer. So if a work unit happens to be late in several agents' queues, after some number of days it can look like nobody is working on that work unit anymore and more assignments may be sent out, thus creating inefficiency. This applies equally to the use of UDmonitor.
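The reissue behaviour described in this thread (send a work unit to a few machines, then send more copies once outstanding ones look abandoned) can be sketched roughly as below. This is a simplified illustration, not WCG's actual server logic: the `Copy` class and `copies_to_reissue` function are hypothetical, and the quorum of 3 and the 21-day deadline are just the figures mentioned by posters above.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    """One copy of a work unit sent to one machine (hypothetical model)."""
    sent_day: float
    returned: bool = False


def copies_to_reissue(copies, today, deadline_days, quorum):
    """Extra copies to send so that a quorum is still reachable.

    A copy counts toward the quorum if it has been returned, or if it is
    still outstanding but not yet past its deadline.
    """
    returned = sum(1 for c in copies if c.returned)
    still_live = sum(
        1 for c in copies
        if not c.returned and today - c.sent_day <= deadline_days
    )
    return max(0, quorum - returned - still_live)


# Three copies sent on day 0; two came back, one is sitting in a long queue.
copies = [Copy(0, returned=True), Copy(0, returned=True), Copy(0)]

print(copies_to_reissue(copies, today=10, deadline_days=21, quorum=3))  # 0
print(copies_to_reissue(copies, today=22, deadline_days=21, quorum=3))  # 1
```

Under these assumptions the server cannot distinguish a machine with a deep backlog from one that has gone away, so a copy that is merely late still triggers an extra assignment once the deadline passes.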
keithhenry
Ace Cruncher Senile old farts of the world ....uh.....uh..... nevermind Joined: Nov 18, 2004 Post Count: 18667 Status: Offline Project Badges:
|
Thanks Viktors! I am glad to know that going beyond 2 doesn't break something in direct terms. Clearly, we don't want to use an excessive value, but if folks can adjust this value to address legitimate circumstances like travel, DSL outages, the rare WCG server outage and the like, that is definitely goodness.
keithhenry
Ace Cruncher Senile old farts of the world ....uh.....uh..... nevermind Joined: Nov 18, 2004 Post Count: 18667 Status: Offline Project Badges:
|
This is getting rather interesting. I am developing a much better appreciation of why you want to change this value slowly, even if you only run one project on one device, and most particularly if you increase the value. It looks like the exact results one could see will vary depending on your circumstances. If you increase the value quickly, like from 0.1 to 3, you will basically get three days' worth of work at about the same time. In other words, I suspect that the WUs you will get in that case are nowhere near as spread out amongst the "pool" of WUs as if you got them one at a time. That could mean that you end up delaying the quorum and getting WUs sent out for more processing unnecessarily if you tend to complete and return the WUs slowly. If you are able to return them quickly, you could well end up waiting for others to process them and complete the quorum. That means you could see your points growing a good bit more erratically.

Offhand, I can't think of a problem with dropping this value quickly, but that certainly doesn't mean that there aren't any. It would seem that you want to avoid frequent changes and large changes to this value. What is frequent or large will depend on your specific circumstances too. Also, a subtle point to pay attention to: determining how many WUs will be sent to you appears to be based on the estimated completion time (the "To Completion" column on BOINC Manager's Work tab), but this value typically seems to be an upper bound or high-end estimate, so X amount of work won't necessarily take X amount of time.
feet1st
Cruncher Joined: Feb 10, 2006 Post Count: 2 Status: Offline Project Badges:
|
Yep. Reducing the value really has no immediate effect, other than kinda forcing you to work through your backlog before you get any more work.

Exactly on the credits and quorums. And as you say, now all of your work has the same due date, rather than due dates scattered across the range. But you will find BOINC tends to order work in blocks as well, so you just want to be sure you'll actually be able to get all that work DONE by the due date.

As for the estimates: on some machines they're high, on others they're low. But ya, that's the best guess BOINC has for requesting an appropriate amount of work for you. Also, it will try to request 2x the number of days of work you've instructed. I can't recall why they did it that way. That's part of why you see the other recommendation to set N days to no more than 1/2 of the deadline's number of days.

Yes, now that you've got 3 days' worth of work, you're going to average at least 1.5 days before you report results... and longer depending on the ACTUAL time to do the work (as opposed to the estimate), and on when your next connection to the internet is available.

As of today, many of the projects I participate in are down and have been down much of the day (SETI, Protein, Rosetta, and SIMAP have all seen problems and outages today). However, I'm still crunching away, because I have mine set to 3 days as well, and I had enough work in my backlog that I'm still productive during the outages. Also, when I notice outages and I know I don't need work, I try to be a good citizen and suspend network activity (that's under the Commands pulldown menu in BOINC Manager). This keeps my PC from bothering the overcommitted servers as they try to get back on their feet, and leaves the bandwidth open for others. And I've ALWAYS got a climate work unit to crunch on if all else fails.

So, on the one hand, some might scowl at me for "hoarding" work units and delaying their instant credit gratification. But, on the other hand, when they are starved for work (due to server outages) and UNABLE to earn more credits, I am able to yield to them (even when I don't notice the outage and suspend network access, statistically, odds are I've already got work and don't need the server). I'm trying to help the science. I don't care about credits.

BOINC also tries to learn how often your PC is on, how much of that time it has a network connection available, and how much of that time BOINC is typically able to run (if you had told it NOT to run while you're using your computer, for example, BOINC might not get much done during your business day, even though your machine is on).
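The "average at least 1.5 days" figure for a 3-day backlog can be checked with a quick Python sketch. The assumptions are mine, not BOINC's: one task runs at a time, the machine is on 24/7, every task takes the same time, and each result is reported the moment it finishes. The 6-hour task length is purely illustrative.

```python
def completion_days(task_hours, n_tasks):
    """Finish time in days of each task when run back to back, one at a time."""
    return [task_hours * (i + 1) / 24 for i in range(n_tasks)]


def mean_turnaround_days(task_hours, n_tasks):
    """Average time between downloading the batch and finishing a task."""
    times = completion_days(task_hours, n_tasks)
    return sum(times) / len(times)


# 12 tasks of 6 hours each is 3 days of work downloaded in one batch.
print(completion_days(6, 12)[-1])     # 3.0 days until the last task is done
print(mean_turnaround_days(6, 12))    # 1.625 days on average
```

With equal-length tasks the average works out to (n + 1) / (2n) of the backlog, which is always a little more than half. Slower-than-estimated tasks or a delayed network connection only push it higher.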