| World Community Grid Forums
Thread Status: Active Total posts in this thread: 26
|
|
| Author |
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Some betas were sent out about 11 hours ago. One of my machines which runs 100% on WCG got a few of them. Another which is set to run 90% CPU load was rejected:
"17-May-2008 03:13:16 [World Community Grid] Message from server: (won't finish in time) Computer on 99.8% of time, BOINC on 96.3% of that"
Disallowing machines that will take an extra 6 minutes to process them seems an unnecessary restriction to me. |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
The Scheduler Server Daemon queues an extra copy for sending as soon as the moment passes at which a job is required back; in that case your machine would have been 6 minutes too late. At 250,000 work units per day, you can calculate how many go out per second to member machines.
Where should the allowed delay be set: 1 minute, 5 minutes, 15 minutes (if that were at all possible)? The Scheduler acts on every Result interaction, 250,000 per day, so if a result comes back as, e.g., 'Error' or 'Abort', or is deemed 'No Reply', it does not wait any further before sending a new one out, as everything is one integrated stream.
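As a rough worked example of the arithmetic mentioned above, here is a minimal sketch. It assumes the 250,000 results per day are spread evenly over 24 hours, which is an assumption made only for illustration (real traffic is probably uneven), and the function name is hypothetical.

```python
# Average scheduler throughput, assuming an even spread over the day.
RESULTS_PER_DAY = 250_000          # figure quoted in the post
SECONDS_PER_DAY = 24 * 60 * 60     # 86,400 seconds


def results_per_second(results_per_day: int = RESULTS_PER_DAY) -> float:
    """Return the average number of results handled per second."""
    return results_per_day / SECONDS_PER_DAY


rate = results_per_second()
print(f"{rate:.2f} results per second")  # 2.89 results per second

# Results handled during a 6-minute grace period at that average rate.
in_six_minutes = rate * 6 * 60
print(f"about {in_six_minutes:.0f} results in 6 minutes")
```

At roughly three results per second on average, even a short grace period covers a large number of scheduler interactions.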
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Your comments make no sense.
The most recent beta that one of my machines did had 19 WUs sent out starting at 05/16/2008 18:31:13 and ending at 05/16/2008 19:20:11. WUs took up to 3 hours to be returned. For the previous one, WUs were sent out from 05/16/2008 17:13:03 to 05/16/2008 18:23:52. Clearly, a reliable machine running at 90% CPU would return results more quickly than one which takes 3 hours and, apart from that, it would make absolutely no difference anyhow unless it happened to be doing one of the last few given out, as they were given out over a period of about an hour. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Kremmen, you misunderstood the message.
It doesn't matter, though. WCG reserve the right to send beta work out to whichever machines they want, if the profile allows it. Your complaint has been noted, and there is nothing more that needs to be said. Thank you. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Didactylos, if you believe I've misunderstood, then I suggest you re-read what I wrote. I've just reported what I observed. Over a period of some hours, multiple beta WUs were received by a machine set to run at 100% CPU and multiple rejections by a similar machine set to run at 90% CPU.
It seems to me that this is useful to mention, especially as there is nothing on the site which says "If you don't set your CPU to 100%, you won't get any beta work units". |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"I've just reported what I observed."
Not so; you tried to interpret it as well. Nothing wrong with doing that, of course, but you jumped to the wrong conclusion and blamed WCG.
If you want to interpret what you see correctly, then feel free to research it. In this case, DCF is the most important factor in determining whether a task will complete in time. If you don't want to research it, that's perfectly fine. The message is informational only, and you can happily ignore it. |
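To illustrate how DCF and the two percentages in the server message could feed into a completion estimate, here is a rough sketch. It assumes the estimate scales linearly with the duration correction factor and inversely with the "computer on" and "BOINC on" fractions; the function name, parameter names, and the one-hour base estimate are hypothetical, and the real BOINC scheduler logic may differ.

```python
# Hypothetical model: wall-clock estimate from a base runtime estimate,
# a duration correction factor (DCF), and two availability fractions.
def estimated_wall_seconds(base_estimate_s: float,
                           dcf: float,
                           on_frac: float,
                           active_frac: float) -> float:
    """Scale the base estimate up by DCF and by time the host is unavailable."""
    if not (0 < on_frac <= 1 and 0 < active_frac <= 1):
        raise ValueError("fractions must be in the range (0, 1]")
    return base_estimate_s * dcf / (on_frac * active_frac)


# Fractions taken from the quoted server message (99.8% and 96.3%);
# the 3600-second base estimate and the DCF values are assumptions.
for dcf in (1.0, 1.5):
    wall = estimated_wall_seconds(3600, dcf, on_frac=0.998, active_frac=0.963)
    print(f"DCF {dcf}: about {wall / 60:.1f} minutes")
```

Under these assumptions, a DCF above 1 stretches the estimate far more than the availability fractions do, which is why it can decide whether a short-deadline task is offered.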
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
To simplify what I said: the Scheduler determined that it needed the job back by, say, 19:00. The Server-to-Client interaction determined that your 90% machine would send it back at 19:06. That produced the reply "can't make it" because of the risk of missing the deadline, so the host is not considered for the available work at this moment in time.
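A minimal sketch of that comparison, using the 19:00 and 19:06 times from the example above. The send time, the 66-minute estimate, and the function name are hypothetical values chosen only so that the projected return lands at 19:06; this is not the actual scheduler code.

```python
from datetime import datetime, timedelta


def would_meet_deadline(now: datetime,
                        estimated_wall: timedelta,
                        deadline: datetime) -> bool:
    """Return True if the projected return time is on or before the deadline."""
    return now + estimated_wall <= deadline


# Hypothetical request at 18:00 with a 66-minute wall-clock estimate.
request_time = datetime(2008, 5, 17, 18, 0)
deadline = datetime(2008, 5, 17, 19, 0)
estimate = timedelta(minutes=66)

projected = request_time + estimate
print(projected.strftime("%H:%M"))                            # 19:06
print(would_meet_deadline(request_time, estimate, deadline))  # False
```

With a strict comparison like this, a host projected to finish even a few minutes late is skipped for that request, but it can be considered again on a later request.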
----------------------------------------
WCG
----------------------------------------Please help to make the Forums an enjoyable experience for All! [Edit 1 times, last edit by Sekerob at May 17, 2008 10:48:06 AM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
knreed has been talking about doing some programming for the Beta scheduler policy. He thinks that the early Beta tests for Rice did not catch as many errors as they should have and thinks that a special policy for Beta could spread the work units out to a larger variety of systems.
That sounds about right to me, but I also think that we are soon going to be busy reworking the innards of the server programs when we get rid of UD and try to fully implement BAM. So I do not expect immediate changes. But when the time is available, we should have a Beta scheduling policy that is very different from the standard project scheduling algorithm. We should have Beta WUs running on a vast number of different systems, rather than a lot running on a few systems. knreed has put this on the agenda, so we are taking notes on suggestions made. Obviously, special attention will be paid to suggestions that are easy to implement.
Lawrence |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"Not so, you tried to interpret it, as well. Nothing wrong with doing that, of course - but you jumped to the wrong conclusion, and blamed WCG."
I didn't blame anyone. I pointed out an observed situation.
However, if you want to get into blaming: WCG gets us to use BOINC. If BOINC is buggy (and the huge, seemingly random changes in DCF are what I'd call buggy behaviour) and WCG doesn't bother to tell us about the bugs, or how to fix them, whose fault is that? If BOINC's error message is irrelevant to the error, whose fault is that? If BOINC changes the values of important variables in a manner which is, to the observer, random, whose fault is that? Certainly not mine. We're here to run WUs, not to have to do hours of "research" every time BOINC acts stupidly.
"If you want to interpret what you see correctly, then feel free to research it. In this case, DCF is the most important factor in determining if a task will complete in time."
If all you're ever going to say is the equivalent of "go work it out for yourself from the total lack of documentation and given this tiny hint as to one variable that changes randomly for no obvious reason", then it would be better if you just didn't say anything at all. Or, you know, you could actually say something useful, like telling us exactly what the relationship of DCF is to the situation, what we should set DCF to in order to avoid this happening, or how to get BOINC not to set it incorrectly. (... if that is even possible!)
FWIW:
1) The DCF of the machine that was successfully obtaining beta WUs is currently higher than that of the one that didn't.
2) These beta WUs all ran for exactly an hour, just the same way as standard Rice WUs run for exactly 8 hours. Why would DCF make any difference at all?
[Edit 1 times, last edit by Former Member at May 17, 2008 3:01:37 PM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
This is not an error.
|
||
|
|
|