| World Community Grid Forums
Thread Status: Active Total posts in this thread: 12
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello,
please have a look at the following record of a work unit processed in The Clean Energy Project:

Workunit Status
Project Name: The Clean Energy Project
Created: 11.05.09
Name: E000580_084B_001t2n00x
Minimum Quorum: 2
Replication: 2

Result Name | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit
E000580_084B_001t2n00x_2-- | 631 | Valid | 23.05.09 00:09:28 | 24.05.09 01:53:30 | 19.57 | 390.1 / 306.4
E000580_084B_001t2n00x_0-- | 631 | Valid | 13.05.09 00:19:25 | 19.05.09 05:55:29 | 22.17 | 310.0 / 306.4
E000580_084B_001t2n00x_1-- | 631 | Valid | 13.05.09 00:10:44 | 23.05.09 02:18:23 | 39.95 | 302.7 / 306.4

The point is this: the minimum quorum is 2, yet the work unit was processed three times, which in my opinion was unnecessary. The reason for processing it three times instead of two is that my computer returned its result one hour too late. I understand that time limits are necessary to guarantee proper project progress. Nevertheless, the third processing of the same work unit could be avoided if BOINC were modified.

The current behaviour is as follows: when the WCG server finds that a required result has not been returned in time, it sends out the same work unit again. The replication is then three instead of two. The third computer processes the whole work unit, even if the second computer, which is late, returns its result before the third computer has finished.

In my opinion, one of the next versions of BOINC should change this to the following target behaviour: if the minimum replication is two and the second computer does not return a valid result in time, the work unit is sent out a third time, just as now. However, if the second client finishes processing and returns a valid result to the WCG server before the third one does, the WCG server signals the third client to stop processing the work unit. In this way we have a valid result, and the third computer, whose result would be redundant, is free to move on to another work unit sooner.

I would appreciate an answer and would be very happy if my proposal could be considered and implemented in one of the next versions of BOINC.

Greetings,
Martin |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
There's an exception: by reporting late, you have caused the third computer to waste time, so you can help by reducing your cache/additional buffer size.
The servers immediately set a signal ready for any client holding a redundant result that has not yet been returned, but if the client started such a task BEFORE contacting the server and receiving the signal, the task is left to finish, so all the time is credited in full.
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Added:
Client 6.6.x has a feature to contact the server to check whether tasks that are already late and not yet started are redundant, and to abort them upon confirmation. Who knows, with a future server and client software version the server may be informed that a task has been started, so that it can hold off sending out extra copies. NB: I'm not sure of the exact functioning, this is just a broad outline; the idea, though, is always to optimize overall grid efficiency.
[Edit 2 times, last edit by Sekerob at May 24, 2009 10:36:05 AM] |
||
|
|
Ingleside
Veteran Cruncher Norway Joined: Nov 19, 2005 Post Count: 974 Status: Offline Project Badges:
|
Quote (Sekerob): "Added: Client 6.6.x has a feature to contact the server to check whether tasks that are already late and not yet started are redundant, and to abort them upon confirmation. Who knows, with a future server and client software version the server may be informed that a task has been started, so that it can hold off sending out extra copies. NB: I'm not sure of the exact functioning, this is just a broad outline; the idea, though, is always to optimize overall grid efficiency."

Hmm, I remember this functionality being discussed, but AFAIK it wasn't added, at least not at this time. What is included is:
1; All unstarted work is aborted when passing the deadline.
2; A new function was added to the API so that, instead of continuing past the deadline, the application can choose to end early and send whatever it has already done back to the server.
#1 works automatically for all projects, while #2 must be added to each application if the project wants to use it.

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might." |
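The two client-side behaviours listed in this post can be sketched roughly as below. This is only an illustration of the described logic, not BOINC code: the `Task` class, the `can_end_early` flag, and `on_deadline_check` are hypothetical names, and the real client and API may work differently.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Task:
    deadline: datetime
    started: bool
    # Hypothetical flag: the project's application opted in to behaviour 2.
    can_end_early: bool = False


def on_deadline_check(task: Task, now: datetime) -> str:
    """Return what the client would do with a task at time `now`."""
    if now <= task.deadline:
        return "continue"  # deadline not passed yet, nothing to do
    if not task.started:
        return "abort"  # behaviour 1: unstarted work is aborted past the deadline
    if task.can_end_early:
        # behaviour 2: the application ends early and returns what it has done
        return "end early and upload partial work"
    return "continue"  # started task without opt-in keeps running


deadline = datetime(2009, 5, 23)
late = datetime(2009, 5, 24)
print(on_deadline_check(Task(deadline, started=False), late))
print(on_deadline_check(Task(deadline, started=True, can_end_early=True), late))
```

Behaviour 1 needs nothing from the project, which matches the remark that it works automatically, while behaviour 2 depends on a per-application opt-in.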
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello,
I have reduced my cache. I had increased it only for an exceptional situation (holidays, the internet connection breaking down in my absence, and the resulting inability to get new work units). Nevertheless, it would be good if redundant work units were cancelled the moment they become redundant. To avoid discomfort for the user, the work unit should be displayed as completed, and the points earned up to the moment it became redundant should be granted, in my opinion. By the way, I am not a total newbie here: kafejka and I are one and the same person. All the best to you, and thanks to all community advisors; it is a pleasure to cooperate with you. Sometimes, however, it is not too easy to understand your answers. Yours, Martin Schnellinger |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
No worries. As for the cancelling, point 2 of Ingleside's addition could create that possibility for WCG... but whether it pleases everyone is a different matter, vis-à-vis those who are motivated to squeeze the last second out of 'short supply' work. I think, though, that WCG in the first instance strives for the highest efficiency in the most 'set and forget' environment possible, thus credit for time is due.
Happy crunching.
||
|
|
Steve WCG
Senior Cruncher Joined: May 4, 2009 Post Count: 216 Status: Offline |
First let me state that while I personally would be all for Martin's idea of stopping a crunch as soon as it is no longer relevant, I don't think it is a good idea for grid computing in general.

One of the issues is that it turns the idea of crunching into a race with everyone else, which is in direct contradiction to the concept of grid computing: everyone working together towards a common goal.

Another point is that people with fast machines and small caches would always win (see how quickly the idea of someone returning a WU first equates to winning?). I doubt that the people who, while absolutely dedicated, do not have a monster PC would feel good about *losing*. Do they start dropping off the grid because they don't feel they are contributing?

If people start to reduce the size of their cache to play the *who is first* game, do we think that the number of idle cycles due to outages (client or server) will be greater than the cycles wasted on redundant tasks?
[Edit 2 times, last edit by Steve WCG at May 24, 2009 4:39:47 PM] |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
It's a good point: mid-trip cancellation is OK for those driven by total efficiency, but it does not suit everyone. For everything there's a solution, though... an option in the settings that says yes/no to permitting auto-abort for applications that have the built-in functionality.
||
|
|
Ingleside
Veteran Cruncher Norway Joined: Nov 19, 2005 Post Count: 974 Status: Offline Project Badges:
|
Quote (Martin): "Hello, I have reduced my cache. I had increased it only for an exceptional situation (holidays, the internet connection breaking down in my absence, and the resulting inability to get new work units). Nevertheless, it would be good if redundant work units were cancelled the moment they become redundant. To avoid discomfort for the user, the work unit should be displayed as completed, and the points earned up to the moment it became redundant should be granted, in my opinion."

The biggest problem with this is that, for the user getting the replacement WU, there's normally no reason for the client to connect before the task is (nearly) finished, so it wouldn't get the abort message anyway. Please remember that all connections are initiated by the client; if server-initiated connections had been possible, very many users would have blocked them or just stopped running WCG altogether.

For the user reaching the deadline, either uploading whatever is already done or asking for an extension of the deadline wouldn't cause any extra replication, so that would be an improvement. |
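The point about client-initiated connections can be put in numbers with a small sketch. It assumes, purely for illustration, that a running task would be aborted as soon as the client next contacts the server; the function name and its parameters are hypothetical and not part of BOINC.

```python
def hours_saved_by_abort(next_contact_at: float, finish_at: float) -> float:
    """Crunching hours avoided if the task is aborted at the next contact.

    Both arguments are hours since the replacement task started.
    `next_contact_at` is the first time the client contacts the server
    after the task has become redundant; the server cannot reach the
    client any earlier, because only the client opens connections.
    """
    return max(0.0, finish_at - next_contact_at)


# A client that only connects when the task finishes saves nothing.
print(hours_saved_by_abort(next_contact_at=20.0, finish_at=20.0))
# A client that happens to connect after 8 of 20 hours would save 12.
print(hours_saved_by_abort(next_contact_at=8.0, finish_at=20.0))
```

Under these assumptions the saving depends entirely on how early the client happens to connect, which is the weakness described in the post.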
||
|
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline Project Badges:
|
The task abort code works as follows:
1) If a task is no longer needed (i.e. validation on the workunit has completed and the canonical result has been identified), then an 'abort if not started' message is added to the reply to the scheduler request the next time the client communicates with the server. This causes the task to be aborted if it has not yet been started.
2) If the task cannot be used (the workunit is canceled, or the task is so late that the workunit has been deleted, etc.), then a task abort message is added to the reply to the scheduler request the next time the client communicates with the server. This causes the task to be aborted no matter what.

There was the intention to add a preference to BOINC at some point that would allow a user to change how #1 above is handled, so that it behaves either as 'abort if not started' or as always 'abort'. I do not believe that this has been added yet.

The reason that there are two behaviors here is that 'credit' is a reward given to members for their contribution. I do not want to argue about how important this is in this thread. Suffice it to say: suppose you are Contributor A and you have a task that goes past its deadline. Contributor B has a computer that is always on and has a short queue, and is assigned the additional replica that is sent out for the workunit. Contributor A returns the result 6 hours after the deadline, the workunit validates, and the canonical result is archived. Contributor B's computer, which has been working on the result for 6 hours, contacts the server to get more work and finds out that its result is no longer required. If the result is aborted and returned as an error (aborts are considered errors), then Contributor B gets no credit for the work they did. This is unfair, and so the rules above were implemented.

Could something else be implemented? Yes. However, the situation above is not common. Most results that miss their deadline are never returned (or are returned long after the workunit has been deleted from the system, which means they are over 1 week late). It is much more useful to spend the time on doing what can be done to ensure that computers can return their results on time. We are working towards providing workunits of different sizes that will be sent to computers that are able to complete them within the allocated time frame. This will do much to avoid the problem in the first place. |
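The two abort rules described in this post can be summarised in a short sketch. It is a simplified model under stated assumptions, not the actual WCG or BOINC server code: `Workunit`, `Action`, `server_reply`, and `client_aborts` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    KEEP = "keep"
    ABORT_IF_NOT_STARTED = "abort if not started"
    ABORT = "abort"


@dataclass
class Workunit:
    canonical_result_found: bool = False  # validation has completed
    cancelled: bool = False
    deleted: bool = False  # e.g. the task is so late the workunit is gone


def server_reply(wu: Workunit) -> Action:
    """Message attached to the reply the next time the client makes contact."""
    if wu.cancelled or wu.deleted:
        return Action.ABORT  # rule 2: the task cannot be used at all
    if wu.canonical_result_found:
        return Action.ABORT_IF_NOT_STARTED  # rule 1: the task is no longer needed
    return Action.KEEP


def client_aborts(action: Action, task_started: bool) -> bool:
    """Whether the client drops the task after receiving `action`."""
    if action is Action.ABORT:
        return True  # aborted no matter what
    if action is Action.ABORT_IF_NOT_STARTED:
        return not task_started  # a started task is left to finish
    return False


# Contributor B has already crunched for 6 hours when the workunit validates.
reply = server_reply(Workunit(canonical_result_found=True))
print(reply.value, client_aborts(reply, task_started=True))
```

In the Contributor B scenario the started task is kept under rule 1, so the time already spent can still earn credit.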
||
|
|
|