World Community Grid Forums
Thread Status: Active Total posts in this thread: 102
Aurum
Master Cruncher The Great Basin Joined: Dec 24, 2017 Post Count: 2387 Status: Offline
> I don't know if it was covered above but as I keep 10 days projects active then the ARP WU takes 10 days to rise to the top and then 2 days to crunch. If all that is happening is that the WU is discarded because it is too late then I feel that I am wasting my time with ARP which is a shame. At this moment my Results Status says that an ARP WU is No Reply even though it was due today and will complete before the end of the day.

RTS48, you're slowing down the project and causing your own problem by using a 10-day queue. I recommend a work queue of 0.5/0.01 to 1.0/0.01. There's no shortage of ARP WUs. They cannot proceed until they get them back and they're validated by 2 clients.

Here's an example: https://www.worldcommunitygrid.org/ms/device/...s.do?workunitId=324305413

| Result Name | OS type | OS version | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time / Elapsed Time (hours) | Claimed / Granted BOINC Credit |
|---|---|---|---|---|---|---|---|---|
| ARP1_0013477_029_2-- | Linux Linuxmint | Linux Mint 20 [5.4.0-48-generic\|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9.1)] | 727 | Valid | 10/16/20 23:47:59 | 10/18/20 01:09:39 | 16.58 | 534.7 / 602.8 |
| ARP1_0013477_029_0-- | Linux | 4.4.0-87-generic | 727 | Valid | 10/9/20 23:44:55 | 10/17/20 15:43:58 | 16.39 | 590.0 / 602.8 |
| ARP1_0013477_029_1-- | Linux Ubuntu | Ubuntu 20.04 LTS [5.4.0-47-generic\|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9)] | 727 | Valid | 10/9/20 23:42:20 | 10/13/20 02:56:08 | 22.52 | 615.6 / 602.8 |

On October 9th it goes out to 2 clients. The first returns it in 4 days and when the second has not returned it after 7 days it gets sent to me and I return it in 2 days. All three of us get the points. Which one of us wasted their time & electricity & could've been doing other useful work??? Me. Did the man in the middle need 8 days to run the WU or sit on it for days before beginning?

I think they should tighten this up to 3 days and after the confirmation send a server abort to halt useless work.

[Edit 1 times, last edit by Aurum420 at Oct 18, 2020 4:57:00 PM]
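The turnaround of each copy of the workunit quoted above can be checked directly from its two timestamps. What follows is only a minimal sketch in Python: it assumes the second timestamp of each row is the return time (all three results are Valid), and the names `RESULTS` and `turnaround` are made up for this illustration.

```python
from datetime import datetime, timedelta

# WCG result pages show timestamps like "10/9/20 23:44:55" (month/day/2-digit year).
FMT = "%m/%d/%y %H:%M:%S"

# (result name, sent time, return time), copied from the workunit quoted above.
RESULTS = [
    ("ARP1_0013477_029_1--", "10/9/20 23:42:20", "10/13/20 02:56:08"),
    ("ARP1_0013477_029_0--", "10/9/20 23:44:55", "10/17/20 15:43:58"),
    ("ARP1_0013477_029_2--", "10/16/20 23:47:59", "10/18/20 01:09:39"),
]


def turnaround(sent: str, returned: str) -> timedelta:
    """Time between a result being sent out and being returned."""
    return datetime.strptime(returned, FMT) - datetime.strptime(sent, FMT)


if __name__ == "__main__":
    for name, sent, returned in RESULTS:
        days = turnaround(sent, returned).total_seconds() / 86400
        print(f"{name}: {days:.1f} days")
```

Run on these three rows, the second copy (`_0--`) is the only one whose turnaround exceeds 7 days, and the third copy (`_2--`) was sent out almost exactly 7 days after the first two.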
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
> They cannot proceed until they get them back and they're validated by 2 clients.

Is this stated somewhere by the project team? It doesn't seem right to me. The WRF is a forecast model and almost by definition won't generate a totally accurate forecast. Any model biases or feedback problems would surely propagate to the next run. This is the old "garbage in, garbage out" problem. Why wouldn't they use observational data from The Weather Channel as input to each run? If bad data is propagated over all 182 or 183 iterations, it's almost certain not to resemble reality at the end.
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12439 Status: Offline
Rod
The cache should NEVER exceed the lower of (a) 10 days minus the time your longest units take and (b) the shortest deadline among the units you download. In the case of ARP, which takes you 2 days and has a deadline of 7 days, that means the lower of 10 minus 2 (8) and 7, so 7 days maximum. Any more than that and you risk your units being ruled out as too late.

You then have the issue of your machine being considered 'reliable' and so able to get re-sends. That needs you to regularly return within about 3 days.

I would suggest that you reduce your settings to not more than 2 plus 1. Anything more should only be for when units are in short supply, which hasn't occurred since OPN came online.

ARP should also be restricted to half your threads using app_config, with other projects taking up the rest. That is because checkpointing and uploading take a lot of computer time with ARP.

Mike
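The two recommendations in this post (the cache ceiling and the half-your-threads limit) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not project guidance: the function names are hypothetical, and `arp1` is only assumed to be the short application name that BOINC expects in `app_config.xml`, so check what your own client reports.

```python
import xml.etree.ElementTree as ET


def max_cache_days(longest_runtime_days: float,
                   shortest_deadline_days: float,
                   hard_limit_days: float = 10.0) -> float:
    """Rule of thumb from the post: the lower of (limit - longest runtime)
    and the shortest deadline among the downloaded units."""
    return min(hard_limit_days - longest_runtime_days, shortest_deadline_days)


def arp_app_config(total_threads: int, app_name: str = "arp1") -> str:
    """Build app_config.xml text limiting one app to half the threads.

    ASSUMPTION: "arp1" is used here as the BOINC short name of the ARP
    application; verify the name before using the generated file.
    """
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    # max_concurrent caps how many tasks of this app run at the same time.
    ET.SubElement(app, "max_concurrent").text = str(max(1, total_threads // 2))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(max_cache_days(2, 7))   # the ARP case described in the post
    print(arp_app_config(8))      # an 8-thread machine
```

For the ARP case in the post (2-day runtime, 7-day deadline) the first function gives 7, matching the "7 days maximum" stated there. The suggested "2 plus 1" setting is well inside that ceiling.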
||
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12439 Status: Offline
entity
It is an iterative process, so they don't send out the next unit for that patch of Africa until they have received the preceding one. However, they are working over a year behind the actual weather so they can compare the results. Whether they are adjusting or not has not been promulgated.

Mike
----------------------------------------
[Edit 1 times, last edit by Mike.Gibson at Oct 18, 2020 6:46:17 PM]
||
|
KerSamson
Master Cruncher Switzerland Joined: Jan 29, 2007 Post Count: 1679 Status: Offline
> I don't know if it was covered above but as I keep 10 days projects active then the ARP WU takes 10 days to rise to the top and then 2 days to crunch.

Please reduce the buffer size of your machines to 1 or max 2 days! Especially with ARP1, by returning WUs far too slowly you are disturbing many members as well as WCG itself and the scientists. You do not gain any advantage from a large buffer, and you waste your time and the time of other contributors.

For a couple of months now, I have been surprised to notice that some ARP1 WUs regularly require significantly more than 7 days to be validated. It has nothing to do with Murphy; it is only people with far too large a buffer/queue.

Cheers, Yves
||
|
RTS48
Veteran Cruncher Bolivia Joined: Aug 2, 2009 Post Count: 1350 Status: Offline
Thank you all for your replies to my post. I consider myself admonished and have reduced my WU store accordingly to days and 1 day extra maximum.

In my defence, I have never before seen an explanation of the effect of a large buffer on project execution. I feel that understanding this ought to be a prerequisite before one is allowed to sign up for certain projects. Until recently I had a rather unreliable internet connection, which had the habit of shutting off for several days on end, and that is why I kept a large buffer. I have now put that right.
----------------------------------------
Rod Peel
Santa Cruz, Bolivia, South America
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
> It is an iterative process so they don't send out the next unit for that patch of Africa until they have received the preceding one.

Once again, has that been stated somewhere by the researchers, or is that just some member's assumption?
||
|
KerSamson
Master Cruncher Switzerland Joined: Jan 29, 2007 Post Count: 1679 Status: Offline
@entity,
The information about how the project runs and what it requires is scattered over multiple messages. Indeed, you point out a really important communication gap. ARP1 and FA@H2 have some particularities in terms of deadlines. ARP1 and MIP1 have some "non-affinities". It is mostly Keith and Kevin who provide some background information ... from time to time. Surely, it would be good to make available a kind of data sheet for each project, summarizing the requirements for supporting it (machine, Internet connection), the main project characteristics, (non-)affinity with other sciences, etc.

Cheers, Yves
||
|
KerSamson
Master Cruncher Switzerland Joined: Jan 29, 2007 Post Count: 1679 Status: Offline
> Thank you all for your replies to my post. I consider myself admonished and have reduced my WU store accordingly to days and 1 day extra maximum.

Thank you very much RTS48

Cheers, Yves
----------------------------------------
[Edit 1 times, last edit by KerSamson at Oct 20, 2020 7:33:14 AM]
||
|
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2174 Status: Offline
> Here's an example: https://www.worldcommunitygrid.org/ms/device/...s.do?workunitId=324305413
>
> | Result Name | OS type | OS version | App Version Number | Status | Sent Time | Time Due / Return Time | CPU Time / Elapsed Time (hours) | Claimed / Granted BOINC Credit |
> |---|---|---|---|---|---|---|---|---|
> | ARP1_0013477_029_2-- | Linux Linuxmint | Linux Mint 20 [5.4.0-48-generic\|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9.1)] | 727 | Valid | 10/16/20 23:47:59 | 10/18/20 01:09:39 | 16.58 | 534.7 / 602.8 |
> | ARP1_0013477_029_0-- | Linux | 4.4.0-87-generic | 727 | Valid | 10/9/20 23:44:55 | 10/17/20 15:43:58 | 16.39 | 590.0 / 602.8 |
> | ARP1_0013477_029_1-- | Linux Ubuntu | Ubuntu 20.04 LTS [5.4.0-47-generic\|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9)] | 727 | Valid | 10/9/20 23:42:20 | 10/13/20 02:56:08 | 22.52 | 615.6 / 602.8 |
>
> On October 9th it goes out to 2 clients. The first returns it in 4 days and when the second has not returned it after 7 days it gets sent to me and I return it in 2 days. All three of us get the points. Which one of us wasted their time & electricity & could've been doing other useful work??? Me. Did the man in the middle need 8 days to run the WU or sit on it for days before beginning?
>
> I think they should tighten this up to 3 days and after the confirmation send a server abort to halt useless work.

Here's what I do to format the visual webpage result of a workunit, because we know it will vanish after some time. (Example taken from https://www.worldcommunitygrid.org/ms/device/...s.do?workunitId=324305413)

Firstly, I copy the result lines on the webpage to the clipboard by selecting these lines on that webpage.

Next, I paste the contents of the clipboard to a file: on the command line, type cat > FILE; then paste the contents of the clipboard (e.g. by typing CTRL+SHIFT+V or by pressing the middle mouse button); finally, type ENTER and CTRL+D to close the file.

Then I run on the command line: cat FILE | wcgformat -of

The output that appears needs to be copied into the body of the post that you are writing: select the output, so it will be copied to the clipboard, then paste the contents of the clipboard into the draft of the post. In this case that will result in:

Result Name OS AVN Status Sent Time Due / Return Time CPUh Claimed/Granted
[Copied from Workunit Status, generated by wcgformat (using these options: -f -o)]

If it so happens that the lines are still too long (after clicking 'Preview'), add another 'f' to the options of wcgformat: cat FILE | wcgformat -off and try again. In this case the output in the post will result in:

Result Name OS AVN Status Sent Time Due / Return Time CPUh Claimed/Granted
[Copied from Workunit Status, generated by wcgformat (using these options: -ff -o)]

As you can see, that wasn't needed in this case.
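The wcgformat script itself is not shown in this thread. For readers without it, here is a rough stand-in sketched in Python that shortens one pasted result row to the same kind of columns (name, OS, AVN, status, times, CPU hours, credit). It is not wcgformat and makes no claim to match its output; the row layout is assumed from the rows quoted in this thread, and `ROW` and `compact_row` are hypothetical names.

```python
import re

# One pasted result row, in the flattened form quoted in this thread.
ROW = re.compile(
    r"^(?P<name>\S.*?--)\s+"                        # result name, ends in "--"
    r"(?P<os>.+?)\s+"                               # OS type + version (free text)
    r"(?P<avn>\d+)\s+"                              # app version number
    r"(?P<status>[A-Za-z ]+?)\s+"                   # e.g. "Valid"
    r"(?P<sent>\d+/\d+/\d+ \d+:\d+:\d+)\s+"         # sent time
    r"(?P<returned>\d+/\d+/\d+ \d+:\d+:\d+)\s+"     # time due / return time
    r"(?P<cpu_h>\d+\.\d+)\s+"                       # CPU / elapsed hours
    r"(?P<claimed>\d+\.\d+)\s*/\s*(?P<granted>\d+\.\d+)$"
)


def compact_row(line: str) -> str:
    """Shorten one result row: join the name, keep only the first OS word."""
    m = ROW.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised result row: {line!r}")
    name = m["name"].replace(" ", "")   # "ARP1_ 0013477_ ..." -> "ARP1_0013477_..."
    os_name = m["os"].split()[0]        # e.g. "Linux"
    return (f"{name} {os_name} {m['avn']} {m['status']} "
            f"{m['sent']} {m['returned']} {m['cpu_h']} "
            f"{m['claimed']}/{m['granted']}")


if __name__ == "__main__":
    import sys
    # Usage sketch: python compact.py < FILE
    for raw in sys.stdin:
        if raw.strip():
            print(compact_row(raw))
```

Rows that do not fit the assumed layout raise a `ValueError` rather than being silently mangled, which seems the safer choice when the result is going to be pasted into a post.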
||
|