World Community Grid Forums
Thread Status: Active Total posts in this thread: 3596
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1330 Status: Recently Active Project Badges:
|
@Speedy51
The retries would have been "teed up" on the 13th, when those tasks passed their deadline, but because there seem to have been issues with sending out ARP1 tasks, it looks as if the tasks went out when they were not necessarily needed...
I agree that it looks wasteful, but I'm not sure if there's an automatic way of withdrawing a task that has been set up for sending but delayed for that long! -- I think it can be done manually (but I could be wrong, and I don't have time to do yet another code-dive into the standard BOINC server stuff at present...)
Cheers - Al.
P.S. This sort of thing has been mentioned in previous posts :-)
[Edited to fix a typo and add the name of the person I'm replying to...]
[Edit 2 times, last edit by alanb1951 at May 16, 2023 5:24:39 AM]
||
|
|
Speedy51
Veteran Cruncher New Zealand Joined: Nov 4, 2005 Post Count: 1326 Status: Offline Project Badges:
|
@Al
Regarding the automatic withdrawal of a task: I believe it can be done as long as the task has not already been started. I agree with what you're saying about the creation of the 2 tasks that I referred to in my post at the bottom of the previous page, but I'm not sure it applies in this instance. I don't believe they were stuck in the "send queue", as there was quite a short period between the 2nd task being returned and the other 2 being "sent out".
I am looking forward to the big runtime boost that I am going to get when all of my tasks finally "validate". If I don't get it, at least I am helping science.
||
|
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1330 Status: Recently Active Project Badges:
|
Quoting Speedy51: @Al Regarding the automatic withdrawal of a task: I believe it can be done as long as the task has not already been started. I agree with what you're saying about the creation of the 2 tasks that I referred to in my post at the bottom of the previous page, but I'm not sure it applies in this instance. I don't believe they were stuck in the "send queue", as there was quite a short period between the 2nd task being returned and the other 2 being "sent out".
Oops - sorry about the possible confusion; I was referring to stopping a task from being sent out at all, not cancelling a task that has been sent but hasn't started/reached its first checkpoint...
The critical time was 6 days after the tasks were sent out -- hence my reference to the 13th :-) -- the teeing up of retries is done by the transitioner when it spots a missed deadline or an error return (and in this case I'll bet those retries sat with status "Waiting to send" for a while...)
Quoting Speedy51: I am looking forward to the big runtime boost that I am going to get when all of my tasks finally "validate". If I don't get it, at least I am helping science.
Couldn't agree more about "helping science", though I wish they would sort out the ARP1 workflow properly so we could really help by moving it on a lot faster! (As Mike Gibson has pointed out, if they don't sort it out it'll still be running in 2027...)
Cheers - Al.
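For anyone who wants the deadline arithmetic spelled out, here is a minimal sketch of the rule described in the post: a retry is set up once a task passes a 6-day deadline without a result, or comes back with an error. This is not the BOINC transitioner's actual code. The function names and the send date of 7 May 2023 are hypothetical, chosen only so that the deadline lands on the 13th as mentioned.

```python
from datetime import datetime, timedelta

# Assumption taken from the post: the deadline is 6 days after sending.
DEADLINE = timedelta(days=6)


def deadline_for(sent_at):
    """Return the moment at which a task sent at `sent_at` becomes overdue."""
    return sent_at + DEADLINE


def needs_retry(sent_at, now, returned, errored=False):
    """Hypothetical rule: retry on an error return or a missed deadline."""
    if errored:
        return True
    return (not returned) and now > deadline_for(sent_at)


# Illustrative send time only, not taken from any real task record.
sent = datetime(2023, 5, 7, 12, 0)
print(deadline_for(sent))
print(needs_retry(sent, datetime(2023, 5, 13, 12, 1), returned=False))
```

A retry that is set up this way can still wait a long time before it is actually sent, which is the situation the post is guessing at.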
||
|
|
catchercradle
Senior Cruncher England Joined: Jan 16, 2009 Post Count: 169 Status: Offline Project Badges:
|
On the plus side, the latest retries I have crunched have all maxed out my broadband with no risk of RSI from hitting the retry button!
Is that all sorted now?
||
|
|
alanb1951
Veteran Cruncher Joined: Jan 20, 2006 Post Count: 1330 Status: Recently Active Project Badges:
|
Quoting catchercradle: On the plus side, the latest retries I have crunched have all maxed out my broadband with no risk of RSI from hitting the retry button! Is that all sorted now?
Probably not -- connectivity is only a major problem when there's lots of work flowing, and at present about the only work out there (for any WCG project) seems to be retries, so traffic loads are much lower!
Cheers - Al.
||
|
|
Hans Sveen
Veteran Cruncher Norge Joined: Feb 18, 2008 Post Count: 1014 Status: Offline Project Badges:
|
Hi!
The extreme ARP1_0033870_119 has now got 5 pending results. The initial receiver has now, after more than 9 days, returned a result.
When looking at my own results, valid or pending, I just noticed something peculiar: take a look at the checkpointing, they all have dates from February 2019!!??
An example (the extreme one mentioned before):
<core_client_version>7.20.2</core_client_version>
<![CDATA[
<stderr_txt>
INFO: Initializing
INFO: No state to restore. Start from the beginning.
Starting WRFMain
[15:53:24] INFO: Checkpoint taken at 2019-02-24_06:00:00
[18:04:44] INFO: Checkpoint taken at 2019-02-24_12:00:00
[20:04:14] INFO: Checkpoint taken at 2019-02-24_18:00:00
[21:23:45] INFO: Checkpoint taken at 2019-02-25_00:00:00
[23:02:05] INFO: Checkpoint taken at 2019-02-25_06:00:00
[01:14:35] INFO: Checkpoint taken at 2019-02-25_12:00:00
[03:20:42] INFO: Checkpoint taken at 2019-02-25_18:00:00
[04:34:19] INFO: Checkpoint taken at 2019-02-26_00:00:00
INFO: Simulation complete compressing output.
04:35:55 (6160): called boinc_finish(0)
</stderr_txt>
]]>
Is this right? It feels very odd!!
Greetings from Hans S.
[Edit 1 times, last edit by Hans Sveen at May 17, 2023 11:27:21 AM]
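The log above carries two different clocks on each checkpoint line: the bracketed time is the wall clock on the cruncher's machine, and the timestamp after "taken at" is a second time value that advances in fixed steps. A small sketch that separates the two might look like the following. The regular expression and the helper names are my own assumptions based on the lines quoted in this post, not anything provided by BOINC or WCG.

```python
import re
from datetime import datetime, timedelta

# Checkpoint lines copied from the stderr output quoted in the post.
LOG = """\
[15:53:24] INFO: Checkpoint taken at 2019-02-24_06:00:00
[18:04:44] INFO: Checkpoint taken at 2019-02-24_12:00:00
[20:04:14] INFO: Checkpoint taken at 2019-02-24_18:00:00
[21:23:45] INFO: Checkpoint taken at 2019-02-25_00:00:00
[23:02:05] INFO: Checkpoint taken at 2019-02-25_06:00:00
[01:14:35] INFO: Checkpoint taken at 2019-02-25_12:00:00
[03:20:42] INFO: Checkpoint taken at 2019-02-25_18:00:00
[04:34:19] INFO: Checkpoint taken at 2019-02-26_00:00:00
"""

# Assumed line shape: "[HH:MM:SS] INFO: Checkpoint taken at YYYY-MM-DD_HH:MM:SS"
PATTERN = re.compile(r"\[(\d{2}:\d{2}:\d{2})\] INFO: Checkpoint taken at (\S+)")


def parse_checkpoints(text):
    """Return a list of (wall_clock, logged_timestamp) datetime pairs."""
    rows = []
    for wall, stamp in PATTERN.findall(text):
        wall_t = datetime.strptime(wall, "%H:%M:%S")
        stamp_t = datetime.strptime(stamp, "%Y-%m-%d_%H:%M:%S")
        rows.append((wall_t, stamp_t))
    return rows


def wall_gaps(rows):
    """Wall-clock time between consecutive checkpoints, allowing for midnight."""
    gaps = []
    for (prev, _), (cur, _) in zip(rows, rows[1:]):
        delta = cur - prev
        if delta < timedelta(0):  # the wall clock rolled over midnight
            delta += timedelta(days=1)
        gaps.append(delta)
    return gaps


rows = parse_checkpoints(LOG)
for gap, (_, stamp) in zip(wall_gaps(rows), rows[1:]):
    print(stamp, "reached after", gap, "of wall-clock time")
```

The logged timestamps step forward by exactly 6 hours each time, while the wall-clock gaps vary between roughly 1.2 and 2.2 hours, so the two columns are clearly not measuring the same thing.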
||
|
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1412 Status: Offline Project Badges:
|
Quoting Hans Sveen: Is this right, feels very odd!!
Those are not the dates of the checkpoints, but the 2 days for which the weather data is analyzed. IIRC the first 48 hours were the 1st and 2nd of July 2018, and your task (batch 119) covers the 24th and 25th of February 2019. It's worded a bit oddly: "Checkpoint taken from" would have been better than "taken at".
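The date arithmetic in this explanation can be checked with a short sketch. It assumes, as recalled (IIRC) above, that the first 48-hour window started on 1 July 2018, and additionally assumes that the trailing number in the task name counts windows starting from 0. Both are assumptions for illustration rather than documented facts about ARP1, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption (from the "IIRC" in the post): the first window starts 1 July 2018.
FIRST_DAY = datetime(2018, 7, 1)
# Each task covers 48 hours of weather data, per the post.
WINDOW = timedelta(hours=48)


def generation_of(task_name):
    """Take the trailing number of a name such as 'ARP1_0033870_119'."""
    return int(task_name.rsplit("_", 1)[1])


def simulated_window(generation):
    """Return (start, end) of the 48 hours covered, numbering from 0."""
    start = FIRST_DAY + generation * WINDOW
    return start, start + WINDOW


start, end = simulated_window(generation_of("ARP1_0033870_119"))
print(start, "to", end)
```

Under those assumptions batch 119 starts at midnight on 24 February 2019 and ends at midnight on 26 February 2019, which matches the first (06:00 on the 24th) and last (00:00 on the 26th) timestamps in the quoted log.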
||
|
|
Hans Sveen
Veteran Cruncher Norge Joined: Feb 18, 2008 Post Count: 1014 Status: Offline Project Badges:
|
Thank You for the info Crystal Pellet!
|
||
|
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 1309 Status: Offline Project Badges:
|
I'm ARP empty right now, so I took the opportunity to upgrade BOINC. I can't get any SCCs right now, so I'm back to MCMs. I'm 3 days short of my 1 yr badge in MCM, so we will see if I get the badge or if ARPs or SCC show up on my machine.
Are ARPs flowing? new? resends? really slowly? I haven't gotten one in the last 24 hours. |
||
|
|
bfmorse
Senior Cruncher US Joined: Jul 26, 2009 Post Count: 448 Status: Offline Project Badges:
|
Sort of.
The last "new" ARP release I received was on May 11th. Everything since then has been resends based on "no reply" and a few "ERRORs". Since my queue size has recently been increased from ZERO to 0.005 days and my 12-process machines are running LESS than 1/3 full, I am able to clear out any resends rather quickly. I have also increased ARP's concurrent max from 2 to 5, but even that trough doesn't fill.
I have SCC DESELECTED because virtually ALL of its WUs were erroring out, and other than an occasional MCM, none of the other projects are getting WUs. I believe OPN has exhausted its current run.
||
|
|
|