Thread Status: Active
Total posts in this thread: 3596
This topic has been viewed 5924495 times and has 3595 replies
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 1330
Status: Recently Active
Re: Work Available

@Speedy51

The retries would've been "teed up" on the 13th when those tasks passed their deadline, but because there seem to have been issues associated with sending out ARP1 tasks it looks as if the tasks went out when not necessarily needed...

I agree that it looks wasteful, but I'm not sure if there's an automatic way of withdrawing a task that has been set up for sending but delayed for that long! -- I think it can be done manually (but I could be wrong, and don't have time to do yet another code-dive into the standard BOINC server stuff at present...)

Cheers - Al.

P.S. This sort of thing has been mentioned in previous posts :-)

[Edited to fix a typo and add the name of person I'm replying to...]
----------------------------------------
[Edit 2 times, last edit by alanb1951 at May 16, 2023 5:24:39 AM]
[May 16, 2023 5:20:13 AM]
Speedy51
Veteran Cruncher
New Zealand
Joined: Nov 4, 2005
Post Count: 1326
Status: Offline
Re: Work Available

@Al

With regard to the automatic withdrawal of a task, I believe it can be done as long as the task has not already been started. I agree with what you're saying about the creation of the 2 tasks that I referred to in my post at the bottom of the previous page, but I'm not sure this applies in this instance, as I don't believe it was stuck in the "send queue": there was quite a short period between the 2nd task being returned and the other 2 being "sent out".

I am looking forward to the big runtime boost that I am going to get when all of my tasks finally "validate". If I don't, at least I am helping science.
[May 16, 2023 8:27:05 AM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 1330
Status: Recently Active
Re: Work Available

> @Al
>
> With regard to the automatic withdrawal of a task, I believe it can be done as long as the task has not already been started.

Oops - sorry about the possible confusion; I was referring to stopping a task from being sent out at all, not cancelling a task that has been sent but hasn't started/reached first checkpoint...

> I agree with what you're saying about the creation of the 2 tasks that I referred to in my post at the bottom of the previous page, but I'm not sure this applies in this instance, as I don't believe it was stuck in the "send queue": there was quite a short period between the 2nd task being returned and the other 2 being "sent out".

The critical time was 6 days after the tasks were sent out -- hence my reference to the 13th :-) -- the teeing up of retries is done by the transitioner when it spots a missed deadline or an error return (and in this case I'll bet those retries were sat with status "Waiting to send" for a while...)

> I am looking forward to the big runtime boost that I am going to get when all of my tasks finally "validate". If I don't, at least I am helping science.

Couldn't agree more about "helping science", though I wish they would sort out the ARP1 workflow properly so we could really help by moving it on a lot faster! (As Mike Gibson has pointed out, if they don't sort it out it'll still be running in 2027...)
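
For anyone who wants to check the date arithmetic behind "the 13th", here is a minimal sketch. It assumes the 6-day deadline mentioned above; the send date of May 7 is not stated anywhere in the thread and is simply inferred by counting back 6 days from the 13th. The function name is made up for illustration and is not part of the BOINC server code.

```python
from datetime import date, timedelta

# Assumption: a 6-day deadline, as described in the post above.
DEADLINE_DAYS = 6


def retry_teed_up_on(sent: date, deadline_days: int = DEADLINE_DAYS) -> date:
    """Date on which an unreturned task passes its deadline.

    Hypothetical helper: this only models the date arithmetic, not the
    transitioner itself.
    """
    return sent + timedelta(days=deadline_days)


# Hypothetical send date, inferred by counting back 6 days from May 13.
sent = date(2023, 5, 7)
print(retry_teed_up_on(sent))
```

How long a retry then sits in "Waiting to send" is a separate question that this sketch does not attempt to model.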

Cheers - Al.
[May 16, 2023 9:01:30 AM]
catchercradle
Senior Cruncher
England
Joined: Jan 16, 2009
Post Count: 169
Status: Offline
Re: Work Available

On the plus side, the latest retries I have crunched have all maxed out my broadband with no risk of RSI from hitting the retry button!

Is that all sorted now?
[May 16, 2023 11:26:37 AM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 1330
Status: Recently Active
Re: Work Available

> On the plus side, the latest retries I have crunched have all maxed out my broadband with no risk of RSI from hitting the retry button!
>
> Is that all sorted now?

Probably not -- connectivity is only a major problem when there's lots of work flowing, and at present about the only work out there (for any WCG project) seems to be retries, so much lower traffic loads!

Cheers - Al.
[May 16, 2023 12:04:23 PM]
Hans Sveen
Veteran Cruncher
Norge
Joined: Feb 18, 2008
Post Count: 1014
Status: Offline
Re: Work Available

Hi!
The extreme case, ARP1_0033870_119, has now got 5 pending results.
The initial recipient has now, after more than 9 days, returned a result.

When looking at my own results, valid or pending, I just noticed something peculiar:
take a look at the checkpointing; they all have dates from February 2019!!??

An example (the extreme case mentioned above):

<core_client_version>7.20.2</core_client_version>
<![CDATA[
<stderr_txt>
INFO: Initializing
INFO: No state to restore. Start from the beginning.
Starting WRFMain
[15:53:24] INFO: Checkpoint taken at 2019-02-24_06:00:00
[18:04:44] INFO: Checkpoint taken at 2019-02-24_12:00:00
[20:04:14] INFO: Checkpoint taken at 2019-02-24_18:00:00
[21:23:45] INFO: Checkpoint taken at 2019-02-25_00:00:00
[23:02:05] INFO: Checkpoint taken at 2019-02-25_06:00:00
[01:14:35] INFO: Checkpoint taken at 2019-02-25_12:00:00
[03:20:42] INFO: Checkpoint taken at 2019-02-25_18:00:00
[04:34:19] INFO: Checkpoint taken at 2019-02-26_00:00:00
INFO: Simulation complete compressing output.
04:35:55 (6160): called boinc_finish(0)

</stderr_txt>
]]>

Is this right? It feels very odd!!

Greetings from

Hans S.
----------------------------------------
[Edit 1 times, last edit by Hans Sveen at May 17, 2023 11:27:21 AM]
[May 17, 2023 9:09:28 AM]
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1412
Status: Offline
Re: Work Available

> Is this right? It feels very odd!!

Those are not the dates on which the checkpoints were taken, but the 2 days of weather data being analyzed.
IIRC the first 48 hours were from the 1st and 2nd of July 2018, and your task (batch 119) covers the 24th and 25th of February 2019.
It's worded a bit oddly. "Checkpoint taken from" would have been better than "taken at".
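
To make the two clocks in that log easier to tell apart, here is a rough sketch that parses the checkpoint lines quoted earlier in the thread. It rests on two assumptions not confirmed anywhere above: that the bracketed time is the host's wall clock, and that consecutive checkpoints are less than 24 hours apart (so a backwards jump means midnight was crossed). The function names are made up for illustration.

```python
import re
from datetime import datetime, timedelta

# Checkpoint lines copied from the stderr output quoted in the thread.
STDERR = """\
[15:53:24] INFO: Checkpoint taken at 2019-02-24_06:00:00
[18:04:44] INFO: Checkpoint taken at 2019-02-24_12:00:00
[20:04:14] INFO: Checkpoint taken at 2019-02-24_18:00:00
[21:23:45] INFO: Checkpoint taken at 2019-02-25_00:00:00
[23:02:05] INFO: Checkpoint taken at 2019-02-25_06:00:00
[01:14:35] INFO: Checkpoint taken at 2019-02-25_12:00:00
[03:20:42] INFO: Checkpoint taken at 2019-02-25_18:00:00
[04:34:19] INFO: Checkpoint taken at 2019-02-26_00:00:00
"""

LINE = re.compile(r"\[(\d{2}:\d{2}:\d{2})\] INFO: Checkpoint taken at (\S+)")


def parse_checkpoints(text):
    """Return (wall_clock, simulated_time) pairs, one per checkpoint line."""
    return [
        (
            datetime.strptime(wall, "%H:%M:%S"),
            datetime.strptime(sim, "%Y-%m-%d_%H:%M:%S"),
        )
        for wall, sim in LINE.findall(text)
    ]


def wall_elapsed(pairs):
    """Wall-clock time from first to last checkpoint.

    Assumes each step is shorter than 24 hours, so a negative step
    means the clock rolled over midnight.
    """
    total = timedelta()
    for (prev, _), (cur, _) in zip(pairs, pairs[1:]):
        step = cur - prev
        if step < timedelta():
            step += timedelta(days=1)
        total += step
    return total


pairs = parse_checkpoints(STDERR)
sim_steps = {b[1] - a[1] for a, b in zip(pairs, pairs[1:])}
print("simulated step(s):", sim_steps)
print("simulated span:", pairs[-1][1] - pairs[0][1])
print("wall-clock span:", wall_elapsed(pairs))
```

The simulated timestamps advance in even 6-hour steps through the 24th and 25th of February 2019, while the bracketed times advance irregularly, which fits the reading that the date in each line is model time rather than the moment the checkpoint was written.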
[May 17, 2023 9:34:55 AM]
Hans Sveen
Veteran Cruncher
Norge
Joined: Feb 18, 2008
Post Count: 1014
Status: Offline
Re: Work Available

Thank you for the info, Crystal Pellet!
[May 17, 2023 12:34:02 PM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 1309
Status: Offline
Re: Work Available

I'm ARP empty right now, so I took the opportunity to upgrade BOINC. I can't get any SCCs right now, so I'm back to MCMs. I'm 3 days short of my 1 yr badge in MCM, so we will see if I get the badge or if ARPs or SCC show up on my machine.

Are ARPs flowing? new? resends? really slowly? I haven't gotten one in the last 24 hours.
[May 17, 2023 8:27:34 PM]
bfmorse
Senior Cruncher
US
Joined: Jul 26, 2009
Post Count: 448
Status: Offline
Re: Work Available

Sort of.
Last ARP "new" release I received was May 11th.
Everything since then has been resends based on "no reply" and a few "ERRORs".

Since my queue size has recently been increased from ZERO to 0.005 days and my 12-process machines are running LESS than 1/3 full, I am able to clear out any resends rather quickly.
I have also increased ARP's concurrent max from 2 to 5 - but, even that trough doesn't fill.
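
For scale, the figures above can be converted with a few lines of arithmetic. This is only a sketch of the unit conversion; the helper names are made up, and it says nothing about how the BOINC client actually uses the buffer setting.

```python
from datetime import timedelta


def buffer_minutes(days: float) -> float:
    """Convert a work-buffer setting given in days into minutes."""
    return timedelta(days=days).total_seconds() / 60


def third_full(slots: int) -> int:
    """Number of busy slots that corresponds to one third of the machine."""
    return slots // 3


print(buffer_minutes(0.005))  # buffer size in minutes
print(third_full(12))         # slots in use at exactly 1/3 full
```

So a 0.005-day queue is only a few minutes of buffered work, and "less than 1/3 full" on a 12-process machine means fewer than 4 tasks running at once.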

I have SCC deselected due to virtually ALL WUs erroring out, and other than an occasional MCM, none of the others are getting WUs. I believe OPN has exhausted its current run.
[May 17, 2023 9:43:56 PM]