Thread Status: Active
Total posts in this thread: 102
Posts: 102   Pages: 11   [ Previous Page | 1 2 3 4 5 6 7 8 9 10 | Next Page ]
This topic has been viewed 14068 times and has 101 replies
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7697
Status: Offline
Re: Not enough time given for a task

Generally with WCG, if re-sends go out after the halfway point of a deadline, the time allowed is only half the standard time, so they only go to those machines which are considered 'reliable', which effectively is those which regularly return units within 3.5 days.

Unless I am misunderstanding your statement, I do not believe that is correct. I believe that a resend would only be sent out if there is an error, an invalid, or no reply. The no reply would only come after the 7-day deadline has passed. The resend would then have a deadline of 3.5 days (half of the original deadline), which should still be sufficient for most machines deemed reliable by the WCG algorithm for re-sends.
Edit: If you notice on mine, which ran for a little over 5 days, it completed within the 7 days and there was no resend sent out. The quorum is 2 and the other one only ran 22.39 hours.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
----------------------------------------
[Edit 1 times, last edit by Sgt.Joe at Aug 17, 2020 4:46:45 PM]
[Aug 17, 2020 4:43:20 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Not enough time given for a task

Can't remember if resends issued before the original deadline halftime (error, invalid) get full original deadline or something that equates to original deadline minus time already passed. The resends I see seem to always have half the original deadline whether issued on the first day or the last day before no reply. Checking what it was requires querying the database, which affects performance.

BTW, most of the No Reply resends I get are aborted by the server before my 1-day-deep buffer gets to them, meaning that on the 8th day the original still came back good, almost biblical. Great: no double redundancy due to slow returners, with 24 hours of crunching going to waste at that.
----------------------------------------
[Edit 1 times, last edit by Former Member at Aug 17, 2020 4:59:27 PM]
[Aug 17, 2020 4:53:35 PM]
blyons123
Cruncher
Joined: Jan 2, 2007
Post Count: 9
Status: Offline
Re: Not enough time given for a task

That same comment came from someone else a week or so ago, but no: 18 hours of crunching on a 7-day deadline is still only about 2.5 hours a day, except that ARP1 checkpoints only once per 12.5% of progress, so you have to run BOINC at least 3 hours uninterrupted to reach the next save point.

Why 7 days? The next step depends on the previous result, i.e. until you finish your result and report it, there won't be a next task to send out. There are 180 48-hour simulations in a sequence. If everyone returned the result at the maximum allowed time, it would take 1260 days (7x180) to do a full 1-year simulation. Too long.
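
The figures in the passage above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the 18-hour runtime is the example figure from the quote, not a fixed property of ARP1 tasks, and the variable names are made up for illustration.

```python
# Figures quoted in the thread. The 18-hour runtime is one poster's example;
# actual runtimes vary by machine.
RUNTIME_HOURS = 18.0
DEADLINE_DAYS = 7
CHECKPOINT_FRACTION = 0.125   # one checkpoint per 12.5% of progress
STEPS_IN_SEQUENCE = 180       # 48-hour simulations per sequence

# Average crunching needed per calendar day to finish inside the deadline.
hours_per_day = RUNTIME_HOURS / DEADLINE_DAYS

# Uninterrupted run time needed to get from one checkpoint to the next.
hours_between_checkpoints = RUNTIME_HOURS * CHECKPOINT_FRACTION

# Worst case: every step of the sequence comes back right at the deadline.
worst_case_days = DEADLINE_DAYS * STEPS_IN_SEQUENCE

print(f"Crunching needed per day: {hours_per_day:.2f} h")
print(f"Time between checkpoints: {hours_between_checkpoints:.2f} h")
print(f"Worst-case sequence duration: {worst_case_days} days")
```

On an 18-hour task the gap between checkpoints works out to 2.25 hours, so the "at least 3 hours" in the quote presumably leaves some margin for slower machines.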

Can't do that? Then don't opt in.

You're wrongly assuming a lot!
I'm running many projects at the same time at only 25% CPU. Also, I didn't know how long a task would take. BOINC doesn't show an actual estimate based on actual time per computation. 1 hr remaining could be 4 actual hours based on settings! And like microbiome, it could run for hours showing no progress. 3-4 weeks would work. 1 year is ridiculous.
[Aug 17, 2020 7:08:19 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Not enough time given for a task

If I remember back many, many moons, all resends were sent out with half the original time due. Then folks started using AWS instances that allowed the instance to be taken if needed for higher priority work. This caused a major problem due to the number of resends and not enough reliable machines available to execute those resends. The feeder ground to a halt. About that time, only certain types of resends were designated as true errors and got the expedited return time. WUs that were designated as "Detached" did not get the expedited return and were resent with the standard 7-day turnaround. I think this happened around the time HCC was running on the grid. Uggh!!! I think I've been here too long.
----------------------------------------
[Edit 1 times, last edit by Former Member at Aug 17, 2020 7:27:16 PM]
[Aug 17, 2020 7:25:57 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Not enough time given for a task

That same comment came from someone else a week or so ago, but no: 18 hours of crunching on a 7-day deadline is still only about 2.5 hours a day, except that ARP1 checkpoints only once per 12.5% of progress, so you have to run BOINC at least 3 hours uninterrupted to reach the next save point.

Why 7 days? The next step depends on the previous result, i.e. until you finish your result and report it, there won't be a next task to send out. There are 180 48-hour simulations in a sequence. If everyone returned the result at the maximum allowed time, it would take 1260 days (7x180) to do a full 1-year simulation. Too long.

Can't do that? Then don't opt in.

You're wrongly assuming a lot!
I'm running many projects at the same time at only 25% CPU. Also, I didn't know how long a task would take. BOINC doesn't show an actual estimate based on actual time per computation. 1 hr remaining could be 4 actual hours based on settings! And like microbiome, it could run for hours showing no progress. 3-4 weeks would work. 1 year is ridiculous.

Your OP gave such an abundance of information, and in the absence of further replying on your part...

1) BOINC does give an initial estimated runtime, based on the compute capability it reports to the project server, and then gives a remaining estimated runtime.
2) The hours for which MIP supposedly shows no progress are about the time those tasks take to finish on my machine. It shows regular progress in finely granulated percent fractions.
3) At 25% of CPU time or CPU threads? Assuming time, the crunch would take around 4 calendar days, if on 24/7. Running many projects, I'd say if an ARP hits your machine and BOINC learns the science's specifics, it's likely to run in high priority.
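
As a rough sketch of the arithmetic behind point 3, assuming the 25% is a limit on CPU time (BOINC running the task only a quarter of the time) and the machine is on 24/7. The function name is made up for illustration, and competition from other projects is ignored, so real-world figures would be longer.

```python
def wall_clock_days(cpu_hours: float, cpu_time_fraction: float) -> float:
    """Calendar days needed when only a fraction of CPU time is available.

    Assumes the machine runs 24/7 and ignores time given to other projects.
    """
    if not 0 < cpu_time_fraction <= 1:
        raise ValueError("cpu_time_fraction must be in (0, 1]")
    return cpu_hours / cpu_time_fraction / 24.0


# 18 CPU hours is the example runtime mentioned earlier in the thread;
# 24 CPU hours is a hypothetical slower machine.
for cpu_hours in (18, 24):
    days = wall_clock_days(cpu_hours, 0.25)
    print(f"{cpu_hours} CPU hours at 25% CPU time: {days:.1f} days")
```

So a task needing somewhere between 18 and 24 CPU hours lands at roughly 3 to 4 calendar days under these assumptions, still inside the 7-day deadline.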

More assuming on my part: I think your BOINC configuration is broken. If you have not already done so, at the very least switch on Leave Application In Memory when suspended; otherwise an interrupted task unloads and regresses to the previous checkpoint, losing the progress it has made since.
[Aug 17, 2020 7:45:35 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Not enough time given for a task

If I remember back many, many moons, all resends were sent out with half the original time due. Then folks started using AWS instances that allowed the instance to be taken if needed for higher priority work. This caused a major problem due to the number of resends and not enough reliable machines available to execute those resends. The feeder ground to a halt. About that time, only certain types of resends were designated as true errors and got the expedited return time. WUs that were designated as "Detached" did not get the expedited return and were resent with the standard 7-day turnaround. I think this happened around the time HCC was running on the grid. Uggh!!! I think I've been here too long.

Detached tasks at times get recovered by the same machine ('lost task' or something) and then indeed get the original deadline. There can't be much time between the detach and the 'lost task' recovery, or else the task gets assigned to a different machine. I don't know what deadline those get. It doesn't really bother me, whatever it is, since my buffer is only 1 day deep... whatever arrives is crunched within 24 hours, except ARP. I always have one ready to start to replace the one finishing, so they get back no later than 48 hours, soon enough to still get the occasional repair, which then often gets cancelled because the original No Reply still came back while the copy sat waiting for 1 day.
[Aug 17, 2020 8:00:54 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12439
Status: Offline
Re: Not enough time given for a task

lavaflow

My post of 22 hours ago about re-sends after the halfway point referred to errors, invalids and aborts after the halfway point, and also to no reply, which would be at the end point of the 7-day deadline. All of those circumstances would generate a re-send with half the deadline. If before the halfway point, then they would be sent with the full deadline.

Mike
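
For anyone who prefers to see the rule spelled out, here is a minimal Python sketch of it as described in this post. It is not taken from the WCG scheduler; the function name and the handling of a resend landing exactly on the halfway point are assumptions.

```python
from datetime import datetime, timedelta

ORIGINAL_DEADLINE = timedelta(days=7)


def resend_deadline(sent: datetime, resend_at: datetime,
                    original: timedelta = ORIGINAL_DEADLINE) -> timedelta:
    """Length of the deadline given to a resend, per the rule in this post.

    Before the halfway point of the original deadline: full deadline.
    At or after the halfway point (including no reply at the end): half.
    Treating "exactly halfway" as "after" is an assumption.
    """
    elapsed = resend_at - sent
    if elapsed < original / 2:
        return original
    return original / 2


sent = datetime(2020, 8, 10, 12, 0)
for days_later in (1, 4, 7):
    allowed = resend_deadline(sent, sent + timedelta(days=days_later))
    print(f"Resend after {days_later} day(s): deadline of {allowed}")
```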
[Aug 17, 2020 10:25:24 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Not enough time given for a task

lavaflow

My post of 22 hours ago about re-sends after the halfway point referred to errors, invalids and aborts after the halfway point, and also to no reply, which would be at the end point of the 7-day deadline. All of those circumstances would generate a re-send with half the deadline. If before the halfway point, then they would be sent with the full deadline.

Mike

That logic doesn't make a lot of sense to me. What does the execution length have to do with the return deadline? An error is an error, and the WU would need to be re-executed in its entirety. I'm not saying it doesn't happen, it just doesn't make a lot of sense.
[Aug 18, 2020 3:01:14 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Not enough time given for a task

"If before the half way point then they would be sent with the full deadline."
Novel to me. Until just now I always thought that the base deadline of a resend was half the original deadline (7 days in the case of OPN1). Until not too long ago it was 30-35%, but that caused feeder clogging in some circumstances, so uplinger upped it to half, 50%. Doubting Thomas that I am, I went to look for log copies of the OPN1 _2 tasks recently issued to my machine. There were more than I ever thought, so I guess that answers the reliable state as well:

Result Name            Receive Date        Deadline         Time allowed
OPN1_0007543_08171_2   17/08/20 03:06:37   23/08/20 08:37   more than 3.5 days but less than 7
OPN1_0006958_07277_2   16/08/20 08:04:53   19/08/20 01:43   less than 3.5 days
OPN1_0007394_06025_2   15/08/20 17:14:47   21/08/20 23:28   more than 3.5 days but less than 7
OPN1_0006628_03918_2   15/08/20 07:14:48   16/08/20 17:31   less than 3.5 days (not even 1.5 days)
OPN1_0006747_09877_2   15/08/20 04:57:55   18/08/20 00:03   less than 3.5 days
OPN1_0006846_02540_2   14/08/20 15:58:45   17/08/20 10:22   less than 3.5 days
OPN1_0006668_01999_2   14/08/20 04:41:59   16/08/20 21:38   less than 3.5 days
OPN1_0007329_11044_2   13/08/20 11:49:23   19/08/20 17:23   more than 3.5 days but less than 7
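
To take the guesswork out of the "more than / less than 3.5 days" notes, the time allowed for each result can be computed from the two timestamps. A small Python sketch, assuming the dates are day/month/year and both columns are in the same time zone; the function name is made up for illustration.

```python
from datetime import datetime

# (result name, receive date, deadline) copied from the list above.
RESENDS = [
    ("OPN1_0007543_08171_2", "17/08/20 03:06:37", "23/08/20 08:37"),
    ("OPN1_0006958_07277_2", "16/08/20 08:04:53", "19/08/20 01:43"),
    ("OPN1_0007394_06025_2", "15/08/20 17:14:47", "21/08/20 23:28"),
    ("OPN1_0006628_03918_2", "15/08/20 07:14:48", "16/08/20 17:31"),
    ("OPN1_0006747_09877_2", "15/08/20 04:57:55", "18/08/20 00:03"),
    ("OPN1_0006846_02540_2", "14/08/20 15:58:45", "17/08/20 10:22"),
    ("OPN1_0006668_01999_2", "14/08/20 04:41:59", "16/08/20 21:38"),
    ("OPN1_0007329_11044_2", "13/08/20 11:49:23", "19/08/20 17:23"),
]


def allowed_days(received: str, deadline: str) -> float:
    """Days between receipt and deadline (dates assumed to be DD/MM/YY)."""
    start = datetime.strptime(received, "%d/%m/%y %H:%M:%S")
    end = datetime.strptime(deadline, "%d/%m/%y %H:%M")
    return (end - start).total_seconds() / 86400


for name, received, deadline in RESENDS:
    days = allowed_days(received, deadline)
    bucket = "more than 3.5 days" if days > 3.5 else "less than 3.5 days"
    print(f"{name}  {days:5.2f} days  ({bucket})")
```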

The time span allowed is all over the place, just never the full 7 days the original had. It seems to support what I speculated earlier, "Can't remember if resends issued before the original deadline halftime (error, invalid) get full original deadline or something that equates to original deadline minus time already passed.", i.e. the resend's deadline is the same as the original's.

Oddly, there are those that get less than 3.5 days, sometimes close to or less than 2. Maybe it's half of the half time in some instances. Clear as mud.
----------------------------------------
[Edit 1 times, last edit by Former Member at Aug 18, 2020 3:14:53 AM]
[Aug 18, 2020 3:11:22 AM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12439
Status: Offline
Re: Not enough time given for a task

I was referring to the deadline and not to execution time.
[Aug 18, 2020 9:42:53 AM]