Total posts in this thread: 18
This topic has been viewed 14108 times and has 17 replies
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2346
Re: Insane short time to complete, tasks get aborted too early

I woke up this morning to find that the estimated time had dropped to 2 minutes and 59 seconds again, and wondered what would happen to this task. No surprise: with 106 jobs aboard, it was aborted after 90 minutes, sent to the moon after its 101st job.

exceeded elapsed time limit 5392.30 (943491.36G/174.97G)

OPNG_0050331_00093_1--   Linux Debian   -     In Progress   6/12/21 09:51:47   6/15/21 09:51:47   0.00   0.0 / 0.0
OPNG_0050331_00093_0--   Linux Ubuntu   728   Error         6/11/21 01:32:59   6/12/21 09:51:41   0.08   1.3 / 0.0
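
The two numbers in the error message divide out to the limit itself (943491.36G / 174.97G is about 5392.30 seconds), so the limit appears to be the task's rsc_fpops_bound divided by the device speed. A minimal sketch of that arithmetic follows, assuming the estimated run time is derived the same way from rsc_fpops_est; the two fpops values are the ones that show up in client_state.xml for these tasks, and the helper name `seconds` is made up for illustration.

```python
# Figures quoted in this thread for the June 2021 task. The formulas are an
# assumption about how the BOINC client derives its numbers, not a quote
# from its source code.
FPOPS_EST = 31449712079576.0     # <rsc_fpops_est> value in client_state.xml
FPOPS_BOUND = 943491362387280.0  # <rsc_fpops_bound> value in client_state.xml
FLOPS = 174.97e9                 # device speed from the error message (174.97G)


def seconds(fpops: float, flops: float) -> float:
    """Time needed to perform `fpops` operations at `flops` operations per second."""
    return fpops / flops


estimated = seconds(FPOPS_EST, FLOPS)
limit = seconds(FPOPS_BOUND, FLOPS)
est_min, est_sec = divmod(int(estimated), 60)

print(f"estimated run time: {est_min} min {est_sec} s")
print(f"elapsed time limit: {limit:.2f} s ({limit / 60:.1f} min)")
print(f"bound / estimate:   {FPOPS_BOUND / FPOPS_EST:.0f}x")
```

Under these assumptions the estimate comes out just under three minutes and the limit at roughly 90 minutes, which matches what the task showed. The bound is 30 times the estimate, so an estimate that is far too low drags the limit down with it.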


EDIT: Drastic measures. From now on I will be aborting received OPNG tasks with too many jobs aboard.
----------------------------------------
[Edit 2 times, last edit by adriverhoef at Jun 12, 2021 8:45:44 PM]
[Jun 12, 2021 10:46:40 AM]
PMH_UK
Veteran Cruncher
UK
Joined: Apr 26, 2007
Post Count: 786

Adri,
Thanks for the script.
The queues on my slow Intel GPUs are a day or more, so I need to make the change every day.
Would the addition below to your script correctly change the estimate and the limit?
Having re-checked, I probably don't need to change the estimate; if anything, it is high.

# Stop BOINC:
sudo systemctl stop boinc-client
# Increase the estimated time for OPNG tasks:
sudo perl -w -i -p -e '
if (($b,$v,$e) = /(<rsc_fpops_est>)(31449712079576)(\.000000<\/rsc_fpops_est>)/) {
$v = 2 * $v;
s//$b$v$e/;
}
if (($b,$v,$e) = /(<rsc_fpops_bound>)(943491362387280)(\.000000<\/rsc_fpops_bound>)/) {
$v = 2 * $v;
s//$b$v$e/;
}
' $CLIENT_STATE
# Restart BOINC:
sudo systemctl start boinc-client

How do you post without formatting getting mangled?

Paul.
----------------------------------------
[Edit 1 times, last edit by PMH_UK at Jun 12, 2021 4:13:07 PM]
[Jun 12, 2021 4:11:03 PM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2346

Hi Paul,
First of all, to answer your question about formatting: you have to use the 'Code' button (underneath 'B', 'I', 'U', 'S', 'Size', 'Font' and 'Color', between 'Email', 'Image' and 'List', 'Quote'). Select the text to be formatted in your message, click 'Code', and presto!

Furthermore, you wrote:
"Adri, thanks for the script."
If you can put it to use, then that sounds great to me.

"The queues on my slow Intel GPUs are a day or more, so I need to make the change every day."
My queues are about one day, too; the only problematic system, as far as the estimated time is concerned, is a laptop.

"Would the addition below to your script correctly change the estimate and the limit?"
I've tested it without the -i option and it works as expected. So, you did a good job there! :-)
With formatting codes it would look like:

# Stop BOINC:
sudo systemctl stop boinc-client
# Double the estimate (rsc_fpops_est) and the limit (rsc_fpops_bound) for OPNG tasks:
sudo perl -w -i -p -e '
    if (($b,$v,$e) = /(<rsc_fpops_est>)(31449712079576)(\.000000<\/rsc_fpops_est>)/) {
        $v = 2 * $v;
        s//$b$v$e/;   # empty pattern reuses the last successful match
    }
    if (($b,$v,$e) = /(<rsc_fpops_bound>)(943491362387280)(\.000000<\/rsc_fpops_bound>)/) {
        $v = 2 * $v;
        s//$b$v$e/;
    }
' "$CLIENT_STATE"
# Restart BOINC:
sudo systemctl start boinc-client
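
For anyone who would rather check the substitution on a copy of the text before touching the real file, here is a rough Python sketch of the same idea. It only works on a string, the function name `double_tag_value` is made up for this illustration, and it assumes the values are written with a `.000000` suffix as in the script above.

```python
def double_tag_value(xml_text: str, tag: str, old_value: int) -> str:
    """Replace <tag>old_value.000000</tag> with twice the value.

    Like the Perl one-liner, this only touches the exact value given,
    so running it a second time on the result changes nothing.
    """
    old = f"<{tag}>{old_value}.000000</{tag}>"
    new = f"<{tag}>{2 * old_value}.000000</{tag}>"
    return xml_text.replace(old, new)


# Two sample lines in the form the Perl patterns expect.
sample = (
    "<rsc_fpops_est>31449712079576.000000</rsc_fpops_est>\n"
    "<rsc_fpops_bound>943491362387280.000000</rsc_fpops_bound>\n"
)

# Only the bound (the elapsed time limit) is doubled here; the estimate stays.
patched = double_tag_value(sample, "rsc_fpops_bound", 943491362387280)
print(patched)
```

Going by the ratio shown in the 'exceeded elapsed time limit' message, doubling the bound should roughly double the limit, from about 5392 seconds to about 10785 seconds (just under three hours) on a device rated at 174.97G.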

[Jun 12, 2021 5:17:53 PM]
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1404

Is it just a coincidence?
All 4 tasks I got from batch 53k contain more than 100 jobs each: 117, 132, 138 and 118.
[Jun 16, 2021 7:37:26 AM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2346

Sigh. It's happening again: too many jobs in an OPNG-task. Two OPNG-tasks have already been aborted over the past day by the BOINC-client because of the 'exceeded elapsed time limit 7028.48 (1257988.48G/178.98G)' error message after only 1 hour and 55 minutes of execution time.

The first one that was ServerClient Aborted alarmed me, and I immediately User Aborted all OPNG-tasks with more than 90 jobs inside.
Lo and behold, a few hours later an OPNG-task with 74 jobs inside was ServerClient Aborted as well, after 66 of its jobs had been processed.

So I guess I will be aborting all OPNG-tasks with too many jobs inside again, as I did half a year ago, because every new task that arrives can reset the estimated time to three minutes or even less.
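
How many jobs count as "too many" depends on the limit and on how long one job takes on the device in question. A small sketch of that rule of thumb, assuming all jobs in a task take about the same time; the function names and the 100 seconds per job are hypothetical, only the 7028.48 second limit comes from the error message above.

```python
import math


def max_jobs_within_limit(limit_s: float, seconds_per_job: float) -> int:
    """Largest number of equally long jobs that finish within limit_s seconds."""
    if seconds_per_job <= 0:
        raise ValueError("seconds_per_job must be positive")
    return math.floor(limit_s / seconds_per_job)


def should_abort(jobs_in_task: int, limit_s: float, seconds_per_job: float) -> bool:
    """True if the task holds more jobs than fit inside the elapsed time limit."""
    return jobs_in_task > max_jobs_within_limit(limit_s, seconds_per_job)


LIMIT_S = 7028.48        # limit from the error message
SECONDS_PER_JOB = 100.0  # hypothetical per-job time; measure this per device

print(max_jobs_within_limit(LIMIT_S, SECONDS_PER_JOB))
print(should_abort(74, LIMIT_S, SECONDS_PER_JOB))
print(should_abort(66, LIMIT_S, SECONDS_PER_JOB))
```

Since the limit follows the estimate, and the estimate can be reset by any newly arrived task, the threshold would have to be recomputed whenever the estimate changes.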
----------------------------------------
[Edit 2 times, last edit by adriverhoef at Jan 13, 2022 1:14:44 AM]
[Jan 12, 2022 11:51:09 PM]
alanb1951
Veteran Cruncher
Joined: Jan 20, 2006
Post Count: 1317

Yup, Adri -- the average number of jobs in tasks I've received since the start of 11th January has been over 80 - fortunately I have GPUs that can get through the work in under half an hour :-)

It would probably be an idea for the plan class for Intel GPUs to be designed to exclude the slow and ancient ones, if that's possible. Alternatively, perhaps a cap on the number of jobs in a task (as they did for OPN1 when it had lots of little jobs per task and caused havoc on some systems...)

I've just tried to use "Contact us" to ask if they can cap the jobs per task, but I don't know whether it got through -- there's no acknowledgment given on hitting Submit, it just returns to the "Contact us" page with all the fields blank! We shall see...

Cheers - Al.
[Jan 13, 2022 1:25:05 AM]
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 2494

I can crunch these huge WUs without a problem on my old and slow GTX660M.
They do take 5-6 hours, but they finish just fine.

This WU contains 95 jobs, but the old GPU crunched it without any problems:
https://www.worldcommunitygrid.org/contribution/workunit/112536555

Added: I'll let my Intel HD4600 crunch a few of them too, just to check it out. But previously, when we had WUs with high job counts, the HD4600 had no problems finishing them.
----------------------------------------
[Edit 1 times, last edit by Grumpy Swede at Jan 13, 2022 1:37:09 AM]
[Jan 13, 2022 1:31:55 AM]
Keith Myers
Senior Cruncher
USA
Joined: Apr 6, 2021
Post Count: 193

All of my errored tasks get through 70-80% of the jobs contained in the work unit.
They usually fail around job #70 or so.
----------------------------------------

A proud member of the OFA (Old Farts Association)
[Jan 14, 2022 3:25:01 AM]