Thread Status: Active
Total posts in this thread: 3596
This topic has been viewed 5931525 times and has 3595 replies
catchercradle
Senior Cruncher
England
Joined: Jan 16, 2009
Post Count: 171
Status: Offline
Re: Work Available

Haven't paid enough attention to know if any of my tasks are extremes. My Ryzen 7 is only picking up the odd resend (2 tasks on their third run are currently crunching).
[Sep 19, 2022 7:32:36 AM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2355
Status: Offline
Re: Work Available

I'm sorry to trouble you, Mike:
The situation with downloading units is dire..
Would you please care to explain what the problem is, since I haven't seen a "transient HTTP error" since 13 September and you copied that line from your Sunday 11 September Report (post 676290), when downloading any task was (indeed) still dire.
[Sep 19, 2022 10:19:51 AM]
Unixchick
Veteran Cruncher
Joined: Apr 16, 2020
Post Count: 1312
Status: Recently Active
Re: Work Available

I'll take a guess at the "dire" comment.

First, I am only getting resends and not "fresh" new ARP WUs, as there is an issue with some part of the transitioner/validator bit. Second, fewer WUs were completed this week than the week before. It is dire compared to the 10k plus we used to chew through in a day.

We are all looking forward to the point when we are making good progress on ARP!
[Sep 19, 2022 2:12:51 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12594
Status: Offline
Re: Work Available

I have just spent over an hour clicking resend in order to get 8 ARP units to download. If I didn't, I would get no downloads of any variety because they were pending.

This has been going on continuously. It eats into deadlines and could push units past their deadline before they have even started.

I suspect this could be the reason that the average reporting time for all ARP units is over 10 days - even extremes.

This is not a problem with OPN1 or MCM1, probably because their downloads are much smaller. ARP1 downloads are split into about 8 modules and can be 100 MB in total. One of the 8 is 47.1 MB on its own.

I don't know what is causing the problem but I suspect that the units are being released in batches and many people are trying to download them at the same time with insufficient bandwidth at Krembil. Perhaps the techs could enlighten us?

Mike
[Sep 20, 2022 1:02:56 AM]
catchercradle
Senior Cruncher
England
Joined: Jan 16, 2009
Post Count: 171
Status: Offline
Re: Work Available

Since last night I have been getting enough tasks to keep 8 cores busy. All except two of my last 14 are _0 or _1, so things are marginally better at least in the short term. Still needing to encourage downloads by clicking "Retry Now" however.
[Sep 21, 2022 2:12:55 PM]
PMH_UK
Veteran Cruncher
UK
Joined: Apr 26, 2007
Post Count: 786
Status: Offline
Re: Work Available

To automate retries I use Adri's wcgresults from
https://sourceforge.net/projects/wcgtools/files/
On Linux use command crontab -e to create a timer to run every 15 minutes with option -x.
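
In case it helps, a crontab entry along these lines should do it. This is only a sketch: the path $HOME/bin/wcgresults is an assumption (use wherever you saved the script), and the output redirection is an optional extra to stop cron mailing the output. Only the 15-minute interval and the -x option come from the description above.

```crontab
# m    h  dom mon dow  command
# Run wcgresults with option -x every 15 minutes.
# The install path below is an assumption; adjust it to where the script lives.
*/15 * * * * "$HOME/bin/wcgresults" -x >/dev/null 2>&1
```

Paste the entry into the editor that crontab -e opens, then save and exit. crontab -l should then list it.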

Paul.
----------------------------------------
Paul.
[Sep 21, 2022 2:44:07 PM]
catchercradle
Senior Cruncher
England
Joined: Jan 16, 2009
Post Count: 171
Status: Offline
Re: Work Available

To automate retries I use Adri's wcgresults from
https://sourceforge.net/projects/wcgtools/files/
On Linux use command crontab -e to create a timer to run every 15 minutes with option -x.

Paul.
Thanks for that. Probably won't use it with my other project, CPDN, however: when they have problems with servers, it tends to be completely dead till fixed, and there is a danger of hitting the maximum number of retries allowed.
[Sep 22, 2022 5:06:19 AM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2355
Status: Offline
Re: Work Available

To automate retries I use Adri's wcgresults from
https://sourceforge.net/projects/wcgtools/files/
On Linux use command crontab -e to create a timer to run every 15 minutes with option -x.

Paul.
Thanks for that. Probably won't use it with my other project, CPDN, however: when they have problems with servers, it tends to be completely dead till fixed, and there is a danger of hitting the maximum number of retries allowed.

Could you please enlighten us as to how the maximum number of retries allowed is computed for CPDN?
[Sep 22, 2022 8:24:56 AM]
spRocket
Senior Cruncher
Joined: Mar 25, 2020
Post Count: 280
Status: Offline
Re: Work Available

I've written a quick-and-dirty Perl script to babysit my crunchers. It's been running for a day now and has kept the work units flowing beautifully. I run it inside a terminal window (or a tmux session).


#!/usr/bin/perl

use strict;
use warnings;

my $sleep = 5; # Sleep time (in seconds) between iterations.
my $url = "http://www.worldcommunitygrid.org"; # BOINC project URL.

# Replace with your crunchers' hostnames or IP addresses.
my @hosts = ("localhost", "foo.example.com", "10.10.10.10");

for (;;) {
    foreach my $host (@hosts) {
        my $count = 0;
        # List-form pipe open, so $host never passes through a shell.
        my $xfers;
        unless (open($xfers, "-|", "boinccmd", "--host", $host, "--get_file_transfers")) {
            warn "Cannot run boinccmd for $host: $!\n";
            next;
        }
        my ($name, $dir, $active);
        while (<$xfers>) {
            # Capture each field once per transfer; a failed match leaves it undef.
            ($name)   = m/name: (\S+)$/        unless defined $name;
            ($dir)    = m/direction: (\S+)$/   unless defined $dir;
            ($active) = m/xfer active: (\S+)$/ unless defined $active;
            # We have a transfer if we get the name, status, and direction.
            if (defined $name && defined $active && defined $dir) {
                # Assume it's stuck if it isn't active.
                if ($active eq "no") {
                    $count++;
                    system("boinccmd", "--host", $host, "--file_transfer", $url, $name, "retry");
                }
                # Reset for the next transfer in the listing.
                ($name, $dir, $active) = (undef, undef, undef);
            }
        }
        close $xfers;
        print scalar(localtime) . " $host: $count stuck transfers retried\n" if $count;
    }
    sleep $sleep;
}

[Sep 22, 2022 1:09:41 PM]
PMH_UK
Veteran Cruncher
UK
Joined: Apr 26, 2007
Post Count: 786
Status: Offline
Re: Work Available

I have modified my copy of wcgresults to have an exclude list for other projects.
Also added messages and commented out some to minimise noise.
I will post details to Adri's announcement thread later.

Paul.
----------------------------------------
Paul.
[Sep 22, 2022 2:28:14 PM]