Total posts in this thread: 781
Posts: 781   Pages: 79   [ Previous Page | 13 14 15 16 17 18 19 20 21 22 | Next Page ]
This topic has been viewed 943854 times and has 780 replies
kittyman
Advanced Cruncher
Joined: May 14, 2020
Post Count: 140
Re: OpenPandemics - GPU Stress Test

The kitties have been seeing a lot of stalled downloads as well, which of course mucks up fetching new work.

Meow hiss.
----------------------------------------

[Apr 27, 2021 2:13:12 AM]
Ian-n-Steve C.
Senior Cruncher
United States
Joined: May 15, 2020
Post Count: 180
Re: OpenPandemics - GPU Stress Test

> There doesn't seem to be any way to unstick the stuck jobs other than by restarting it or hoping my queue doesn't run dry.


you can run a script to retry the uploads and downloads on a set interval, breaking BOINC's escalating backoff behavior.

here's the script I'm using on linux:

#!/bin/bash
# Retry every pending WCG transfer (run from the directory that contains boinccmd).
for i in $(./boinccmd --get_file_transfers | sed -n -e 's/^.*name: //p'); do ./boinccmd --file_transfer http://www.worldcommunitygrid.org/ "$i" retry; done


dump this into a file and save it as whatever you want (i named it "update_transfers_wcg") and place it in the same directory that contains the boinccmd executable. make sure this script is set with proper permissions to allow execution.

then run in a terminal window from the same directory:
watch -n 120 ./update_transfers_wcg


and boom, all transfers for WCG will retry every 120 seconds.

make whatever modifications you need to the script and/or execution to fit other BOINC installs or OS types. not hard to change what's needed for your own setup.
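
For installs where boinccmd lives somewhere else, the same idea could be generalized roughly as follows. This is only a sketch: the variable names BOINCCMD, PROJECT_URL and INTERVAL and the "loop" argument are made up for illustration and are not BOINC settings, and the sed expression is the one from the script above. The built-in loop stands in for `watch`.

```shell
#!/bin/bash
# Sketch: retry all pending transfers for one project at a fixed interval.
# BOINCCMD, PROJECT_URL and INTERVAL are illustrative names, not BOINC settings.
BOINCCMD="${BOINCCMD:-./boinccmd}"
PROJECT_URL="${PROJECT_URL:-http://www.worldcommunitygrid.org/}"
INTERVAL="${INTERVAL:-120}"

# Read --get_file_transfers output on stdin and print one file name per line.
# Uses the same sed expression as the original script.
transfer_names() {
  sed -n -e 's/^.*name: //p'
}

# Ask the client to retry each pending transfer once.
retry_all() {
  "$BOINCCMD" --get_file_transfers | transfer_names | while IFS= read -r name; do
    "$BOINCCMD" --file_transfer "$PROJECT_URL" "$name" retry
  done
}

# Loop only when invoked with the "loop" argument, so the functions above
# can also be sourced and called by hand.
if [ "${1:-}" = "loop" ]; then
  while true; do
    retry_all
    sleep "$INTERVAL"
  done
fi
```

Saved as, say, update_transfers_loop (a hypothetical name), it would be made executable with `chmod +x update_transfers_loop` and started with `./update_transfers_loop loop`; something like `INTERVAL=300 ./update_transfers_loop loop` would change the interval.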
----------------------------------------

EPYC 7V12 / [5] RTX A4000
EPYC 7B12 / [5] RTX 3080Ti + [2] RTX 2080Ti
EPYC 7B12 / [6] RTX 3070Ti + [2] RTX 3060
[2] EPYC 7642 / [2] RTX 2080Ti
----------------------------------------
[Edit 1 times, last edit by Ian-n-Steve C. at Apr 27, 2021 2:22:54 AM]
[Apr 27, 2021 2:20:16 AM]
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 2498
Re: OpenPandemics - GPU Stress Test

> The kitties have been seeing a lot of stalled downloads as well, which of course mucks up fetching new work.
>
> Meow hiss.

Yup, lots of babysitting and hitting Retry on the pending uploads and downloads. (I have a spare mouse for when this one dies.)
Lucky me that I am retired, have a comfortable chair, and all the time in the world.

Edit: And we haven't started with the test batches 13345 - 41773 yet. We're still at batches < 7345.
----------------------------------------
[Edit 1 times, last edit by Grumpy Swede at Apr 27, 2021 2:33:11 AM]
[Apr 27, 2021 2:20:55 AM]
spRocket
Senior Cruncher
Joined: Mar 25, 2020
Post Count: 280
Re: OpenPandemics - GPU Stress Test

> you can run a script to retry the uploads and downloads on a set interval, breaking BOINC's escalating backoff behavior.


Beautiful. I think I'll put it in a cron job with random delay.
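
A cron job with a random delay might look something like the sketch below. It is an assumption about one way to do it, not what was actually deployed: MAX_DELAY, the "run" argument, the wrapper name and the /path/to/... locations are all placeholders. The wrapper sleeps a random number of seconds and then calls the retry script, so that many hosts on the same cron schedule do not all hit the server in the same second.

```shell
#!/bin/sh
# Sketch: wait a random number of seconds, then run the retry script.
# MAX_DELAY and every /path/to/... below are placeholders.

# Print a pseudo-random integer in [0, $1), using two bytes from /dev/urandom.
random_delay() {
  max="$1"
  n=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
  echo $(( n % max ))
}

# Example crontab entry (every 5 minutes), assuming this wrapper is saved as
# /path/to/retry_with_jitter.sh:
#   */5 * * * * /path/to/retry_with_jitter.sh run

if [ "${1:-}" = "run" ]; then
  sleep "$(random_delay "${MAX_DELAY:-60}")"
  cd /path/to/boinc || exit 1   # placeholder: directory containing boinccmd
  ./update_transfers_wcg
fi
```

The maximum delay should stay below the cron interval so that consecutive runs do not pile up.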
[Apr 27, 2021 2:39:16 AM]
spRocket
Senior Cruncher
Joined: Mar 25, 2020
Post Count: 280
Re: OpenPandemics - GPU Stress Test

Hey, looking good. Just goosed the downloads and... a distinct lack of transient HTTP errors. No more stuck jobs in my queue!
[Apr 27, 2021 2:45:28 AM]
biini
Senior Cruncher
Finland
Joined: Jan 25, 2007
Post Count: 334
Re: OpenPandemics - GPU Stress Test

Is there a similar script for the Windows/DOS command line? I can query the transfers with boinccmd, but from there on I'm lost :-D
----------------------------------------

rtx, xeon, i9, ryzen, rnd laptops
dAM0NES 1991 ppl interested in beer, amigas or electronic music
[Apr 27, 2021 2:53:14 AM]
spRocket
Senior Cruncher
Joined: Mar 25, 2020
Post Count: 280
Re: OpenPandemics - GPU Stress Test

> Is there a similar script for the Windows/DOS command line? I can query the transfers with boinccmd, but from there on I'm lost :-D


I don't know enough about it to help, but I think that someone well-versed in PowerShell could whip something up.

In other news, had a big turn-in and everything went smoothly. I think I'll hold off on auto-running the script, but it's nice to have when it's needed. I've put my ARP tasks back into run mode as well.
----------------------------------------
[Edit 1 times, last edit by spRocket at Apr 27, 2021 2:59:12 AM]
[Apr 27, 2021 2:58:02 AM]
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 2498
Re: OpenPandemics - GPU Stress Test

Things are working better now. No uploads/downloads go into pending, and the site is not as slow.
However, I do think it's mainly because many people have gone to bed (mainly in the U.S.; most Europeans have already been asleep for many hours), and their computers are now either shut down or on longer and longer back-offs.

So, more WUs for the rest of us.
[Apr 27, 2021 3:05:56 AM]
spRocket
Senior Cruncher
Joined: Mar 25, 2020
Post Count: 280
Re: OpenPandemics - GPU Stress Test

> However, I do think it's mainly because many people have gone to bed (mainly in the U.S.; most Europeans have already been asleep for many hours), and their computers are now either shut down or on longer and longer back-offs.


I'm not so sure, since I saw it change from one minute to the next. At any rate, work units are flowing in both directions without undue delays.
[Apr 27, 2021 3:11:31 AM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 2173
Re: OpenPandemics - GPU Stress Test

> Please post any issues or questions in this thread where we can see them more easily, rather than creating new threads that may be harder for us to track.
>
> Thanks,
> -Uplinger

Well, going by the fact that this test apparently affects the whole WCG website and other WCG projects, I think this is a rather odd place to be directed to...

And for what it is worth, it seems that this test affects not only the WCG web site but also the creation of external stats; at least there haven't been any since noon PST today...

Ralf
[Apr 27, 2021 3:19:17 AM]