Thread Status: Active
Total posts in this thread: 781
Posts: 781   Pages: 79   [ Previous Page | 15 16 17 18 19 20 21 22 23 24 | Next Page ]
This topic has been viewed 944015 times and has 780 replies
zaschf
Advanced Cruncher
New Zealand
Joined: Jan 28, 2009
Post Count: 61
Re: OpenPandemics - GPU Stress Test

Up and download speed is noticeably slower here in NZ with 'project backoff' for up to 30 minutes
----------------------------------------
Ubuntu 22.04.1 LTS [Linux 5.15.0-47-generic]
CPU: Intel Core i7-9700 @3.00GHz x 8
GPU: NVIDIA GeForce RTX2060 Rev. A
plus a mighty Raspberry PI 4
[Apr 27, 2021 7:05:02 AM]
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 2498
Re: OpenPandemics - GPU Stress Test

Yes, uploading and downloading are officially terrible, and the worst I've ever experienced on BOINC. SETI had its problems, but this is a new record for a system that still counts as up rather than down. :)

But I'm sure Uplinger appreciates what we're doing, so he can go out and buy more servers, and possibly also bandwidth. :D
[Apr 27, 2021 7:33:03 AM]
hnapel
Advanced Cruncher
Netherlands
Joined: Nov 17, 2004
Post Count: 82
Re: OpenPandemics - GPU Stress Test

I'm now getting:

27-4-2021 09:22:08 | World Community Grid | Not requesting tasks: too many uploads in progress

Which is true, of course. Even nudging them with retries does not work anymore, so I guess this thing needs to be throttled down.
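
To see how many uploads are actually queued behind that message, something like the following might help. It is only a sketch: it assumes each entry in the `boinccmd --get_file_transfers` output carries a `direction: upload` or `direction: download` line, and the BOINCCMD variable, the `count_uploads` function and the `--run` argument are names made up for this example.

```shell
#!/bin/bash
# Sketch: count pending uploads reported by the BOINC client.
# Assumption: every transfer listed by `boinccmd --get_file_transfers`
# includes a "direction: upload" or "direction: download" line.
BOINCCMD="${BOINCCMD:-./boinccmd}"   # illustrative variable, not a BOINC setting

# Count upload entries in transfer output read from stdin.
# grep exits non-zero when it finds nothing, so "|| true" keeps the status clean;
# the count "0" is still printed in that case.
count_uploads() {
    grep -c 'direction: upload' || true
}

# Only talk to the client when asked to, so the file can also be sourced.
if [[ "${1:-}" == "--run" ]]; then
    echo "pending uploads: $("$BOINCCMD" --get_file_transfers | count_uploads)"
fi
```

Run it from the directory that contains boinccmd, with `--run` as the only argument.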
[Apr 27, 2021 7:33:46 AM]
goben_2003
Advanced Cruncher
Joined: Jun 16, 2006
Post Count: 146
Re: OpenPandemics - GPU Stress Test

you can run a script to retry the uploads and downloads on a set interval, breaking BOINC's escalating backoff behavior.

here's the script I'm using on linux:

#!/bin/bash
for name in $(./boinccmd --get_file_transfers | sed -n -e 's/^.*name: //p'); do
    ./boinccmd --file_transfer http://www.worldcommunitygrid.org/ "$name" retry
done


dump this into a file and save it as whatever you want (i named it "update_transfers_wcg") and place it in the same directory that contains the boinccmd executable. make sure this script is set with proper permissions to allow execution.

then run in a terminal window from the same directory:
watch -n 120 ./update_transfers_wcg


Thanks, that saved me from writing it ;)

make whatever modifications you need to the script and/or execution to fit other BOINC installs or OS types. not hard to change what's needed for your own setup.
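
As one such modification, here is a minimal sketch of the same retry loop without `watch`, for systems where it isn't installed. The variable names (BOINCCMD, PROJECT_URL, INTERVAL), the function names and the `--loop` argument are made up for this example; only the two boinccmd calls and the sed expression come from the script above.

```shell
#!/bin/bash
# Sketch: retry all pending WCG transfers on a fixed interval, without `watch`.
# BOINCCMD, PROJECT_URL and INTERVAL are illustrative names, not BOINC settings.
BOINCCMD="${BOINCCMD:-./boinccmd}"
PROJECT_URL="${PROJECT_URL:-http://www.worldcommunitygrid.org/}"
INTERVAL="${INTERVAL:-120}"   # seconds between retry rounds

# Extract file names from `boinccmd --get_file_transfers` output (read on stdin).
transfer_names() {
    sed -n -e 's/^.*name: //p'
}

# Ask the client to retry every transfer it currently lists.
retry_all() {
    local name
    "$BOINCCMD" --get_file_transfers | transfer_names | while IFS= read -r name; do
        # stdin is redirected so the command cannot swallow the list of names
        "$BOINCCMD" --file_transfer "$PROJECT_URL" "$name" retry < /dev/null
    done
}

# Only loop when explicitly asked, so the file can also be sourced.
if [[ "${1:-}" == "--loop" ]]; then
    while true; do
        retry_all
        sleep "$INTERVAL"
    done
fi
```

Saved as, say, retry_loop.sh next to boinccmd, it would be started with `./retry_loop.sh --loop`. Setting BOINCCMD to a different path should cover installs where the executable lives elsewhere or has another name.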


Is there a similar script for windows/dos command line? I can query the transfers with boinccmd, but from that on I'm lost :-D

I already had msys2 installed, so I just had to install the procps-ng package (run 'pacman -S procps-ng' in the msys bash console) and change boinccmd to boinccmd.exe in the script.
Go to the BOINC folder in the bash console and start the loop:
cd "/c/Program Files/BOINC"
watch -n 120 ./retryBoincTransfers.sh

edit: corrected pacman command
----------------------------------------
[Edit 1 times, last edit by goben_2003 at Apr 27, 2021 9:00:35 AM]
[Apr 27, 2021 8:06:11 AM]
goben_2003
Advanced Cruncher
Joined: Jun 16, 2006
Post Count: 146
Re: OpenPandemics - GPU Stress Test

Note: The instructions below are not maintained. Please see this post for the updated version.

Yeah, it really does thrash the SSD. That is why I switched to running off of a ramdisk that syncs to disk on shutdown/reboot. It definitely cut down the writing to my SSD a lot.

Would you mind sharing how you set that up?
I mean I probably could stick something together on my own but why reinvent the wheel if there's a working solution already ^^
I'm also wondering whether it would be enough to have the ./slots dir in memory, or whether it has to be the whole BOINC working directory.

I installed ImDisk.
IIRC, the installer has an option to install a service. It is required so that ImDisk gets the shutdown/restart notification and syncs to disk. Here are the steps I followed to set it up:
1. Close all programs that might be accessing the BOINC data folder (should be c:\ProgramData\BOINC).
2. Rename BOINC to BOINC.imdisk (the actual name is not important as long as it's different).
3. Create a new directory named BOINC in the same location.
4. Run the ImDisk configuration.
4a. Set the size to what you want. Note that the GPU workunits require a lot more space to be allowed to download than they actually take up, so definitely oversize it; otherwise you will get messages in the log that there is not enough disk space. If you do not have a ton of RAM, you can use the option to allocate memory dynamically, so RAM is only used when the space actually gets used.
4b. Enable Launch at Windows Startup.
4c. I picked a drive letter, but I do not know whether it is required: I used a folder mount point instead of a drive, so it does not show up as a drive.
4d. Advanced -> Use Mount Point -> set to your BOINC folder (should be c:\ProgramData\BOINC)
4e. Load Content from Image File or Folder: set this to the folder that you renamed BOINC to (c:\ProgramData\BOINC.imdisk in my example)
4f. Click OK
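
Steps 2 and 3 can also be done from a shell. This is a minimal sketch, assuming a bash console such as msys2 with the data folder visible as /c/ProgramData/BOINC, and assuming BOINC is fully shut down first (step 1). The function name `prepare_mount_point`, the DATA_PARENT variable and the `--run` argument are invented for this example. The ImDisk configuration itself (step 4) is still done in its own dialog.

```shell
#!/bin/bash
# Sketch of steps 2 and 3: move the existing BOINC data folder aside and
# create an empty folder with the old name to serve as the mount point.
# DATA_PARENT is an illustrative variable; adjust it to your install.
DATA_PARENT="${DATA_PARENT:-/c/ProgramData}"

prepare_mount_point() {
    local parent="$1"
    # Refuse to run if there is nothing to move.
    if [[ ! -d "$parent/BOINC" ]]; then
        echo "no BOINC folder found in $parent" >&2
        return 1
    fi
    # Refuse to overwrite a copy saved by an earlier run.
    if [[ -e "$parent/BOINC.imdisk" ]]; then
        echo "$parent/BOINC.imdisk already exists" >&2
        return 1
    fi
    mv "$parent/BOINC" "$parent/BOINC.imdisk"   # step 2: rename
    mkdir "$parent/BOINC"                       # step 3: empty mount point
}

# Only touch the real folder when explicitly asked.
if [[ "${1:-}" == "--run" ]]; then
    prepare_mount_point "$DATA_PARENT"
fi
```

The two checks are there so that running it twice by accident does not move the empty mount point on top of the saved data.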

You can resize later by using the ImDisk Virtual Disk Driver (it is in the Start Menu and in the Control Panel). I have only used it to increase the size. It normally works, but it froze one time, so I normally change the size with BOINC shut down and the drive saved.

edit: Added a step that I missed.
edit: Added a link to the newer post that is updated.
----------------------------------------
[Edit 3 times, last edit by goben_2003 at May 1, 2021 7:57:17 PM]
[Apr 27, 2021 8:21:33 AM]
biini
Senior Cruncher
Finland
Joined: Jan 25, 2007
Post Count: 334
Re: OpenPandemics - GPU Stress Test

Struggling with uploads; other than that, everything seems to be working quite smoothly.
----------------------------------------

rtx, xeon, i9, ryzen, rnd laptops
dAM0NES 1991 ppl interested in beer, amigas or electronic music
[Apr 27, 2021 8:32:32 AM]
alver
Senior Cruncher
Joined: Nov 30, 2007
Post Count: 245
Re: OpenPandemics - GPU Stress Test

My experience so far: I've had 3 GPUs running pretty much flat out since this test started. At the moment, two of them are still getting work, crunching it, and queueing the results for upload.

The third one is now getting the 'Not requesting tasks: too many uploads in progress' message. I guess it's only a matter of time until the other two boxes hit the same blockage - from there, it's self-limiting, it seems.

I guess this is a 'successful test', inasmuch as we're seeing where the bottleneck(s) are. Hopefully more bandwidth/servers will be forthcoming at some point :D
----------------------------------------


(previously known as 'proxima' on SETI, UD, distributed folding, FaD, and Rosetta)
[Apr 27, 2021 8:35:56 AM]
goben_2003
Advanced Cruncher
Joined: Jun 16, 2006
Post Count: 146
Re: OpenPandemics - GPU Stress Test

Up and download speed is noticeably slower here in NZ with 'project backoff' for up to 30 minutes

Only 30 minutes? ;)
I have a few undersea cables between me and the servers; it was up to several hours before I set up the retry script.
The worst are the uploads that get to 100% but do not get the ACK from the server and have to be retried...
[Apr 27, 2021 8:37:49 AM]
Azmodes
Cruncher
Joined: Apr 4, 2017
Post Count: 3
Re: OpenPandemics - GPU Stress Test

Yeah, I've had 13 GPUs (each running 4 tasks in tandem) on it since roughly when it started, and have had to rely heavily on back-up projects due to stalled/slow transfers in both directions (currently hundreds of pending uploads). A far cry from a steady workflow. I hope whatever insights are gained from this server pounding are put into making things more efficient down the line. :)
[Apr 27, 2021 8:41:36 AM]
DrMason
Senior Cruncher
Joined: Mar 16, 2007
Post Count: 153
Re: OpenPandemics - GPU Stress Test

When Uplinger referred to it as an "extreme" stress test, he really wasn't kidding. From the entire site going down for a while, to not being able to upload or download for hours at a time, it really shows that people are eager to GPU crunch and turn around units ASAP! That said, I can't really go around and baby my machines to constantly push "retry now" for hours on end until the servers finally catch up, so it's been a pretty bad experience so far. But I guess the saying "The first ones who break through the wall get bloody" applies, and we're just the first ones through the wall.

I won't go much into technical details, except to add more anecdotal evidence that the "project backoff" timers have been brutal. I check from time to time, and it's always around 2-6 hours before it even tries to upload or download. The backoff timers apply to both downloads and uploads, so my many-core machines are often just putzing around, waiting for work to come in, even with a .25 day margin. I've also noticed that when so many files are waiting to upload and download simultaneously, it kills performance in general while crunching. It seems about 25% of crunching power goes to the BOINC Manager at that point, so about 1 in 4 cycles is going towards network communications. Crazy.

Here's to the next wave who hopefully won't have to break through that same wall again!
[Apr 27, 2021 8:47:27 AM]