Thread Status: Active
Total posts in this thread: 781
This topic has been viewed 762393 times and has 780 replies
Grumpy Swede
Master Cruncher
Svíþjóð
Joined: Apr 10, 2020
Post Count: 2278
Status: Offline
Re: OpenPandemics - GPU Stress Test

Yup, batches 13345 - 41773 seem to be of the larger type: more "jobs"/WU, and maybe more complicated "jobs". They do take considerably longer to crunch though, and seem to take some time at the start before they really begin crunching. That is, the percentage starts rising from the beginning, but that's BOINC's pseudo-progress; after some time, once the first "job" is done, it drops back to the real percentage. I haven't seen that on the previous batches, other than on my really slow GPUs (GTX660M and iGPU HD4600).

However, no peace of mind for those with BOINC on SSDs. The tasks still hammer the disk between "jobs", when they checkpoint.

I have BOINC on a HD, so I'm not that worried.
----------------------------------------
[Edit 1 times, last edit by Grumpy Swede at Apr 27, 2021 1:18:41 PM]
[Apr 27, 2021 1:16:41 PM]
Biscotto
Cruncher
Italy
Joined: Apr 11, 2020
Post Count: 27
Status: Offline
Re: OpenPandemics - GPU Stress Test

Does the continuous disk writing I see people complaining about happen for temporary files? I'm not experiencing it, but I have my temporary folder (tmp) mounted on a ramdisk through tmpfs, and I only see the occasional "dumps" of data on the SSD, every few minutes. Can this be it?
----------------------------------------
Papa Ryzen 5 3600 / Mama Radeon RX 560

[Apr 27, 2021 1:16:47 PM]
spRocket
Senior Cruncher
Joined: Mar 25, 2020
Post Count: 277
Status: Offline
Re: OpenPandemics - GPU Stress Test

Looks like someone must have kicked the upload server... there were a couple of connection failures, then things started draining. I told BOINC to send the rest of the WUs, and the stuck ones promptly became unstuck.

Regarding SSD write volume: I've been concerned about that as well, so last spring I transferred /var/lib/boinc-client to spinning rust (in my case, a ZFS dataset).
[Apr 27, 2021 1:18:16 PM]
Eurwin
Cruncher
Joined: Apr 28, 2007
Post Count: 17
Status: Offline
Re: OpenPandemics - GPU Stress Test

Lots of uploads go to 100% but somehow do not complete.



I had the same behaviour for a couple of hours.
Now things look "normal" again.
[Apr 27, 2021 1:18:29 PM]
Ian-n-Steve C.
Senior Cruncher
United States
Joined: May 15, 2020
Post Count: 180
Status: Offline
Re: OpenPandemics - GPU Stress Test

Does the continuous disk writing I see people complaining about happen for temporary files? I'm not experiencing it, but I have my temporary folder (tmp) mounted on a ramdisk through tmpfs, and I only see the occasional "dumps" of data on the SSD, every few minutes. Can this be it?


I wouldn't worry about it. The fears about SSD writes are largely FUD and blown out of proportion. Most modern SSDs can handle PETABYTES of writes before failure is a concern, and they have more advanced wear leveling than earlier SSDs. That's continuous writing for 10+ years in most cases, and real-world use will be far below that.
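
If you want to sanity-check that for your own drive, here is a minimal Python sketch of the arithmetic. The endurance rating and the daily write volume in the example are made-up numbers, not measurements from this thread; substitute the TBW figure from your drive's spec sheet and the write rate you actually observe. The function name is just for illustration.

```python
def years_until_rated_endurance(rated_tbw: float, gb_written_per_day: float) -> float:
    """Years of writing at a constant daily rate before the rated TBW is reached."""
    if gb_written_per_day <= 0:
        raise ValueError("gb_written_per_day must be positive")
    # Decimal units, as drive vendors use them: 1 TB = 1000 GB.
    tb_written_per_year = gb_written_per_day * 365 / 1000
    return rated_tbw / tb_written_per_year


if __name__ == "__main__":
    # Hypothetical inputs: a drive rated for 600 TBW, BOINC writing 50 GB per day.
    years = years_until_rated_endurance(rated_tbw=600, gb_written_per_day=50)
    print(f"{years:.1f} years until the rated endurance is reached")
```

This only compares against the vendor's rated figure and assumes a constant write rate, so treat the result as a rough estimate.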
----------------------------------------

EPYC 7V12 / [5] RTX A4000
EPYC 7B12 / [5] RTX 3080Ti + [2] RTX 2080Ti
EPYC 7B12 / [6] RTX 3070Ti + [2] RTX 3060
[2] EPYC 7642 / [2] RTX 2080Ti
[Apr 27, 2021 1:23:37 PM]
maeax
Advanced Cruncher
Joined: May 2, 2007
Post Count: 142
Status: Offline
Re: OpenPandemics - GPU Stress Test

I have one Ellesmere with an HDD and one Ellesmere with an SSD.

The SSD one crashes very often because of the many checkpoints.
BOINC is set to 1200 sec. between checkpoints, but OPNG ignores this.
The long-running OPNG tasks are running on it now (1 hour!).
Something is wrong with checkpointing and SSD.
https://www.worldcommunitygrid.org/ms/device/...s.do?workunitId=639284992
----------------------------------------
AMD Ryzen Threadripper PRO 3995WX 64-Cores/ AMD Radeon (TM) Pro W6600. OS Win11pro
[Apr 27, 2021 1:27:57 PM]
spRocket
Senior Cruncher
Joined: Mar 25, 2020
Post Count: 277
Status: Offline
Re: OpenPandemics - GPU Stress Test

We're not out of the woods yet. Getting some stuck downloads now, while uploads are fine.
[Apr 27, 2021 1:31:10 PM]
nanoprobe
Master Cruncher
Classified
Joined: Aug 29, 2008
Post Count: 2998
Status: Offline
Re: OpenPandemics - GPU Stress Test

Just an FYI to those who may be running multiple tasks per GPU. These 5-digit batches don't seem to like that, at least not on my older but still powerful hardware, both AMD and Nvidia, Windoze and Linux. BOLO for errors.
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.


----------------------------------------
[Edit 1 times, last edit by nanoprobe at Apr 27, 2021 1:37:02 PM]
[Apr 27, 2021 1:36:06 PM]
Michael Goetz
Cruncher
United States
Joined: Dec 11, 2017
Post Count: 35
Status: Offline
Re: OpenPandemics - GPU Stress Test

What I would need is a script for Windows 10.

Ask and you shall receive!
I had the same problem, so I made a little script.
----WARNING----
This script will retry ALL your uploads and downloads, no matter whether they are active, pending or stalled, so don't schedule it to run too often.
----WARNING----

Use at your own risk:


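@echo off
rem Save the "name" lines of all current file transfers to a temporary file.
"c:\program files\boinc\boinccmd.exe" --get_file_transfers | find "name" > "%tmp%\transferqueue.txt"

rem The second space-delimited token on each saved line is the file name; retry each one.
for /F "usebackq tokens=2 delims= " %%G in ("%tmp%\transferqueue.txt") do (
    echo %date% %time% retrying ... %%G >> "%tmp%\transfer_retry.log"
    "c:\program files\boinc\boinccmd.exe" --file_transfer www.worldcommunitygrid.org %%G retry
)

rem Clean up the temporary file.
del "%tmp%\transferqueue.txt"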


Also, increase max_file_xfers_per_project in cc_config.xml.
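
For anyone unsure where that setting goes: it is an element under <options> inside <cc_config>. The Python sketch below is one possible way to set it, not an official tool. It works on the file contents as a string, the helper name set_option is made up for this example, and the value 4 is an arbitrary illustration rather than a recommendation. Where cc_config.xml lives depends on your installation, so reading and writing the file is left to you.

```python
import xml.etree.ElementTree as ET


def set_option(config_xml: str, name: str, value: int) -> str:
    """Return the cc_config.xml text with <options><name>value</name></options> set."""
    root = ET.fromstring(config_xml)
    if root.tag != "cc_config":
        raise ValueError("expected <cc_config> as the root element")
    # Create <options> if the file does not have one yet.
    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")
    # Update the existing element, or add it if it is missing.
    element = options.find(name)
    if element is None:
        element = ET.SubElement(options, name)
    element.text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical starting point: an otherwise empty configuration.
    example = "<cc_config><options></options></cc_config>"
    print(set_option(example, "max_file_xfers_per_project", 4))
```

After saving the changed file, the client has to re-read it (for example with boinccmd --read_cc_config, or by restarting the client) before the new limit applies.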


PRO TIP: There's a MUCH easier way to do this. It's a one-liner.

There's a "retry all" command in the BOINC client command line interface: "--network_available". This is all you need:

"c:\program files\boinc\boinccmd.exe" --network_available

Then add loop/repeat controls as appropriate to your desires and scripting language.
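
As one example of such loop controls, here is a minimal Python sketch. The boinccmd path is the one used above; the function names, the number of rounds and the pause length are made up for illustration. Keep the pause generous so the retries don't hit the servers too often.

```python
import subprocess
import time

# Path taken from the posts above; adjust it if BOINC is installed elsewhere.
BOINCCMD = r"c:\program files\boinc\boinccmd.exe"


def build_command(boinccmd: str = BOINCCMD) -> list:
    """Command line that asks the client to retry deferred transfers."""
    return [boinccmd, "--network_available"]


def retry_loop(rounds: int, pause_seconds: float,
               run=subprocess.run, sleep=time.sleep) -> int:
    """Issue the retry `rounds` times; return how many calls exited with code 0."""
    succeeded = 0
    for i in range(rounds):
        result = run(build_command(), check=False)
        if result.returncode == 0:
            succeeded += 1
        # No need to wait after the final round.
        if i < rounds - 1:
            sleep(pause_seconds)
    return succeeded


if __name__ == "__main__":
    # Hypothetical schedule: 6 retries, 10 minutes apart.
    retry_loop(rounds=6, pause_seconds=600)
```

The run and sleep parameters exist only so the loop can be tried out without BOINC installed; in normal use the defaults are fine.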
[Apr 27, 2021 1:36:35 PM]
Ian-n-Steve C.
Senior Cruncher
United States
Joined: May 15, 2020
Post Count: 180
Status: Offline
Re: OpenPandemics - GPU Stress Test

I have one Ellesmere with an HDD and one Ellesmere with an SSD.

The SSD one crashes very often because of the many checkpoints.
BOINC is set to 1200 sec. between checkpoints, but OPNG ignores this.
The long-running OPNG tasks are running on it now (1 hour!).
Something is wrong with checkpointing and SSD.
https://www.worldcommunitygrid.org/ms/device/...s.do?workunitId=639284992


Either something is wrong with your SSD, or something else is wrong with the system that has the SSD. My systems are much faster, run 6-8 GPUs each and produce many more writes to the SSD, with no issues. SSDs in general are capable of several orders of magnitude more IOPS than an HDD. Your problem is likely system-specific, not SSD-specific.
----------------------------------------

EPYC 7V12 / [5] RTX A4000
EPYC 7B12 / [5] RTX 3080Ti + [2] RTX 2080Ti
EPYC 7B12 / [6] RTX 3070Ti + [2] RTX 3060
[2] EPYC 7642 / [2] RTX 2080Ti
[Apr 27, 2021 1:38:09 PM]