World Community Grid Forums
Category: Beta Testing Forum: Beta Test Support Forum Thread: OpenPandemics GPU Beta Test - Feb 18 2021 [Issues Thread] |
Thread Status: Active Total posts in this thread: 272
|
Author |
|
goben_2003
Advanced Cruncher Joined: Jun 16, 2006 Post Count: 145 Status: Offline Project Badges: |
"I've just arrived home and grabbed some work units from this batch. They take only 3 minutes for my PC not 15 as you mentioned. It has a 2070 RTX. Is this a problem? batches are 20053,20055"

Since the 2070 RTX is a lot more powerful than the average GPU out there, it is likely that the 15 min is on a less powerful GPU. As an example, my RTX 2060 takes about 4-5 minutes.

Cheers,
-goben_2003

Edit: So, it varies more than that: I just had one that took less than 2 minutes, and some have taken more than 5 minutes.
[Edit 3 times, last edit by goben_2003 at Feb 20, 2021 6:37:18 PM] |
||
|
Grumpy Swede
Master Cruncher Svíþjóð Joined: Apr 10, 2020 Post Count: 2068 Status: Offline Project Badges: |
OK, the NVIDIA GeForce GTX 660M (2048MB) is a no-go, even with the latest driver. It still errors out the tasks. I can't get any more tasks now anyhow because of the errors (This computer has finished a daily quota of 1 tasks). I give up on the NVIDIA GeForce GTX 660M (2048MB) too for now.

The dump files from Windows are available if Uplinger wants them. I'll let them stay in c:\Windows\LiveKernelReports\WATCHDOG\ for a bit longer. It's the usual Windows event 4101 (Display driver nvlddmkm stopped responding and has successfully recovered.)

Result Name: BETA_ OPN1_ 0020021_ 00042_ 0--
<core_client_version>7.16.7</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -1073741819 (0xc0000005)</message>
<stderr_txt>
projects/www.worldcommunitygrid.org/wcgrid_beta29_autodockgpu_7.21_windows_x86_64__opencl_nvidia_102 -jobs OPN1_0020021_00042.job -input OPN1_0020021_00042.zip -seed 313029375 -wcgruns 2314 -wcgdpf 48
INFO: Using gpu device from app init data 0
INFO:[17:31:42] Start AutoGrid...
autogrid4: Successful Completion.
INFO:[17:32:00] End AutoGrid...
INFO:[17:32:00] Start AutoDock for ZINC000491224981_1_RX1--6lu7_001--CYS145_wcgsplit2.dpf(Job #0)...
OpenCL device: GeForce GTX 660M
INFO:[17:34:09] End AutoDock...
INFO:[17:34:09] Start AutoDock for ZINC000157935703_RX1--6lu7_001--CYS145.dpf(Job #1)...
OpenCL device: GeForce GTX 660M
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x00007FF7A2E0DB30 read attempt to address 0x00000000
Engaging BOINC Windows Runtime Debugger...
<snipped the rest, since it's available for Uplinger on the server>
[Edit 2 times, last edit by Grumpy Swede at Feb 20, 2021 7:00:17 PM] |
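The log above reports the same failure twice, once as the signed exit code -1073741819 and once as 0xc0000005. A minimal Python sketch of how the two forms relate, assuming the exit code is a signed 32-bit integer (the helper name `to_unsigned_hex` is made up for illustration):

```python
def to_unsigned_hex(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as unsigned and format it as hex."""
    # Masking with 0xFFFFFFFF maps a negative value onto its two's-complement
    # 32-bit pattern, which is how Windows status codes are usually written.
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


if __name__ == "__main__":
    print(to_unsigned_hex(-1073741819))  # prints 0xc0000005
```

0xc0000005 is the Windows access-violation status, which matches the "Access Violation" line in the exception record.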
||
|
zombie67 [MM]
Senior Cruncher USA Joined: May 26, 2006 Post Count: 228 Status: Offline Project Badges: |
Perhaps these longer-running tasks are just a package of multiple small tasks? That could explain the spikes in utilization, rather than a constant load.

If it stays this way when it goes into production, running multiple BOINC tasks at a time on the GPU could be a solution to maximize utilization.
[Edit 1 times, last edit by zombie67 [MM] at Feb 20, 2021 6:39:06 PM] |
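Running several tasks on one GPU in BOINC is normally done with an `app_config.xml` file in the project's directory, where `gpu_usage` is the fraction of a GPU each task reserves. A hedged Python sketch that generates such a file follows. The application name `example_gpu_app` is a placeholder, not the real name of the beta application, and the CPU fraction is only an illustrative value.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, tasks_per_gpu: int, cpu_per_task: float) -> str:
    """Return app_config.xml content that lets `tasks_per_gpu` tasks share one GPU."""
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    # Each task reserves 1/N of a GPU, so N tasks fit on one device.
    ET.SubElement(gpu_versions, "gpu_usage").text = str(1 / tasks_per_gpu)
    ET.SubElement(gpu_versions, "cpu_usage").text = str(cpu_per_task)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # "example_gpu_app" is hypothetical; look up the actual application name
    # in the client's own state files before using anything like this.
    print(build_app_config("example_gpu_app", tasks_per_gpu=2, cpu_per_task=0.5))
```

Whether two concurrent tasks actually fill the idle gaps would have to be measured on each card; this only shows the shape of the configuration.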
||
|
Falconet
Master Cruncher Portugal Joined: Mar 9, 2009 Post Count: 3294 Status: Offline Project Badges: |
If you check the log of the BETAs, you'll see each one is testing 32 or 46 different ZINC compounds (I've seen those 2 numbers, there may be more variety out there).

IIRC, 1 OPN CPU task usually has 2 jobs: a short one and a longer one, each testing a different ZINC compound. So, it's quite different.

My RX 550 is running these 32/46-job tasks in 40 minutes using 28-30 watts at 100% GPU utilization, while a normal OPN CPU task runs for 3 hours on the Ryzen 1400 that the RX 550 is installed in. Clearly much more efficient.

I have noticed the same behaviour on my RX 550: there are periods when the GPU utilization and power draw drop to 0% and 5 watts, respectively.

DayleDiamond, I agree with you but I don't think it will happen. There were a few posts about this on the GPU thread at the OPN forum. However, IIRC, one of the project scientists said that some molecules are better suited for CPU work, so 100% eliminating CPUs for this project would probably never happen. But yeah, it would be much more efficient if all that CPU went to the other non-GPU projects.
----------------------------------------
AMD Ryzen 5 1600AF 6C/12T 3.2 GHz - 85W
AMD Ryzen 5 2500U 4C/8T 2.0 GHz - 28W
AMD Ryzen 7 7730U 8C/16T 3.0 GHz
[Edit 2 times, last edit by Falconet at Feb 20, 2021 6:52:14 PM] |
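The figures in this post can be turned into a rough per-task throughput comparison. The Python sketch below assumes that GPU and CPU jobs are of comparable size, which the post does not claim, and it compares a single GPU task with a single CPU task (a CPU typically runs one task per thread in parallel), so the numbers are only indicative.

```python
def jobs_per_hour(jobs: int, minutes: float) -> float:
    """Docking jobs completed per hour by one task."""
    return jobs * 60 / minutes


def watt_hours(watts: float, minutes: float) -> float:
    """Energy used over a run at a constant power draw."""
    return watts * minutes / 60


if __name__ == "__main__":
    gpu_small = jobs_per_hour(32, 40)   # 32-job beta task in 40 minutes
    gpu_large = jobs_per_hour(46, 40)   # 46-job beta task in 40 minutes
    cpu_task = jobs_per_hour(2, 180)    # 2-job CPU task in 3 hours
    print(f"GPU task: {gpu_small:.1f} to {gpu_large:.1f} jobs/hour")
    print(f"CPU task: {cpu_task:.2f} jobs/hour")
    # Energy for one 40-minute GPU task at the reported 28-30 W draw.
    print(f"GPU energy per task: {watt_hours(28, 40):.1f} to {watt_hours(30, 40):.1f} Wh")
```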
||
|
Bryn Mawr
Senior Cruncher Joined: Dec 26, 2018 Post Count: 331 Status: Offline Project Badges: |
So on my very slow GT710s these new units are reporting a requirement of 0.246 CPU and 1 GPU.

On my 3700x I’ve freed up a CPU to give it access, but on the 3600 I’ve left all CPUs running other tasks.

Both machines are running (slowly), so on the 3600 it’s obviously borrowing CPU time from the other tasks. Even so, the 3600 is running quicker, with an expected completion time of 3:05:00 against an expected run time of 5:10:00 on the 3700x.

This is also reflected in the update time, with the 3600 adding 2% every 4 minutes but the 3700x taking 7 minutes to add each 2% chunk.

ETA: The 3600 is checkpointing but the 3700x does not appear to be, which confuses me.
[Edit 1 times, last edit by Bryn Mawr at Feb 20, 2021 6:54:52 PM] |
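The progress rates quoted in this post can be extrapolated to a full run. This Python sketch assumes progress stays linear for the whole task, which the post does not establish, so the results are a plausibility check and not a prediction.

```python
from datetime import timedelta


def projected_runtime(percent_per_step: float, minutes_per_step: float) -> timedelta:
    """Extrapolate total runtime from a steady progress increment."""
    steps = 100 / percent_per_step          # number of increments to reach 100%
    return timedelta(minutes=steps * minutes_per_step)


if __name__ == "__main__":
    print(projected_runtime(2, 4))  # 3600 host: prints 3:20:00
    print(projected_runtime(2, 7))  # 3700x host: prints 5:50:00
```

Both projections land in the same range as the reported estimates of 3:05:00 and 5:10:00, so the two observations in the post are consistent with each other.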
||
|
Falconet
Master Cruncher Portugal Joined: Mar 9, 2009 Post Count: 3294 Status: Offline Project Badges: |
Interesting, on my Ryzen 1400/RX 550, it's 0.501 CPUs and 1 GPU.

I might try leaving 1 thread for the GPU, see if the runtime is lower.
----------------------------------------
AMD Ryzen 5 1600AF 6C/12T 3.2 GHz - 85W
AMD Ryzen 5 2500U 4C/8T 2.0 GHz - 28W
AMD Ryzen 7 7730U 8C/16T 3.0 GHz |
||
|
ChrisR1964
Cruncher Usa Joined: Dec 5, 2005 Post Count: 17 Status: Offline Project Badges: |
I've had five beta work units fail. They run for many hours at 100%. Here is the top of my Event Log.

2/19/2021 8:17:32 PM | | Starting BOINC client version 7.16.5 for windows_x86_64
2/19/2021 8:17:32 PM | | log flags: file_xfer, sched_ops, task
2/19/2021 8:17:32 PM | | Libraries: libcurl/7.47.1 OpenSSL/1.0.2s zlib/1.2.8
2/19/2021 8:17:32 PM | | Data directory: C:\ProgramData\BOINC
2/19/2021 8:17:32 PM | | Running under account ChrisR
2/19/2021 8:17:34 PM | | CAL: ATI GPU 0: AMD Radeon HD 6520G/6530D/6550D/6620G (SuperSumo) (CAL version 1.4.1848, 512MB, 479MB available, 568 GFLOPS peak)
2/19/2021 8:17:34 PM | | OpenCL: AMD/ATI GPU 0: AMD Radeon HD 6520G/6530D/6550D/6620G (SuperSumo) (driver version 1800.11 (VM), device version OpenCL 1.2 AMD-APP (1800.11), 512MB, 479MB available, 568 GFLOPS peak)
2/19/2021 8:17:34 PM | | Windows processor group 0: 4 processors
2/19/2021 8:17:34 PM | | Host name: ChrisR-Desktop
2/19/2021 8:17:34 PM | | Processor: 4 AuthenticAMD AMD A6-3620 APU with Radeon(tm) HD Graphics [Family 18 Model 1 Stepping 0]
2/19/2021 8:17:34 PM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni cx16 popcnt syscall nx lm svm sse4a osvw ibs skinit wdt page1gb rdtscp 3dnowext 3dnow
2/19/2021 8:17:34 PM | | OS: Microsoft Windows 10: Core x64 Edition, (10.00.19042.00)
2/19/2021 8:17:34 PM | | Memory: 7.49 GB physical, 7.99 GB virtual
2/19/2021 8:17:34 PM | | Disk: 419.83 GB total, 242.39 GB free
2/19/2021 8:17:34 PM | | Local time is UTC -5 hours
2/19/2021 8:17:34 PM | | No WSL found.

Here is a link to one of the result logs: https://www.worldcommunitygrid.org/ms/device/...og.do?resultId=1512333676

Let me know if you need anything else. Thank you
[Edit 1 times, last edit by ChrisR1964 at Feb 21, 2021 5:33:05 PM] |
||
|
DrMason
Senior Cruncher Joined: Mar 16, 2007 Post Count: 153 Status: Offline Project Badges: |
It’s interesting. I have two Windows machines with gpus crunching BETAs just fine, and one Linux machine with a gpu that has yet to get a BETA. I’ve double checked, and all machines report a gpu that’s recognized by the BOINC manager, and the device profiles have all gpu types selected “yes” to be able to crunch units. The Linux box requests work frequently, and has all the other cores loaded with work, so I don’t know what’s preventing it from getting additional work. I wonder if it’s the machine, the GPU, or just luck at this point. The Linux machine has the oldest GPU (a 660ti), which is supposed to be OpenCL 1.2, but I wonder if it’s too old...
---------------------------------------- |
||
|
JWustmann
Cruncher Joined: Mar 27, 2020 Post Count: 3 Status: Offline Project Badges: |
Hello, this is my first forum post on any BOINC subject ever.
I opted in for the beta to help, and I want to share the intermittent utilisation of my 1050 Ti mobile: it jumps from 0% to 100% every few seconds (image: https://ibb.co/XCb2Pp0 ). By comparison, Milkyway@home utilizes the full 100% all the time. |
||
|
goben_2003
Advanced Cruncher Joined: Jun 16, 2006 Post Count: 145 Status: Offline Project Badges: |
"Perhaps these longer running tasks are just a package of multiple small tasks? That could explain the spikes in utilization, rather than a constant load?"

That was my guess. The periods of 0% usage line up with when one AutoDock job stops and another begins, for both the NVIDIA and Intel jobs. Most of the 10 seconds between start and end for the NVIDIA must be spent loading data to and retrieving it from the GPU, since the spikes are much shorter than 10 seconds. I hope this is taken care of before it gets out of beta, since it results in pretty inefficient use of my GPU.

Nvidia: INFO:[21:00:37] Start AutoDock for ZINC001075819252_RX1--6lu7_001--CYS145.dpf(Job #31)...
Intel: INFO:[20:52:01] Start AutoDock for ZINC000492810983_RX1--6lu7_001--CYS145.dpf(Job #1)... |
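The per-job timing described here can be read straight out of a task's stderr log. A minimal Python sketch, assuming the `INFO:[HH:MM:SS] Start AutoDock ... (Job #N)...` and `INFO:[HH:MM:SS] End AutoDock...` line shapes seen in the logs quoted in this thread; the function name and the sample list are illustrative only.

```python
import re
from datetime import datetime, timedelta

START = re.compile(r"INFO:\[(\d{2}:\d{2}:\d{2})\] Start AutoDock for .*\(Job #(\d+)\)")
END = re.compile(r"INFO:\[(\d{2}:\d{2}:\d{2})\] End AutoDock")


def job_durations(lines):
    """Map job number to wall-clock seconds between its Start and End lines."""
    durations = {}
    current = None  # (job number, start time) of the job in progress, if any
    for line in lines:
        start = START.search(line)
        if start:
            current = (int(start.group(2)), datetime.strptime(start.group(1), "%H:%M:%S"))
            continue
        end = END.search(line)
        if end and current is not None:
            job, started = current
            ended = datetime.strptime(end.group(1), "%H:%M:%S")
            if ended < started:  # the log only has clock times; handle midnight rollover
                ended += timedelta(days=1)
            durations[job] = int((ended - started).total_seconds())
            current = None
    return durations


SAMPLE = [
    "INFO:[17:32:00] Start AutoDock for ZINC000491224981_1_RX1--6lu7_001--CYS145_wcgsplit2.dpf(Job #0)...",
    "OpenCL device: GeForce GTX 660M",
    "INFO:[17:34:09] End AutoDock...",
    "INFO:[17:34:09] Start AutoDock for ZINC000157935703_RX1--6lu7_001--CYS145.dpf(Job #1)...",
]

if __name__ == "__main__":
    print(job_durations(SAMPLE))  # prints {0: 129}
```

Comparing these per-job durations with the length of the utilization spikes would show how much of each job is spent outside the GPU kernel. A job with no End line, such as one that crashed, is simply left out.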
||
|
|