World Community Grid Forums
Thread Status: Active Total posts in this thread: 781
|
Author |
|
William Albert
Cruncher Joined: Apr 5, 2020 Post Count: 39 Status: Offline |
I've shut down two machines equipped with an Nvidia GeForce GT 720 and an Intel HD Graphics 530.
There wasn't a large enough supply of Intel WUs to keep the Intel GPUs running, and the GT 720 is slow enough that about half of the Nvidia WUs were timing out at around the 98% mark.
These are bog-standard Lenovo desktop PCs (with a basic video card for additional displays) that you'd find in any number of schools and businesses throughout the world. While these GPUs are no match for the powerful Nvidia and AMD GPUs used in workstations and enthusiast rigs, they're still more powerful than common desktop CPUs, and there are enough of them that writing them off as too slow might end up excluding a lot of aggregate power from participating in OPNG (or any future GPU-powered applications).
Anyway, I've documented my issues with the GeForce GT 720 if an admin wants to follow up.
----------------------------------------
[Edit 1 times, last edit by William Albert at Apr 27, 2021 10:42:11 PM] |
||
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline |
"Right now I have 0 GPU work units and have not received any in the last 20 hours so are the GPU work units still going out to the people"
Can you post your message log? There should be a good supply for any GPUs able to process the tasks. |
||
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline |
"There wasn't a large enough supply of Intel WUs to keep the Intel GPUs running, and the GT 720 is slow enough that about half of the Nvidia WUs were timing out at the 98% mark or so."
Can you post your message log from when your computer attempted to request work for the Intel GPUs? There should be plenty of supply available at this point for everyone who asks for work. |
||
|
William Albert
Cruncher Joined: Apr 5, 2020 Post Count: 39 Status: Offline |
I don't have those particular PCs online anymore, but an example log from another PC that is also unable to get any Intel WUs is below.
This computer has hardware specs identical to another one that is happily crunching Intel WUs.
|
||
|
Azmodes
Cruncher Joined: Apr 4, 2017 Post Count: 3 Status: Offline |
"thanks uplinger, any comment about what's going on with the low GPU utilization (lots of GPU idle time) of the 5-digit batches? you had mentioned that you thought they should run fast. I even confirmed that the process is constantly coming on and off the GPU. you can catch times running nvidia-smi where it shows the wcg application isn't even running on the GPU, while BOINC shows it running. and it'll constantly pop in and out. this is much different than all the tasks before, where even if the sub jobs were starting and stopping, nvidia-smi still recognized that the application was running on the GPU."
I'm seeing the exact same thing. GPU utilization is way down, CPU time ends up being only a quarter of the task (whereas it was about 100% before), and the processes keep showing up and vanishing again in nvidia-smi. Unsurprisingly, runtimes appear to be longer.
[Edit 1 times, last edit by Azmodes at Apr 27, 2021 11:03:57 PM] |
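For anyone who wants to put a number on the "popping in and out" behaviour described above, here is a minimal Python sketch. It polls `nvidia-smi --query-compute-apps=pid,process_name --format=csv,noheader` and reports the fraction of samples in which a process whose name contains a given fragment was listed. The fragment `wcgrid_opng` and the helper names are assumptions for illustration only; check the actual executable name on your own host.

```python
import subprocess
import time


def parse_compute_apps(csv_text):
    """Parse 'pid, process_name' rows from nvidia-smi CSV output (no header)."""
    apps = []
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue
        pid_text, _, name = line.partition(",")
        try:
            pid = int(pid_text)
        except ValueError:
            continue  # skip rows that do not start with a numeric pid
        apps.append((pid, name.strip()))
    return apps


def is_on_gpu(apps, name_fragment):
    """True if any listed compute process name contains the fragment."""
    fragment = name_fragment.lower()
    return any(fragment in name.lower() for _, name in apps)


def presence_ratio(samples, name_fragment):
    """Fraction of samples in which the process was listed on the GPU."""
    if not samples:
        return 0.0
    hits = sum(1 for apps in samples if is_on_gpu(apps, name_fragment))
    return hits / len(samples)


def sample_once():
    """Ask nvidia-smi for the compute processes currently on the GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-compute-apps=pid,process_name",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return parse_compute_apps(result.stdout)


def watch(name_fragment="wcgrid_opng", count=60, interval=1.0):
    """Poll `count` times; the default fragment is an assumed app name."""
    samples = []
    for _ in range(count):
        samples.append(sample_once())
        time.sleep(interval)
    return presence_ratio(samples, name_fragment)
```

Calling `watch()` while BOINC shows a GPU task as running gives a rough presence ratio. A value well below 1.0 would be consistent with what is reported here, though one-second polling can miss short gaps.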
||
|
m0320174
Cruncher Joined: Feb 13, 2021 Post Count: 11 Status: Offline |
"Right now I have 0 GPU work units and have not received any in the last 20 hours so are the GPU work units still going out to the people"
"Can you post your message log? There should be good supply for any GPU's able to process the tasks."
That's a bit optimistic. I'm trying to build up a cache of (Nvidia) GPU workunits, but it's impossible because the majority of my requests are unsuccessful:
04/28/21 01:07:50 | World Community Grid | Requesting new tasks for NVIDIA GPU
04/28/21 01:07:51 | World Community Grid | Scheduler request completed: got 0 new tasks
04/28/21 01:07:51 | World Community Grid | No tasks sent
04/28/21 01:07:51 | World Community Grid | No tasks are available for OpenPandemics - COVID 19
04/28/21 01:07:51 | World Community Grid | No tasks are available for OpenPandemics - COVID-19 - GPU
04/28/21 01:07:51 | World Community Grid | No tasks are available for Africa Rainfall Project
04/28/21 01:07:51 | World Community Grid | No tasks are available for Microbiome Immunity Project
04/28/21 01:07:51 | World Community Grid | No tasks are available for Help Stop TB
04/28/21 01:07:51 | World Community Grid | No tasks are available for Smash Childhood Cancer
04/28/21 01:07:51 | World Community Grid | Tasks for Intel GPU are available, but your preferences are set to not accept them
This is not that much of an issue because I currently have a buffer of 8 GPU workunits, sufficient for roughly 1 hour of processing. But this is not what I call a good supply. I also ran out of work twice in the last couple of hours.
So, I think there are still some optimizations to be done on the server side. |
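To back up a claim like "the majority of my requests are unsuccessful" with a count, a saved copy of the event log can be tallied. The following is a minimal Python sketch that relies only on the "Scheduler request completed: got N new tasks" wording visible in the log above. The function name and the returned dictionary keys are made up for illustration.

```python
import re

# Matches the completion line shown in the log excerpt above.
REQUEST_DONE = re.compile(r"Scheduler request completed: got (\d+) new tasks")


def summarize_requests(log_text):
    """Count scheduler requests, empty replies, and tasks received."""
    requests = 0
    empty = 0
    tasks = 0
    for line in log_text.splitlines():
        match = REQUEST_DONE.search(line)
        if match is None:
            continue  # not a request-completion line
        received = int(match.group(1))
        requests += 1
        tasks += received
        if received == 0:
            empty += 1
    return {"requests": requests, "empty": empty, "tasks": tasks}
```

Feeding it the text of the log gives the share of empty replies as `empty / requests`, which is easier to compare between hosts than a single excerpt.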
||
|
gordonbb
Cruncher Canada Joined: May 14, 2019 Post Count: 19 Status: Offline |
I just spun up 10 more Nvidia GPUs across 5 systems that were on F@H during the expensive Time-of-Use Electricity rates here (nice that these tasks are lighter in terms of power load) and had no issues getting their job queues loaded with GPU tasks. The TOU period is ending so I've set the systems to "No New Tasks" and am emptying the queues and will put them back on F@H. At least these GPU jobs are short compared to the CPU jobs so I won't run out of CPU tasks even on my 3950x :-)
Things are much better with the big jobs compared to this time yesterday. I've stopped the scripts that were forcing transfers as they are no longer needed. I'm seeing only the very occasional transfer back-off.
Looking forward to these tasks coming out of "beta-beta" and into production just in time for the Air Conditioning season here in the Northern Hemisphere.
Too bad they're "Not Quite Ready for Prime Time" as this would have been an excellent candidate for the BOINC Pentathlon GPU event.
----------------------------------------
AMD - 2600x, 2 x 2700, 2700x, 3900x, 3950x, 2 x 5900x, 5950x
Intel - E3-1231v3, 9900K
NVidia - GTX 1060 6GB, 1660ti, 1070ti; RTX 2060, 2060s, 2070a, 5 x 2070s |
||
|
maeax
Advanced Cruncher Joined: May 2, 2007 Post Count: 142 Status: Offline |
"No problem with Einstein@Home!"
"And Einstein has longer running tasks which might not expose the issues with your SSD. You can’t really compare apples and oranges. Like I said, I’m processing at a MUCH higher volume on OPNG, with no SSD issues. If it was a generic SSD issue, someone like me with many more writes would see this issue too, but we don’t. That points to your issue being related to something with your system specifically."
Nothing changed on the hardware side, but OPNG tasks have been running well for a few hours now. I don't know why, but that's OK :-)
AMD Ryzen Threadripper PRO 3995WX 64-Cores/ AMD Radeon (TM) Pro W6600. OS Win11pro
|
||
|
Dayle Diamond
Senior Cruncher Joined: Jan 31, 2013 Post Count: 452 Status: Offline |
Uplinger, I've been reporting that I periodically get "Scheduler request failed: Couldn't connect to server" since the GPU project entered Beta.
|
||
|
Unixchick
Veteran Cruncher Joined: Apr 16, 2020 Post Count: 1000 Status: Offline |
I got an error with this WU (OPNG_0015013_00013_4). I wasn't the only one to get an error, so I'm guessing it's the WU and not me.
|
||
|
|