Total posts in this thread: 14
This topic has been viewed 3479 times and has 13 replies
Boca Raton Community HS
Advanced Cruncher
Joined: Aug 27, 2021
Post Count: 118
Advice for a new-ish group getting started

Good evening,

I was hoping this was the right spot to gain some advice. A little background:

I am a high school science teacher at a public school. Over a year ago, I had a crazy idea to bring a "slice of supercomputing" into my high school for students and teachers to use. I knew that I would never be able to actually set up an honest "supercomputer" that is massively parallel, but I believed that I could bring in some of the same hardware that can be found in various high-end computers/supercomputers. Students could use the workstation for ANY academic purpose (they can schedule time on it for AI, coding, design work, etc.) and teachers could log in remotely during the day and use it for any classroom purpose. Free access for all.

I did not know how much money I could raise for this project, but we ended up being extremely successful (and I want the project to keep expanding!). The workstation is now fully operational, and we are working to roll it out for student use. We are also putting in a 220 V line to maximize electrical efficiency and trying our best to come up with a great cooling solution (it has to be in a secured case).

My goal of this project is to ALWAYS have this workstation benefitting someone, with an obvious focus on student/teacher use. That being said, it is too great of a tool to sit idle when not in use. Because I am a science teacher, I was already aware of WCG. I have formally received the "okay" to implement this software in classrooms in my school with a focus on the science of the projects and the computing aspects of it.

Although I have tested the workstation on WCG, I am now working to "fine tune" the performance. We want to maximize the science being completed. I would love it if there were more projects on WCG that used GPUs, but I know there are other projects out there that do.

Here is my question: what advice can you all offer to maximize the science output of this new workstation? This workstation has only one purpose: to benefit the community and the world. So, how can we best do that? Here are the specs:

Chassis: Dell Precision Tower 7920
CPU: Dual Intel Xeon Gold 6258R, 2.7 GHz (4.0 GHz Turbo), 28 cores each = 56 cores, 112 threads
RAM: 512GB DDR4 2933MHz ECC
GPU: DUAL Nvidia RTX A6000, 48GB each, soon NVLinked = 96GB
Storage: Four (4) M.2 1TB PCIe NVMe Class 50 Solid State Drive in RAID 10 along with a storage array that features ten (10) 1.92TB SATA AG Enterprise Solid State Drives in RAID 10
Display: 8K, 32 inches

I know some of you might wonder why I went Intel: we did not have much of a choice when it came to allowable options.

It's a monster. Let's do science! Thank you for any advice you can offer.
----------------------------------------
[Edit 1 times, last edit by Boca Raton Community HS at Oct 13, 2021 2:13:52 AM]
[Oct 13, 2021 2:12:08 AM]
thunder7
Senior Cruncher
Netherlands
Joined: Mar 6, 2013
Post Count: 232
Re: Advice for a new-ish group getting started

What OS does it run? Linux is often more efficient for WCG work.
[Oct 13, 2021 4:41:13 AM]
Boca Raton Community HS
Advanced Cruncher
Joined: Aug 27, 2021
Post Count: 118
Re: Advice for a new-ish group getting started

Right now we are running on Windows 10 Pro for Workstations. Students and teachers are more familiar with Windows than Linux, so they are more likely to use the workstation. I have considered a dual-boot option, though. Much of the machine learning/AI software also prefers Linux. I have never set up a dual-boot system, but I don't think it is very difficult.
[Oct 13, 2021 10:40:57 AM]
adriverhoef
Master Cruncher
The Netherlands
Joined: Apr 3, 2009
Post Count: 2106
Re: Advice for a new-ish group getting started

Right now we are running on Windows 10 Pro for Workstations. Students and teachers are more familiar with Windows than Linux so they are more likely to use the workstation.

So they are more familiar with Windows. If they really want to LEARN something, why not make them familiar with something else?

Linux is an important tool for serious computer users.

1. It's used on nearly every server
2. It's standard for development environments
3. It has a powerful native terminal and shell
4. It empowers users to solve problems
5. It doesn't limit users' access to critical systems
6. It has higher stability than other systems
7. It's open source and free
8. It's more secure
9. It's more flexible than anything else
10. No one is watching you, unless you want them to
11. You can brag to people on the internet

(Source: https://www.maketecheasier.com/reasons-learn-use-linux/)
[Oct 13, 2021 12:00:46 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7595
Re: Advice for a new-ish group getting started

I have not ever set up a dual boot system but I don't think it is very difficult.

No, it is not very difficult. I would find another system on which to practice. I have done it a few times and it was as easy as following the prompts. You can be as sophisticated as you desire with partitions, virtual machines etc. or just let the install take care of getting it initially running.
Don't just confine yourself to the Windows universe. There are many comparable applications available for Linux and with many (not all) of them you get more flexibility and functionality.
Let us know how the project proceeds.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Oct 13, 2021 2:12:18 PM]
Stiwi
Advanced Cruncher
Joined: May 19, 2012
Post Count: 73
Re: Advice for a new-ish group getting started

Very nice system, but since the CPU cache is not very large, you shouldn't run many ARP tasks simultaneously.
On Windows there is a problem with BOINC and more than 64 threads. As far as I know there is no solution.
https://boinc.berkeley.edu/trac/wiki/WinMulticore
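
One way to act on the "don't run many ARP at once" advice is BOINC's `app_config.xml`, which supports a per-application `max_concurrent` limit. The sketch below only generates the file text. The short application name `arp1` and the limit of 4 are assumptions for illustration, so check the real application name in your client's event log or `client_state.xml` before using it.

```python
from xml.sax.saxutils import escape


def build_app_config(app_name: str, max_concurrent: int) -> str:
    """Return app_config.xml text that caps how many tasks of one app run at once."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    return (
        "<app_config>\n"
        "  <app>\n"
        f"    <name>{escape(app_name)}</name>\n"
        f"    <max_concurrent>{int(max_concurrent)}</max_concurrent>\n"
        "  </app>\n"
        "</app_config>\n"
    )


# "arp1" is an assumed short name for the ARP application; 4 is a placeholder limit.
print(build_app_config("arp1", 4))
```

The resulting file would go in the project's folder inside the BOINC data directory, and the client picks it up after "Read config files" or a restart.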
----------------------------------------
[Edit 1 times, last edit by Stiwi at Oct 13, 2021 5:06:09 PM]
[Oct 13, 2021 5:05:36 PM]
Boca Raton Community HS
Advanced Cruncher
Joined: Aug 27, 2021
Post Count: 118
Re: Advice for a new-ish group getting started

Hm, I thought the cache size was adequate on these processors: L1 is 1.75 MB, L2 is 28 MB, and L3 is 38.5 MB. When I first start up BOINC, it definitely does not use 100% of the CPU. After it runs for a while, it seems to do a better job assigning cores/threads. It is hard to tell if it is using them efficiently, but all threads/cores are running at 100% after a while. It runs 112 work units at the same time, and BOINC recognizes 112 CPUs when I benchmark. It looks like it runs an ARP work unit in about 28 hours.
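
To put those cache figures in perspective, here is a rough sketch of how much cache each running task gets when the totals are split evenly. It assumes the quoted numbers are per-socket totals and that one socket carries 56 threads (28 cores with two hardware threads each, from the spec list). An even split is a simplification, since L1 and L2 are private to each core while L3 is shared.

```python
# Cache sizes quoted in the post, assumed here to be per-socket totals in MB.
CACHE_MB = {"L1": 1.75, "L2": 28.0, "L3": 38.5}
THREADS_PER_SOCKET = 56  # 28 cores x 2 hardware threads, from the spec list


def cache_per_task(cache_mb: dict, running_tasks: int) -> dict:
    """Split each cache level evenly across the tasks running on one socket."""
    if running_tasks < 1:
        raise ValueError("running_tasks must be at least 1")
    return {level: size / running_tasks for level, size in cache_mb.items()}


# Compare a fully loaded socket with lighter loads.
for tasks in (THREADS_PER_SOCKET, 28, 14):
    share = cache_per_task(CACHE_MB, tasks)
    print(tasks, {level: round(mb, 3) for level, mb in share.items()})
```

With every thread busy, each task's share of L3 comes out well under 1 MB, which is why a cache-hungry application can slow down when many copies run together.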
[Oct 13, 2021 5:28:33 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7595
Re: Advice for a new-ish group getting started

With 28 hours for an ARP unit, I think you are running a bit slower than you are capable of running. You may be bottlenecked on cache or perhaps on your internal I/O systems. My suggestion would be to run a mix of all the work units and play with the ratios to try to optimize throughput. For a little insight, read about Amdahl's Law.
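
For reference, Amdahl's Law says that if a fraction p of the work scales across n workers and the rest does not, the overall speedup is 1 / ((1 - p) + p / n). The sketch below just evaluates that formula. The fractions used are made-up illustrations, not measurements, and independent work units fit the model only loosely, with contention for cache or memory playing the part of the non-scaling fraction.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Speedup predicted by Amdahl's Law for a given parallel fraction and worker count."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Illustrative fractions only; 112 matches the thread count in the spec list.
for p in (0.50, 0.90, 0.99):
    print(f"p={p:.2f}: {amdahl_speedup(p, 112):.1f}x on 112 threads")
```

The takeaway is that even a small non-scaling share caps the gain from adding more threads.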
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Oct 13, 2021 6:46:06 PM]
Boca Raton Community HS
Advanced Cruncher
Joined: Aug 27, 2021
Post Count: 118
Re: Advice for a new-ish group getting started

Is there a way to determine if it is the cache? There are usually not more than a few ARP work units running at any given time. Also, do you all think that there would be any improvement in performance based on the array it is installed on? I currently have it on the OS array (default install). I would not think this would be a bottleneck, but what do you all think based on the array info I have above? I can provide the info on the RAID controllers if that helps.

Also, I know the memory might not be the fastest; it was the fastest ECC memory that Dell offered.
[Oct 14, 2021 12:49:34 AM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7595
Re: Advice for a new-ish group getting started

Is there a way to determine if it is the cache? There are usually not more than a few ARP work units running at any given time. Also, do you all think that there would be any improvement in performance based on the array it is installed on? I currently have it on the OS array (default install). I would not think this would be a bottleneck, but what do you all think based on the array info I have above? I can provide the info on the RAID controllers if that helps.

Also, I know the memory might not be the fastest; it was the fastest ECC memory that Dell offered.

The only reason I think there is a bottleneck someplace is that I have an i7-3770 (4 cores, 8 threads) which finishes an ARP unit in around 19 hours running under Windows 7. Your chips are much newer. Offhand, I do not know of a way to check cache misses (which may not be the problem). One thing I did find is that these chips take 6-channel memory rather than 8-channel. Maybe that has an effect; it might be clogging the memory-to-CPU I/O. Maybe check Task Manager to see what is being utilized.
How do your times look for the other projects?
Maybe have somebody in your IT department look at the system; they may be able to find where the clogs are. I am pretty sure it has nothing to do with disk I/O.
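
Per-task time and total output are different things, so a quick back-of-the-envelope comparison may help. This sketch converts "hours per task" into "tasks per day" for a given number of concurrent tasks. It assumes, purely for illustration, that every thread on each machine runs ARP at the quoted time (8 at 19 hours and 112 at 28 hours), which is not how either machine is necessarily configured.

```python
def tasks_per_day(concurrent_tasks: int, hours_per_task: float) -> float:
    """Tasks finished per day when a fixed number run side by side at a steady rate."""
    if concurrent_tasks < 0:
        raise ValueError("concurrent_tasks must not be negative")
    if hours_per_task <= 0:
        raise ValueError("hours_per_task must be positive")
    return concurrent_tasks * 24.0 / hours_per_task


# Hypothetical loads: all threads busy with ARP at the times quoted in the thread.
i7_rate = tasks_per_day(8, 19.0)
xeon_rate = tasks_per_day(112, 28.0)
print(f"i7-3770: {i7_rate:.1f} tasks/day, dual Xeon: {xeon_rate:.1f} tasks/day")
```

Trying different mixes and re-running this kind of calculation with measured times is one way to see which ratio gives the most completed work.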
Edit:spelling
Good Luck
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
----------------------------------------
[Edit 1 times, last edit by Sgt.Joe at Oct 14, 2021 3:05:14 AM]
[Oct 14, 2021 3:04:07 AM]