World Community Grid Forums
Thread Status: Active Total posts in this thread: 5
GameboyRMH
Cruncher Joined: Jan 31, 2018 Post Count: 3 Status: Offline
Hey all,
Recently I've started working at a company where part of the job is stress testing custom-built computers for stability and cooling performance before they're sent to customers. Right now they're simply running a CPU stress test which, as a computing task, is quite useless. I'm trying to come up with an improved stress testing procedure, and as part of that I would like to have the computers do some kind of useful work. WCG tasks in BOINC are a prime candidate. However, the nature of BOINC tasks doesn't mesh well with the requirements for this test, and it will require some creative thinking to make it work.

The test must be:

- Entirely portable - long-term installation or individual configuration are deal-killers; this has to be a plug & play operation that can be started with a few clicks and leaves the computers as they were before. Portable applications that can run from a removable device (perhaps with help from networked storage) or a network-booted environment are both options.
- No long-term commitment - testing may run for as little as 1 hour or as long as overnight. Considering the length of BOINC tasks, this means that a task may have to be started on one machine and continued on another. We also want a quick start, with no more than 5 minutes of waiting before processing begins.
- Automatically fitting the host machine - these computers have different numbers of apparent cores and we want them all to be used. We also want to use lots of RAM and may intentionally tie up RAM unused by BOINC tasks with some other RAM test such as a memtester job.
- Scalable - the number of computers testing at one time generally ranges from one to six right now. Jobs that have been started should continue to be processed by any available computers, to minimize the chances of a started job expiring before it's finished.

Some ideas I've come up with so far:

1. Network-booting into a diskless Beowulf cluster node. Low-friction and meets all the goals, but huge initial setup effort, and inefficient.
2. Network-booting into one of a number of regular desktop GNU/Linux installations with BOINC installed, with some auto-config scripts. Relatively easy and efficient, but interferes somewhat with the "no long-term commitment" and scalability goals.
3. A number of portable BOINC installs with auto-config scripts. Similar to the previous option but even less elegant :-P

Any more ideas? Could BoincTasks be helpful for controlling job distribution with ideas 2 or 3, to keep jobs from sitting idle when a Linux netboot install isn't booted / portable agent isn't running?
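The "automatically fitting the host machine" requirement could be handled by a small auto-config script along these lines. This is only a sketch for a Linux host: the function names, the 500 MiB per-task estimate, and the 512 MiB headroom are assumptions for illustration, not figures from BOINC or WCG. It counts logical CPUs, reads `MemAvailable` from `/proc/meminfo`, and works out how much RAM is left over for a separate RAM test such as memtester.

```python
import os


def mem_available_mb(meminfo_text):
    """Return MemAvailable from /proc/meminfo-style text, in MiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # /proc/meminfo reports this value in kB
            return int(line.split()[1]) // 1024
    raise ValueError("MemAvailable not found")


def memtester_mb(available_mb, boinc_mb_per_task, tasks, headroom_mb=512):
    """RAM (MiB) left for a RAM test after reserving room for BOINC tasks.

    boinc_mb_per_task and headroom_mb are assumed values; measure real
    task memory use on your own machines before relying on them.
    """
    spare = available_mb - boinc_mb_per_task * tasks - headroom_mb
    return max(spare, 0)


if __name__ == "__main__":
    threads = os.cpu_count() or 1  # logical CPUs ("apparent cores")
    with open("/proc/meminfo") as f:
        avail = mem_available_mb(f.read())
    # Assumption: one BOINC task per thread, roughly 500 MiB each
    print(f"threads={threads} memtester_mb={memtester_mb(avail, 500, threads)}")
```

The printed figure could then be passed to whatever RAM test is started alongside the BOINC client.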
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7745 Status: Offline
I would suggest putting a Linux installation on several flash drives. Each flash drive can be used in one machine. Once BOINC is installed and configured, you may be able to use the same installation on multiple machines. When you do the initial configuration, use short queues, enabling just enough tasks to engage every available thread. Stick to MIP and MCM, as they are the shortest units.

The Linux sticks boot quickly and you do not even need any hard drives to be engaged. I use Linux sticks on several systems without hard drives, but I have not tried switching the sticks to different machines yet.

Good luck

Cheers
Sgt. Joe
*Minnesota Crunchers*
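One way to script the "short queues" setting is to generate a preferences override file for each stick. The sketch below assumes the BOINC client reads a `global_prefs_override.xml` from its data directory and honours the `work_buf_min_days`, `work_buf_additional_days`, and `max_ncpus_pct` elements; check these against the BOINC client documentation for your version. The function name `build_override` and the default values are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def build_override(min_days=0.01, extra_days=0.0, cpu_pct=100.0):
    """Build a short-queue preferences override as an XML string.

    min_days / extra_days: how much work to buffer, in days (assumed values;
    0.01 days is roughly 15 minutes of queued work).
    cpu_pct: percentage of logical CPUs the client may use.
    """
    root = ET.Element("global_preferences")
    ET.SubElement(root, "work_buf_min_days").text = f"{min_days:.6f}"
    ET.SubElement(root, "work_buf_additional_days").text = f"{extra_days:.6f}"
    ET.SubElement(root, "max_ncpus_pct").text = f"{cpu_pct:.6f}"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Write the file next to the script; copy it into the BOINC data
    # directory of the portable install (location depends on your setup).
    with open("global_prefs_override.xml", "w") as f:
        f.write(build_override())
```

With a buffer this small, the client should fetch only about enough tasks to keep every thread busy, which limits how much started work is stranded when a machine is shut down after its test.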
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
> Any more ideas? Could BoincTasks be helpful for controlling job distribution with ideas 2 or 3 to keep jobs from sitting idle when a Linux netboot install isn't booted / portable agent isn't running?

The key 'manager' difference between BOINC Manager and BOINCTasks is the latter's ability to be connected to multiple clients concurrently. In a dual-boot environment, for remote devices it will see which one is online and report the other instance as 'not connected'. That's it: it can perform actions on a per-client basis, but it does nothing for distributing jobs. The BOINC clients themselves, to which BT is connecting, would get highly upset, since they keep a continuous inventory of the files they have received from projects, in which client_state.xml is pivotal. In short, BT can't see or do anything unless a client is online.
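The "online" versus "not connected" distinction can be approximated with a plain TCP reachability check. This sketch is not how BOINCTasks itself works; it only tests whether something accepts connections on the BOINC GUI RPC port (31416 by default) and performs no authentication or RPC calls. The function names and the example host addresses are hypothetical.

```python
import socket

BOINC_GUI_RPC_PORT = 31416  # default BOINC GUI RPC port


def client_online(host, port=BOINC_GUI_RPC_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat as 'not connected'
        return False


def online_clients(hosts, port=BOINC_GUI_RPC_PORT, timeout=2.0):
    """Return the subset of hosts that currently accept connections."""
    return [h for h in hosts if client_online(h, port, timeout)]


if __name__ == "__main__":
    # Hypothetical addresses of the test-bench machines
    bench = ["192.168.1.101", "192.168.1.102"]
    print(online_clients(bench))
```

A check like this only reports which clients are up; as noted above, it gives no way to move a started task from one client to another.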
hchc
Veteran Cruncher USA Joined: Aug 15, 2006 Post Count: 825 Status: Offline
GameboyRMH said:

> The test must be:
>
> - Entirely portable - long-term installation or individual configuration are deal-killers, this has to be a plug & play operation that can be started with a few clicks and leaves the computers as they were before. Portable applications that can run from a removable device (perhaps with help from networked storage) or a network-booted environment are both options.
> - No long-term commitment - testing may run for as little as 1 hour or as long as overnight. Considering the length of BOINC tasks, this means that a task may have to be started on one machine and continued on another. We also want a quick start, no more than 5 minutes of waiting before processing begins.
> - Automatically fitting the host machine - these computers have different numbers of apparent cores and we want them all to be used. We also want to use lots of RAM and may intentionally tie up RAM unused by BOINC tasks with some other RAM test such as a memtester job.
> - Scalable - the number of computers testing at one time generally ranges from one to six right now, jobs that have been started should continue to be processed by any available computers to minimize the chances of a started job expiring before it's finished.

I think distributed.net's RC5-72 project fits the bill for CPU stress testing and meets all your requirements above except for the RAM part. Their client app is portable: it just unzips into a folder, and you can put this folder on a USB flash drive.

The RC5-72 cryptology project supports AVX instructions, which tend to stress CPUs pretty well and increase the heat output. They have another project called OGR-28, but its work units aren't all the same "size" and can take anywhere from minutes to hours. With RC5-72, you can say "only fetch 30 minutes of work at a time", "only fetch 60 minutes of work at a time", or "fetch 8 hours of work at a time", and the ETA is pretty accurate.

You can use the same portable app and unfinished work units on a different machine, which may be useful for your use case. On startup, it simply detects the new CPU and works accordingly. It doesn't use a lot of RAM, as the client has been developed for efficiency, so you'll have to use a separate memory test to really stress the RAM.

distributed.net is one of the original distributed computing projects, started in the 1990s. If you want to get more complex, distributed.net does support creating a proxy server that downloads all the work units, so that all your computers fetch work from this proxy server instead of directly from the Internet. Up to you if you want to read up on how to set all that up.

Another original project is GIMPS, the Great Internet Mersenne Prime Search. I think their Prime95 application is portable, and it is well known for stress-testing CPUs. It also supports AVX instructions. I've never used GIMPS myself, but it's popular for stress testing.

With BOINC projects, you'll create a new device/host ID for each computer you run it on, which may or may not annoy you if you like clean statistics. If you test 100 computers for just an hour each, you'll have 100 new device/host IDs with barely any work done. I think that's sloppy, but I tend to be somewhat OCD about clean stats.
[Edit 2 times, last edit by hchc at Jan 16, 2020 10:50:27 AM]
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Hi,

Are there any Linux images for the Linux config, installation templates, or checklists you would suggest?

Thanks
Digi