Thread Status: Active
Total posts in this thread: 7
This topic has been viewed 1179 times and has 6 replies
giddie
Cruncher
UK
Joined: Nov 21, 2006
Post Count: 29
Status: Offline
Linux Boinc clients heavy disk activity

Boinc 6.12.34
ArchLinux x86-64

I'm trying to set up BOINC clients on diskless nodes that are part of a cluster. That means that all disk I/O has to go over a network to an NFS server.

The problem I'm seeing is that the clients are not respecting the disk_interval setting (I've tried increasing it to 600, with no effect). Instead, in each slot directory, I see a couple of files that are being written at least once a *second*:

boinc_mmap_file is written in this way in each slot. The other file is different for each slot:

boinc_dsfl_0
boinc_gfam_1
boinc_dsfl_2
boinc_c3cw_3

Any idea why Boinc is going crazy with these files? I'm not seeing this on otherwise identical systems (that have disks).
[Feb 22, 2012 1:57:42 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Linux Boinc clients heavy disk activity

Whatever you're seeing can't be right. The only science not following the "Write To Disk" limit is CEP2, and that one has a maximum of only 16 checkpoints over a run time of up to 12 hours. WtD does not depend on "Run based on preferences"... it's one of those settings that always works.

How many nodes are writing back to your NFS server, and how many concurrent threads are running BOINC? If thousands, then yes, that could get to once per second, but only at the NFS server end, not for the individual node threads.

And for sure, I'd strongly advise against running CEP2 on your setup. That one is likely to kill efficiency, but the experts are invited to contradict me and explain why.

One thought: some firewalls have the habit of showing localhost traffic. That never leaves the node, but I don't expect you run security software at the device level.

--//--

edit: Is this 6.12.34 a build by the distro or one fetched from Berkeley?
----------------------------------------
[Edit 1 times, last edit by Former Member at Feb 22, 2012 2:16:01 PM]
[Feb 22, 2012 2:14:48 PM]
giddie
Cruncher
UK
Joined: Nov 21, 2006
Post Count: 29
Status: Offline
Re: Linux Boinc clients heavy disk activity

The BOINC build is from the distro (boinc-nox). The same package doesn't show the same problem on "normal" installations.

I am testing this with:

# cd /var/lib/boinc/slots/0
# watch -n1 ls -lat --full-time

The top two files show an updated timestamp every time watch updates.

The problem for the network is the *number* of these small file writes. I'm not sure exactly how many writes are occurring per second, but it's clearly at least once per file per slot per machine.
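To put a rough number on this, a minimal sketch along the lines of the `watch` test above could poll a file's modification time and count how often it changes. The function name `count_mtime_changes` is made up for illustration, and it assumes GNU `stat` (as shipped with coreutils on Arch). Because it polls once per second, it only gives a lower bound: several writes within the same second count as one.

```shell
#!/bin/sh
# count_mtime_changes FILE SECONDS   (hypothetical helper, not part of BOINC)
# Polls FILE's modification time once per second for SECONDS seconds and
# prints how many polls saw a new timestamp. This is a lower bound on the
# number of writes, since multiple writes within one second count once.
count_mtime_changes() {
    file=$1
    duration=$2
    changes=0
    # %Y prints the mtime as seconds since the epoch (GNU stat)
    last=$(stat -c %Y "$file") || return 1
    i=0
    while [ "$i" -lt "$duration" ]; do
        sleep 1
        now=$(stat -c %Y "$file") || return 1
        if [ "$now" != "$last" ]; then
            changes=$((changes + 1))
            last=$now
        fi
        i=$((i + 1))
    done
    echo "$changes"
}
```

For example, `count_mtime_changes /var/lib/boinc/slots/0/boinc_mmap_file 60` would print a value close to 60 if the file really is touched every second, and 0 if it is left alone for the whole minute.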

I'll look into CEP2; thanks for the tip. Our use case for this cluster involves pretty large I/O anyway, and simple tests show pretty decent throughput even with multiple nodes writing simultaneously.
[Feb 23, 2012 11:24:27 AM]
mikey
Veteran Cruncher
Joined: May 10, 2009
Post Count: 826
Status: Offline
Re: Linux Boinc clients heavy disk activity

Check out the Dotsch Linux on a Server or a USB disk. He writes it himself and should be a helpful resource. Do a search and you will find him.
[Feb 23, 2012 1:06:05 PM]
giddie
Cruncher
UK
Joined: Nov 21, 2006
Post Count: 29
Status: Offline
Re: Linux Boinc clients heavy disk activity

Thanks, but that's not quite suited to my setup. I already have a diskless environment set up, and I'm hoping to run BOINC on the nodes to harvest wasted time when the cluster is up but we have no jobs to run.

I'd really appreciate some ideas as to why these files might be written so frequently, or other tests I might run to figure out what's going on.
[Feb 28, 2012 3:48:14 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Linux Boinc clients heavy disk activity

I know at least 2 guys who run render and computing-for-rent farms, the former in a PXE setup, diskless AFAIK. The guy with the render farm saves the nodes' client "as is" when he's got a job and reloads them whenever they're free to BOINC. I never heard him mention perpetual disk writing to a staging drive; in fact, why would it act any differently than on a local host? The WTD is adhered to, but that said, maybe he's using a RAMDISK-type setup with a minimal work queue to contain the memory needs. Here are some Google hits on BOINC on diskless nodes: http://www.google.it/search?q=BOINC+on+diskle...cial&client=firefox-a

Sorry, but not much of a help on this. You could always send a message to support@worldcommunitygrid.org f.a.o techs. When talking hundreds of devices/cores, they will stretch to assist you as best they can.

--//--
----------------------------------------
[Edit 1 times, last edit by Former Member at Feb 28, 2012 4:05:41 PM]
[Feb 28, 2012 4:03:11 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Linux Boinc clients heavy disk activity

I had a setup with 4 diskless nodes at one time. Currently, I run just 2 diskless nodes for BOINC. I have experimented with Linux on diskless nodes (note: I am basically a Linux noob). All of the machines use a shared folder from a Ramdrive on the PXE server to store the BOINC WUs. This works REALLY well for dedicated CEP2 crunching (which I do when I'm not trying to get badges). I had multiple "odd" issues in my experiments with both Linux Mint 12 and Ubuntu 11.10. I could never identify the exact issues, but I assume they are related to the way file shares work over Samba (the PXE server is a Windows Server 2008 R2 machine). The Windows machines, on the other hand, crunch like there's no tomorrow.

I will tell you that using my RAMDRIVE as a file share from the PXE server, I was able to run 40 concurrent threads of CEP2 over gigabit LAN with no issues whatsoever. Based on the data I had collected, the potential existed to run as many as 80 threads of CEP2 simultaneously. However, CEP2 requires a lot of disk space compared to the other work units, and since I had only 32GB of space on my RAMDRIVE, 40 threads was about the limit of what I could handle without some of the machines choking and not receiving new work units because less than 2GB of free space was available.
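As a back-of-the-envelope check on those numbers, here is a rough sketch of the average space each thread can use. It assumes the 2GB free-space floor applies to the shared RAMDRIVE as a whole, and the helper name `per_thread_budget_mb` is made up for illustration. The result is an average budget derived from the figures above, not a measured CEP2 requirement.

```shell
#!/bin/sh
# per_thread_budget_mb TOTAL_GB RESERVE_GB THREADS   (hypothetical helper)
# Prints the average space in MB available to each thread once the
# reserved free space is set aside (integer division, 1 GB = 1024 MB).
per_thread_budget_mb() {
    echo $(( ($1 - $2) * 1024 / $3 ))
}

# 32GB RAMDRIVE, 2GB kept free, 40 threads
per_thread_budget_mb 32 2 40   # prints 768
```

By the same arithmetic, 80 threads on the same drive would leave about 384 MB each, so under these assumptions disk space rather than the network would be the limit.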
[Feb 29, 2012 10:22:51 PM]