Thread Status: Active
Total posts in this thread: 156
This topic has been viewed 341130 times and has 155 replies
genes
Advanced Cruncher
USA
Joined: Jan 28, 2006
Post Count: 132
Re: Poor crunching on this project

As of this morning, several tasks have completed and now show the expected CPU time in the range of 6 hours, instead of 2 hours. I haven't actually watched to see whether they go to 100%; I will try during the day.

(Edit) This was after replacing Ubuntu 10.04 with Ubuntu 9.04 and the older kernel.
----------------------------------------
[Edit 1 times, last edit by genes at Oct 25, 2010 3:02:42 PM]
[Oct 25, 2010 3:01:40 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: Poor crunching on this project

Shoot me to tears... I stopped the client, ran swapoff -a, checked in the system monitor that no swap space was there, started the client, and the tasks still revert to using virtual memory. I'm not getting it. That leaves only the checkpoint writing and job changeovers as an I/O factor.

Interesting. So far my current Linux device that's only crunching CEP2 has been doing well. Swap is turned on; I check its usage every few days using "top" and it's been at 0k used with 2 GB RAM. Maybe I've got the right combination of hardware and software that works well together. I found this article, Finding Performance Bottlenecks in Linux, and will run some checks to see what performance load/wait/bottleneck I see and to what degree.
My sub-folder of Linux > Performance is ever expanding :Thumbsup: Any takers on that BFS patch I posted a link to a few posts up in this thread?

Yeah, I think it's just the way the science app splits the model store. Before, my top and System Monitor said only 62MB of swap space was used and 1.4GB of RAM, which had been constant for weeks. Now it has for hours logged zero (0) swap space and 1.5GB of RAM use, so the little swap there was has shifted. I seem to *feel* a slight responsiveness improvement with Firefox and other activities. I even did a remote terminal session into a Windows machine and that continued to run fine.

But, all in all it has zero impact on the IO gap time. The last 3 that validated fine:

6.19 cep2 E200482_095_A.26.C18H11N7S.23.0.set1d06_1 03:52:23 (03:30:54) 25-10-2010 17:00 25-10-2010 17:19 Reported: OK
6.19 cep2 E200454_813_A.24.C20H14N2S2.328.3.set1d06_2 07:35:49 (06:53:45) 25-10-2010 15:51 25-10-2010 15:57 Reported: OK (u)
6.19 cep2 E200453_196_A.25.C18H11N5S2.290.2.set1d06_2 07:43:22 (07:09:37) 25-10-2010 11:49 25-10-2010 11:55 Reported: OK (u)
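
The gap on each of these lines can be worked out directly from the two times. A minimal Python sketch, assuming the first time on each line is elapsed (wall-clock) time and the one in parentheses is CPU time; the helper names are made up for illustration:

```python
def to_seconds(hms: str) -> int:
    """Convert an 'hh:mm:ss' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def gap_report(elapsed: str, cpu: str) -> tuple:
    """Return (gap in seconds, CPU time as a percentage of elapsed time)."""
    elapsed_s = to_seconds(elapsed)
    cpu_s = to_seconds(cpu)
    return elapsed_s - cpu_s, 100.0 * cpu_s / elapsed_s


# (elapsed, cpu) pairs copied from the three results listed above.
RESULTS = [
    ("03:52:23", "03:30:54"),
    ("07:35:49", "06:53:45"),
    ("07:43:22", "07:09:37"),
]

for elapsed, cpu in RESULTS:
    gap_s, efficiency = gap_report(elapsed, cpu)
    print(f"{elapsed} ({cpu}): gap {gap_s // 60} min {gap_s % 60} s, "
          f"CPU/elapsed {efficiency:.1f}%")
```

Under that assumption the three gaps come to roughly 21, 42 and 34 minutes, so between about 7% and 9% of the wall-clock time is not counted as CPU time.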
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Oct 25, 2010 3:46:23 PM]
RaymondFO
Veteran Cruncher
USA
Joined: Nov 30, 2004
Post Count: 561
Re: Poor crunching on this project

I will install the patch tonight on the most up to date Ubuntu OS and see how it runs. I still have that spare hard drive to work with.
[Oct 25, 2010 5:05:01 PM]
genes
Advanced Cruncher
USA
Joined: Jan 28, 2006
Post Count: 132
Re: Poor crunching on this project

Here's one that just finished: E200482_154. Mine is the one reporting 6.80 hours. I lost only a minute or two here and there. The other one is typical of what I was getting before.

(edit) Percent done went to about 54%.
----------------------------------------
[Edit 1 times, last edit by genes at Oct 25, 2010 5:36:32 PM]
[Oct 25, 2010 5:34:56 PM]
codes
Advanced Cruncher
Joined: Oct 20, 2009
Post Count: 142
Re: Poor crunching on this project

Running CEP2 24x7, some results of my I/O bottleneck/performance check:

CPU time: 12 Hours
Elapsed time: 12.4 Hours

<result>
<name>E200484_510_A.24.C19H15NOSSi2.11.0.set1d06_1</name>
<final_cpu_time>43200.206700</final_cpu_time>
<final_elapsed_time>44911.250560</final_elapsed_time>
<exit_status>0</exit_status>
<state>4</state>
<platform>i686-pc-linux-gnu</platform>
<version_num>619</version_num>
<stderr_out>

I ran the "sysstat" performance monitoring tools every 10 minutes for a period of 19 hours; below are the averages for that period. I think they look excellent.

07:07:01 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
Average:        all      0.03     97.40      2.50      0.02      0.00      0.05

0.02% = Percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request.

07:07:01 PM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit
Average:       841491   1206097     58.90    258545    689398    332590      8.01

kbcommit and %commit = Amount and percentage of memory needed for current workload.

07:07:01 PM  pswpin/s pswpout/s
Average:         0.00      0.00

0.00 = Total number of swap pages the system brought in/out per second.

07:07:01 PM           DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
Average:     dev8-0 (sda)      1.17      0.07    478.57    407.92      0.26    225.47     10.61      1.24

225.47 = The average time (in milliseconds) for I/O requests issued to the device to be served.
1.24% = Percentage of CPU time during which I/O requests were issued to the device (bandwidth utilization for the device). Device saturation occurs when this value is close to 100%.
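
To put the disk figures into more familiar units, here is a small Python sketch. It relies on sysstat reporting rd_sec/s, wr_sec/s and avgrq-sz in 512-byte sectors; the variable and function names are only illustrative, and the numbers are the averages from the device line above:

```python
SECTOR_BYTES = 512  # sysstat counts rd_sec/s, wr_sec/s and avgrq-sz in 512-byte sectors

# Averages for dev8-0 (sda), copied from the sar output above.
tps = 1.17             # I/O requests issued per second
rd_sec_per_s = 0.07    # sectors read per second
wr_sec_per_s = 478.57  # sectors written per second
avgrq_sz = 407.92      # average request size, in sectors


def sectors_to_kib(sectors: float) -> float:
    """Convert a count of 512-byte sectors to KiB."""
    return sectors * SECTOR_BYTES / 1024


write_kib_per_s = sectors_to_kib(wr_sec_per_s)
request_kib = sectors_to_kib(avgrq_sz)

# Cross-check: requests per second times sectors per request should
# roughly equal the total number of sectors moved per second.
sectors_from_tps = tps * avgrq_sz
sectors_reported = rd_sec_per_s + wr_sec_per_s

print(f"write rate:   {write_kib_per_s:.1f} KiB/s")
print(f"request size: {request_kib:.1f} KiB")
print(f"tps * avgrq-sz = {sectors_from_tps:.1f} vs reported {sectors_reported:.1f} sectors/s")
```

That works out to about 239 KiB/s written in few but large requests (about 204 KiB each), almost all of it writes, which seems consistent with the low %util figure.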
----------------------------------------
[Edit 1 times, last edit by codes at Oct 27, 2010 12:31:17 AM]
[Oct 26, 2010 11:47:37 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: Poor crunching on this project

Just 0.4 hours out of 12.4 going to other things is plain excellent. I missed reading, or don't remember, whether this is a single- or multi-core device, with or without HT on.

If the one beta 6.33 task I received for Windows is a sign (my duo had a 17-minute gap on top of the full 12-hour stretch), then I have restrained hope that it will go away for the most part. Also, once in production and running a mix, if the number of interspersed tasks gets so low that only 1 or 2 run concurrently on a quad, that would be just what the doctor ordered. In that respect the limit of 1 "in progress" per device for duos was perfect... I just wish it would then always automatically backfill and could be manipulated to skip ahead of the queue, which a shorter-than-HCC deadline would nicely facilitate, combined with some switch/connect knobbing of the client.
[Oct 27, 2010 8:33:20 AM]
codes
Advanced Cruncher
Joined: Oct 20, 2009
Post Count: 142
Re: Poor crunching on this project

Just 0.4 hours out of 12.4 going to other things is plain excellent. I missed reading, or don't remember, whether this is a single- or multi-core device, with or without HT on.

It's a 1.8 GHz Celeron (model 430), single core, 512 KB L2 cache, no HT, no OC. 2 GB DDR2 533 MHz RAM. 80 GB SATA HD (I don't remember the cache size). Biostar P4M890-M7 TE motherboard. It's a budget computer I put together a couple of years ago.

I've kept the sysstat data collector running; I'll check the numbers again in about 2 to 3 days, after some more WUs get processed.

There were a few minor %iowait spikes during the 19 hour collection: 0.11, 0.16, 0.17, nothing major though. I don't remember what the %util spikes were during that period. I'll check it periodically.
[Oct 27, 2010 2:12:26 PM]
sk..
Master Cruncher
Joined: Mar 22, 2007
Post Count: 2324
Re: Poor crunching on this project

Can't say I understand everyone in this thread, but it strikes me that the difference between CPU time and elapsed time is due to an app bug that occurs at the end of WU tasks (a work unit is composed of 16 tasks): when each task completes, its run time and CPU time are just not tallied correctly (perhaps a timer is reset before being tallied due to a timeout). Over the run of the WU you might see 16 spikes/troughs corresponding to these events. Such time tally errors would be more likely to occur the more CEP2 tasks are run at the same time.
----------------------------------------
[Edit 1 times, last edit by skgiven at Oct 28, 2010 11:25:29 AM]
[Oct 28, 2010 11:21:48 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Re: Poor crunching on this project

It took some digging to find out how to move the data_dir on Linux from the default location, but it's now on an 8GB USB drive formatted as ext4, which tests show to be superior in performance.

Files to edit, making sure all 3 point to the /media/boincdrive/boinc folder:

/etc/default/boinc-client
/etc/init.d/boinc-client
/etc/boinc-client/cc_config.xml

The one thing I'm a bit worried about is the mounting at boot, which for whatever reason is not immediate, so I built a 60-second delay into the cc_config.xml that remains on the HD. The <option> lines added to this file are:

<data_dir>/media/boincdrive/boinc/</data_dir>
<start_delay>60</start_delay>
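
A fixed delay is a guess at how long the mount takes. As an alternative illustration (not what was done here), a small Python sketch could poll until the mount point actually appears before the client is started. The function name and the idea of running it from a wrapper script are assumptions; only the /media/boincdrive path comes from the setup above:

```python
import os
import time


def wait_for_mount(path: str, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll until `path` is a mount point; return False if `timeout` expires first."""
    deadline = time.monotonic() + timeout
    while True:
        if os.path.ismount(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Quick demonstration on the root filesystem, which is always a mount point.
# For the USB drive one would call wait_for_mount("/media/boincdrive") instead.
print(wait_for_mount("/", timeout=1.0))
```

The 60-second default mirrors the start_delay value, so the worst case is no slower than the fixed delay, while a drive that mounts quickly is picked up right away.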

While working towards this, I blundered on one point: I forgot to recopy the old data_dir AGAIN to the new data_dir before starting the core client the last time... It was dumb not to take the client off-line, so the client fetched more work while I was researching. The server/client connect counter was then off, and all work got removed and new work assigned. The quota is 4, though the device has retained the same ID. Now the first task has to be valid or the client will be idling by this evening... for WCG.

In all this, make sure that all files, after moving to the new location, are still owned by the "boinc" account and assigned to the "boinc" group (your user account should be a member of that group to be able to run BOINC Manager). Navigate in a terminal window to the folder that contains the new boinc directory and issue 2 commands:

sudo chown -R -v boinc ./boinc/
sudo chgrp -R -v boinc ./boinc/
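
To double-check the result afterwards, something like the following Python sketch could list any path that still has the wrong owner or group. The function names are made up for illustration, and it assumes a "boinc" user and group exist on the machine, as described above:

```python
import grp
import os
import pwd


def find_mismatches(root: str, uid: int, gid: int) -> list:
    """Return paths under `root` whose owner or group differ from uid/gid.

    Directories reached through symlinks are not followed.
    """
    mismatches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Check the directory itself, then every file directly inside it.
        for path in [dirpath] + [os.path.join(dirpath, name) for name in filenames]:
            info = os.lstat(path)
            if info.st_uid != uid or info.st_gid != gid:
                mismatches.append(path)
    return mismatches


def ids_for(user: str, group: str) -> tuple:
    """Look up the numeric uid and gid for a user name and a group name."""
    return pwd.getpwnam(user).pw_uid, grp.getgrnam(group).gr_gid


# Example (only works where the boinc account and the directory exist):
#   uid, gid = ids_for("boinc", "boinc")
#   for path in find_mismatches("/media/boincdrive/boinc", uid, gid):
#       print("wrong owner/group:", path)
```

An empty list means the chown and chgrp commands reached every file and directory.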

I set the "Write to disk" interval to 3600 seconds for now to minimize the writes to, and wear on, the USB stick (SanDisk Cruzer).

That's it (not exhaustive), but if more than a few are interested, I'll write it up as a more detailed step-by-step. First things first, though; let's determine whether:

A) results reach at least the Pending Validation state... 3 of 4 are Clean Water so they might go inconclusive.
B) the one CEP2 task does the same and shows less IO gap time.
[Oct 28, 2010 11:28:05 AM]
sk..
Master Cruncher
Joined: Mar 22, 2007
Post Count: 2324
Re: Poor crunching on this project

What do you mean by "IO gap time"?
[Oct 28, 2010 12:20:06 PM]