Thread Status: Active
Total posts in this thread: 369
Posts: 369   Pages: 37   [ Previous Page | 20 21 22 23 24 25 26 27 28 29 | Next Page ]
This topic has been viewed 341284 times and has 368 replies
PecosRiverM
Veteran Cruncher
The Great State of Texas
Joined: Apr 27, 2007
Post Count: 1053
Status: Offline
Re: DDD2 Type B work units going out.

Looks like they went really fast too.
----------------------------------------

[Jan 28, 2010 12:34:15 AM]
I need a bath
Senior Cruncher
USA
Joined: Apr 12, 2007
Post Count: 347
Status: Offline
Re: DDD2 Type B work units going out.

Gosh, I hope this beta goes well. I really am looking forward to DDD2.
----------------------------------------

[Jan 28, 2010 12:38:54 AM]
GIBA
Ace Cruncher
Joined: Apr 25, 2005
Post Count: 5374
Status: Offline
Re: DDD2 Type B work units going out.

Got some of these new ones updated and released.
Hope it all goes smoothly, despite being a beta test...
----------------------------------------
Cheers! GIB@
Join BRASIL - BRAZIL@GRID team and be very happy !
http://www.worldcommunitygrid.org/team/viewTeamInfo.do?teamId=DF99KT5DN1

[Jan 28, 2010 12:55:10 AM]
HutchNYC
Advanced Cruncher
United States
Joined: Nov 27, 2005
Post Count: 97
Status: Offline
Re: DDD2 Type B work units going out.

I have a few questions about my current settings as they apply to these betas.

The PC I'm currently using is an i7-920 with 4GB running Vista 64-bit. I run BOINC 24/7 with WCG as the only project. This machine has been running this way for almost a year now.

I can usually browse the internet, check mail, work on spreadsheets, etc. while crunching and have never had any noticeable lag or slowdown while doing this. When I'm running these DDD2 betas, though, there is a VERY significant slowdown.

The HD also seems to be in a perpetual read/write state. I'm not sure if this is due to pagefile/VM activity because of the higher memory requirements, or because of my checkpointing settings. (Most likely both.)

I had always run with the "checkpoint at most every 10 seconds" setting. This is no doubt more frequent than I really need it to be, as the PC stays on 24/7, but it had never been an issue before.

I bumped the checkpoint up to 120 seconds, but I don't see any improvement in system performance.

This isn't a big deal while a few betas are running, but if users have DDD2 selected in their project mix I can see the potential problem of complaints that "WCG is causing my system to crawl".

Any ideas/thoughts/suggestions on the best settings for us to use once this project goes into regular production?

I'm not complaining. Just hoping you might have some suggestions that might make the lag less noticeable.

Thanks,
Hutch
----------------------------------------
[Jan 28, 2010 1:40:02 AM]
smeyer55
Senior Cruncher
Joined: Feb 15, 2009
Post Count: 303
Status: Offline
Re: DDD2 Type B work units going out.

I got some on my i7-920. It looks like they're going to take about 3 hours to run.
I'm also seeing disk activity every few seconds with 8 betas running at the same time. I don't notice any system slowdown though.

steve
[Jan 28, 2010 2:00:01 AM]
gb009761
Master Cruncher
Scotland
Joined: Apr 6, 2005
Post Count: 2977
Status: Offline
Re: DDD2 Type B work units going out.

Good news, :)

Credit should have been given to 672 results. These were the work units for the type B.se that were cancelled a few hours ago.

Thanks,
-Uplinger


Thanks for that, Uplinger. I know you didn't have to do it (after all, situations like this are all part of the "fun" of beta testing), but it's much appreciated.
----------------------------------------

[Jan 28, 2010 3:45:44 AM]
MrWizard
Cruncher
Joined: Nov 16, 2004
Post Count: 4
Status: Offline
Re: DDD2 Type B work units going out.

There is too much disk activity by this program. You need to buffer your writes to solv*.trj and solv*.rst and not write to them constantly. There is no way I would let a production application run like this on my computers.
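
A minimal sketch of the buffering suggested above, in Python. It is an illustration only and not the DDD2 application's actual code: the class name `BufferedTrajectoryWriter`, the file name and the flush threshold are all hypothetical. Frames accumulate in memory and are appended to disk in one batch once the buffer fills.

```python
import os
import tempfile


class BufferedTrajectoryWriter:
    """Collect frames in memory and append them to disk in batches.

    Hypothetical illustration: not taken from the DDD2 application.
    """

    def __init__(self, path, flush_every=100):
        self.path = path
        self.flush_every = flush_every  # frames to hold before touching the disk
        self.pending = []
        self.disk_writes = 0  # counts physical append operations

    def add_frame(self, frame):
        self.pending.append(frame)
        if len(self.pending) >= self.flush_every:
            self.flush()

    def flush(self):
        if not self.pending:
            return
        with open(self.path, "a", encoding="utf-8") as handle:
            handle.write("".join(line + "\n" for line in self.pending))
        self.pending.clear()
        self.disk_writes += 1


def demo():
    """Write 250 frames and report (disk writes, frames found on disk)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "solv_demo.trj")  # hypothetical file name
        writer = BufferedTrajectoryWriter(path, flush_every=100)
        for step in range(250):
            writer.add_frame(f"frame {step}")
        writer.flush()  # write the final partial batch
        with open(path, encoding="utf-8") as handle:
            frames_on_disk = sum(1 for _ in handle)
    return writer.disk_writes, frames_on_disk
```

The trade-off is that anything still in the buffer is lost if the task is killed, so a real application would presumably need to flush at each checkpoint as well.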
[Jan 28, 2010 5:28:11 AM]
Ingleside
Veteran Cruncher
Norway
Joined: Nov 19, 2005
Post Count: 974
Status: Offline
Re: DDD2 Type B work units going out.

I had always run with the "checkpoint at most every 10 seconds" setting. This is no doubt more frequent than I really need it to be, as the PC stays on 24/7, but it had never been an issue before.

I bumped the checkpoint up to 120 seconds, but I don't see any improvement in system performance.

This isn't a big deal while a few betas are running, but if users have DDD2 selected in their project mix I can see the potential problem of complaints that "WCG is causing my system to crawl".

Any ideas/thoughts/suggestions on the best settings for us to use once this project goes into regular production?

An already-started task won't pick up the new "write to disk" setting until it has exited (been removed from memory), so you're likely still checkpointing once every 10 seconds.

Since each task checkpoints independently, this basically means you'll be checkpointing about once per second, and if you're not running a fairly new BOINC client (v6.10.xx), each checkpoint triggers a rewrite of client_state.xml. That can kill performance if you've got many tasks (actually quite few is enough for some of the WCG sub-projects that use lots of files per task...)

With v6.10.xx clients, checkpoint information goes to an individual file per task, so there is no BOINC overhead for frequent checkpointing... The application, on the other hand, can still write large files if it's programmed that way, and this can slow down performance.

In any case, there's generally no reason to use less than 60 seconds for the checkpoint interval, and if you continue having problems, try bumping it up to 10 minutes or so...
----------------------------------------


"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
[Jan 28, 2010 5:48:44 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: DDD2 Type B work units going out.

I had always run with the "checkpoint at most every 10 seconds" setting. This is no doubt more frequent than I really need it to be, as the PC stays on 24/7, but it had never been an issue before.

I bumped the checkpoint up to 120 seconds, but I don't see any improvement in system performance.

HutchNYC

1. Changing the interval takes effect upon restarting the client or on the next job. A job, once loaded into memory, retains the initial setting of 10 seconds.

2. There's been some back and forth with the developers about writing system-wide checkpoints, i.e. in your case once per 10 seconds for the whole client, versus once per 10 seconds per job. With 8 concurrent jobs, the latter means that, if there is a checkpoint to write, one gets written about every 1.25 seconds.
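
The 1.25-second figure can be checked with a small Python sketch. The function names are made up for illustration, and it assumes every job checkpoints exactly once per interval and that the jobs are spread evenly rather than synchronized.

```python
def seconds_between_checkpoints(interval_seconds, concurrent_jobs):
    """Average gap between checkpoint writes across the whole client.

    Assumes each job checkpoints once per interval and the jobs are
    evenly staggered (an idealization, not measured behaviour).
    """
    if concurrent_jobs < 1:
        raise ValueError("need at least one running job")
    return interval_seconds / concurrent_jobs


def writes_per_hour(interval_seconds, concurrent_jobs):
    """Total checkpoint writes per hour under the same assumption."""
    return 3600 * concurrent_jobs / interval_seconds


if __name__ == "__main__":
    for interval in (10, 60, 300):
        gap = seconds_between_checkpoints(interval, 8)
        rate = writes_per_hour(interval, 8)
        print(f"{interval:>3} s setting, 8 jobs: one write every {gap} s, {rate:.0f} per hour")
```

With 8 jobs, going from a 10-second to a 60-second setting cuts the write count from 2880 to 480 per hour under these assumptions.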

The default of 60 seconds has a point. No one bothers too much about losing a minute on system restart if there is a checkpoint written every minute or less. I've got it on 5 minutes, to keep the disk whirring down. In your case, with 8 cores, that'd be a loss of 2.5 minutes per job on average, IF the science application is programmed to checkpoint that frequently and is able to. You would not want to checkpoint often with large checkpoint files.
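
As a rough Python sketch of where the 2.5 minutes comes from: if a restart lands at a random moment between two checkpoints, the work lost is on average half the interval. This assumes the application really does checkpoint once per interval; the function names are hypothetical.

```python
def average_loss_minutes(interval_seconds):
    """Expected work lost per job on a restart, in minutes.

    Assumes the restart moment is uniformly random between two
    checkpoints, so on average half an interval is lost.
    """
    return interval_seconds / 2 / 60


def total_loss_minutes(interval_seconds, concurrent_jobs):
    """Expected work lost across all running jobs on one restart."""
    return average_loss_minutes(interval_seconds) * concurrent_jobs


if __name__ == "__main__":
    for interval in (60, 300, 600):
        per_job = average_loss_minutes(interval)
        total = total_loss_minutes(interval, 8)
        print(f"{interval} s setting: {per_job} min per job, {total} min over 8 jobs")
```

For a machine that stays on 24/7 and restarts rarely, even the 10-minute setting costs only about 5 minutes per job per restart on this estimate.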

As for the overall project concern... there will not be too many B types per target. The bulk of the project is C types and as I understand it there will be a mix of ABC as the batches cycle through.

Edit: Had 4 of the pe type running concurrently... did not notice any slowdown, with 2.5 GB RAM allowed for use by BOINC and LAIM on.

edit: with a 5 minute write setting MY client log looks like this (Ingleside does not like checkpoint logging... else he'd not be able to see the real problems ;-)

28/01/2010 07:32:43 World Community Grid [checkpoint_debug] result CMD2_0315-MYH14.clustersOccur-2IAE_B.clustersOccur_193_0 checkpointed
28/01/2010 07:33:38 World Community Grid [checkpoint_debug] result BETA_erlc_a218_pe0000_2 checkpointed
28/01/2010 07:33:45 World Community Grid [checkpoint_debug] result BETA_erlc_a189_pe0000_0 checkpointed
28/01/2010 07:37:21 World Community Grid [checkpoint_debug] result BETA_erlc_a215_pe0000_2 checkpointed
28/01/2010 07:38:05 World Community Grid [checkpoint_debug] result CMD2_0315-MYH14.clustersOccur-2IAE_B.clustersOccur_193_0 checkpointed
28/01/2010 07:38:52 World Community Grid [checkpoint_debug] result BETA_erlc_a189_pe0000_0 checkpointed
28/01/2010 07:38:54 World Community Grid [checkpoint_debug] result BETA_erlc_a218_pe0000_2 checkpointed
28/01/2010 07:42:37 World Community Grid [checkpoint_debug] result BETA_erlc_a215_pe0000_2 checkpointed
28/01/2010 07:43:08 World Community Grid [checkpoint_debug] result CMD2_0315-MYH14.clustersOccur-2IAE_B.clustersOccur_193_0 checkpointed
28/01/2010 07:43:56 World Community Grid [checkpoint_debug] result BETA_erlc_a189_pe0000_0 checkpointed
28/01/2010 07:43:59 World Community Grid [checkpoint_debug] result BETA_erlc_a218_pe0000_2 checkpointed
28/01/2010 07:47:44 World Community Grid [checkpoint_debug] result BETA_erlc_a215_pe0000_2 checkpointed
28/01/2010 07:48:17 World Community Grid [checkpoint_debug] result CMD2_0315-MYH14.clustersOccur-2IAE_B.clustersOccur_193_0 checkpointed
28/01/2010 07:48:58 World Community Grid [checkpoint_debug] result BETA_erlc_a189_pe0000_0 checkpointed
28/01/2010 07:49:13 World Community Grid [checkpoint_debug] result BETA_erlc_a218_pe0000_2 checkpointed
28/01/2010 07:52:45 World Community Grid [checkpoint_debug] result BETA_erlc_a215_pe0000_2 checkpointed
28/01/2010 07:53:22 World Community Grid [checkpoint_debug] result CMD2_0315-MYH14.clustersOccur-2IAE_B.clustersOccur_193_0 checkpointed
28/01/2010 07:53:59 World Community Grid [checkpoint_debug] result BETA_erlc_a189_pe0000_0 checkpointed
28/01/2010 07:54:24 World Community Grid [checkpoint_debug] result BETA_erlc_a218_pe0000_2 checkpointed
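
To read the actual checkpoint spacing out of a log like the one above, a small Python sketch along these lines could be used. It assumes the line layout shown here (day/month/year timestamp, then `[checkpoint_debug] result NAME checkpointed`); the function name and the `SAMPLE` list are just for illustration, with the sample lines copied from the log.

```python
import re
from collections import defaultdict
from datetime import datetime

# Matches lines in the layout shown in the log above (assumed, not a BOINC spec).
LINE = re.compile(
    r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) .*"
    r"\[checkpoint_debug\] result (\S+) checkpointed$"
)


def checkpoint_gaps(lines):
    """Map each result name to the seconds between its successive checkpoints."""
    times = defaultdict(list)
    for line in lines:
        match = LINE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%d/%m/%Y %H:%M:%S")
            times[match.group(2)].append(stamp)
    return {
        name: [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]
        for name, stamps in times.items()
    }


SAMPLE = [
    "28/01/2010 07:33:38 World Community Grid [checkpoint_debug] result BETA_erlc_a218_pe0000_2 checkpointed",
    "28/01/2010 07:38:54 World Community Grid [checkpoint_debug] result BETA_erlc_a218_pe0000_2 checkpointed",
    "28/01/2010 07:43:59 World Community Grid [checkpoint_debug] result BETA_erlc_a218_pe0000_2 checkpointed",
]

if __name__ == "__main__":
    print(checkpoint_gaps(SAMPLE))
```

For the three sample lines the gaps come out at 316 and 305 seconds, i.e. just over the 5-minute setting.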
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 2 times, last edit by Sekerob at Jan 28, 2010 7:00:42 AM]
[Jan 28, 2010 6:43:27 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: DDD2 Type B work units going out.

Can someone tell me what this means:

BETA_erlc_a029_se0000_1-- 614 Server Aborted 27.01.10 16:27:02 27.01.10 21:51:31 0.00 0.0 / 0.0
BETA_erlc_a029_se0000_2-- 614 Server Aborted 27.01.10 16:27:02 27.01.10 23:59:27 0.00 0.0 / 0.0
BETA_erlc_a029_se0000_0-- 614 Too Late 27.01.10 16:26:59 27.01.10 22:32:00 3.76 45.3 / 45.3 <---- MINE

My result was returned 6 hours after it was sent out, and that is TOO LATE...???
[Jan 28, 2010 11:33:39 AM]