Total posts in this thread: 12
This topic has been viewed 3014 times and has 11 replies
buscher
Cruncher
Joined: Oct 3, 2011
Post Count: 5
CEP2 high checkpoint times

Hello,

It seems that CEP2 WUs only checkpoint at every 5% of progress?
After reaching 4.5% and restarting (killing) BOINC, I am back at 0%.

While some WUs are very small, others are a lot bigger, so it sometimes takes more than 45 minutes to reach the 5% mark. And believe it or not, sometimes I turn my PC off, so when running CEP2 WUs I often lose a lot(!) of progress. Can't this interval be reduced, or maybe changed to a time-based one?

Or are my 5% mark observations wrong?
Would be great if someone could clarify this :)
I just want to make sure that I am not wasting my CPU time by restarting BOINC (in case I have to).

BTW:
Using BOINC 7.2.42 on Linux
[Aug 6, 2014 12:13:45 PM]
branjo
Master Cruncher
Slovakia
Joined: Jun 29, 2012
Post Count: 1892
Re: CEP2 high checkpoint times

Hi buscher,

It would be great if CEP2 checkpointed as often as every 5%. In reality, there is a huge gap between checkpoints in the middle of the WUs (IIRC after the first one). It is the nature of the science and nothing can be done about it :(

So if you have to turn your PC off regularly (or restart BOINC), it is better not to crunch it. As you correctly stated, it is a waste of your resources. Rather concentrate on MCM1 and FAAH.

Cheers and good luck

ETA: more at http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=11332
----------------------------------------

Crunching@Home since January 13 2000. Shrubbing@Home since January 5 2006

----------------------------------------
[Edited 1 time, last edit by branjo at Aug 6, 2014 1:34:11 PM]
[Aug 6, 2014 1:27:02 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: CEP2 high checkpoint times

Hello buscher,
CEP2 has 16 jobs to run. At the end of each job, it checkpoints the results. Each job is a different type of calculation. The really long jobs are the third and (I think) the thirteenth. These 2 jobs take up more than half the time. The other checkpoints come at short intervals. This peculiar checkpointing is a function of the program design. We put up with it or we don't run CEP2. A number of crunchers avoid CEP2 because they need more frequent checkpointing.

Lawrence
[Aug 6, 2014 1:35:08 PM]
ca05065
Senior Cruncher
Joined: Dec 4, 2007
Post Count: 328
Re: CEP2 high checkpoint times

The content of the E225xxx series work units has changed. They have only 8 jobs in them. The work units used to complete in about 2 hours but now take 6 or 7 hours on an i7-2600k. The long checkpoint intervals occur at the start and end. For one work unit, the intervals between checkpoints were 214, 10, 14, 13, 10, 8 and 161 minutes; it then exited with 0x1 and the final job was skipped.
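
To put the seven intervals quoted above in perspective, here is a small Python sketch that totals them and reports how much of the run sits in the two long stretches. The variable names are illustrative only, and the numbers are simply the ones reported in this post for a single work unit.

```python
# Minutes between successive checkpoints, as reported in this post for one
# E225xxx work unit (the eighth and final job was skipped).
intervals_min = [214, 10, 14, 13, 10, 8, 161]

total_min = sum(intervals_min)            # total run time in minutes
worst_case_loss_min = max(intervals_min)  # most work a single restart can discard
# Fraction of the run spent in the first and last (long) intervals.
long_share = (intervals_min[0] + intervals_min[-1]) / total_min

print(f"total run time: {total_min / 60:.1f} h")
print(f"worst-case loss on restart: {worst_case_loss_min} min")
print(f"share of run time in first and last interval: {long_share:.0%}")
```

On these numbers, the first and last intervals account for most of the run, so a restart during either one throws away hours rather than minutes.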
[Aug 6, 2014 2:03:51 PM]
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1411
Re: CEP2 high checkpoint times

Or are my 5% mark observations wrong?
Would be great if someone could clarify this :)

A bit wrong. A checkpoint is made after each job.
When all 16 jobs have to be done, the share of the total run time taken by each of jobs 0 through 15 is about:

Finished Job #0     0.50%
Finished Job #1     1.28%
Finished Job #2    22.23%
Finished Job #3     1.71%
Finished Job #4     1.16%
Finished Job #5     1.18%
Finished Job #6     1.08%
Finished Job #7     1.56%
Finished Job #8     0.95%
Finished Job #9     1.23%
Finished Job #10    2.65%
Finished Job #11    1.46%
Finished Job #12    8.76%
Finished Job #13   16.53%
Finished Job #14   16.57%
Finished Job #15   21.13%
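
The table above can be turned into a quick estimate of what a restart costs. The sketch below is only an illustration: it assumes the percentages are representative, that progress is measured as a share of run time, and that a task stopped mid-job resumes from the end of the last finished job (as described in this thread). The function names are made up for the example.

```python
from itertools import accumulate

# Per-job share of total run time (percent), copied from the table above.
JOB_SHARE = [0.50, 1.28, 22.23, 1.71, 1.16, 1.18, 1.08, 1.56,
             0.95, 1.23, 2.65, 1.46, 8.76, 16.53, 16.57, 21.13]

# Progress (percent of run time) at which each checkpoint is written,
# one checkpoint per finished job.
CHECKPOINTS = list(accumulate(JOB_SHARE))


def progress_after_restart(progress):
    """Return the progress a task falls back to if stopped at `progress` percent."""
    saved = 0.0
    for mark in CHECKPOINTS:
        if mark <= progress:
            saved = mark
        else:
            break
    return saved


def expected_loss(shares=JOB_SHARE):
    """Average percent of run time lost for a stop at a uniformly random moment.

    A stop lands in job i with probability share_i / total and then loses,
    on average, half of that job's share.
    """
    total = sum(shares)
    return sum(s * s / 2 for s in shares) / total
```

For instance, `progress_after_restart(20.0)` falls back to roughly 1.78%, because Job #2 had not finished yet. Averaged over a stop at a random moment, `expected_loss()` comes out just under 8% of the total run time for this distribution.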

----------------------------------------
[Edited 2 times, last edit by Crystal Pellet at Aug 6, 2014 5:17:29 PM]
[Aug 6, 2014 5:10:31 PM]
buscher
Cruncher
Joined: Oct 3, 2011
Post Count: 5
Re: CEP2 high checkpoint times

Thanks for those great answers! You Rock! :D

But I guess that means I have to disable CEP2 for now, as my "runtimes" are rather "unstable".
[Aug 7, 2014 7:11:21 AM]
gb009761
Master Cruncher
Scotland
Joined: Apr 6, 2005
Post Count: 3010
Re: CEP2 high checkpoint times

But I guess that means I have to disable CEP2 for now, as my "runtimes" are rather "unstable".
Yes, unfortunately that may be so (and, as intimated above, that's one reason why CEP2 is an opt-in project).

Thankfully though, here at WCG we do have two other great projects and another on the way...
[Aug 7, 2014 12:05:20 PM]
cjslman
Master Cruncher
Mexico
Joined: Nov 23, 2004
Post Count: 2082
Re: CEP2 high checkpoint times

Thankfully though, here at WCG we do have two other great projects and another on the way...
... and at least 1 more coming down the road... the beta forum was going bananas last week ;)

CJSL

Gotta keep crunching, there's a world to save !!!
----------------------------------------
I follow the Gimli philosophy: "Keep breathing. That's the key. Breathe."
Join The Cahuamos Team


[Aug 7, 2014 1:15:51 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: CEP2 high checkpoint times

Hi guys!

Just to confirm, the content of the work units has changed a little bit, starting at E225XXX series. I have ordered the jobs within a work unit in such a way that after the first job (which all of the other jobs depend on), they get 'harder' the further on they get. This is to allow the maximum amount of work to be done in any given timeslot (i.e. if the jobs hit the time limit). I am also trying to make sure the jobs sit within an acceptable time slot for both you guys and the guys at IBM. I was under the impression that they checkpointed after each job, but if there is any confusion about this I can go to the IBM techs and confirm :)

Your Harvard CEP Team
[Aug 8, 2014 12:33:47 PM]
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1411
Re: CEP2 high checkpoint times

Hi guys!

Just to confirm, the content of the work units has changed a little bit, starting at E225XXX series. I have ordered the jobs within a work unit in such a way that after the first job (which all of the other jobs depend on), they get 'harder' the further on they get. This is to allow the maximum amount of work to be done in any given timeslot (i.e. if the jobs hit the time limit). I am also trying to make sure the jobs sit within an acceptable time slot for both you guys and the guys at IBM. I was under the impression that they checkpointed after each job, but if there is any confusion about this I can go to the IBM techs and confirm :)

Your Harvard CEP Team

Changed a little bit ... ? The major job is now Job #0, lasting up to 10 hours without checkpointing.
Here are three tasks with their checkpoint times and the duration of each job inside the tasks:

dd-mm-yyyy hh:mm:ss Task                                                 Hours
08-08-2014 18:50:47 Starting task E225091_541
09-08-2014 03:27:33 [checkpoint] result E225091_541 checkpointed Job #0  8.61
09-08-2014 04:26:06 [checkpoint] result E225091_541 checkpointed Job #1  0.98
09-08-2014 04:56:20 [checkpoint] result E225091_541 checkpointed Job #2  0.50
09-08-2014 05:26:13 Computation for task E225091_541 finished Job #3     0.50
exited with RC = 0x24 Skipping Job #4, #5, #6 and #7
Wingman finished Job #6 - exited with RC = 0x1 and skipped Job #7


08-08-2014 18:48:06 Starting task E225090_885
09-08-2014 04:14:48 [checkpoint] result E225090_885 checkpointed Job #0  9.44
09-08-2014 04:50:32 [checkpoint] result E225090_885 checkpointed Job #1  0.60
09-08-2014 05:26:43 [checkpoint] result E225090_885 checkpointed Job #2  0.60
09-08-2014 06:06:00 [checkpoint] result E225090_885 checkpointed Job #3  0.65
09-08-2014 06:35:36 [checkpoint] result E225090_885 checkpointed Job #4  0.49
09-08-2014 07:09:59 [checkpoint] result E225090_885 checkpointed Job #5  0.57
09-08-2014 12:46:43 Computation for task E225090_885 finished Job #6     5.61
exited with RC = 0x1 Skipping Job #7
Wingman in Progress


08-08-2014 18:53:10 Starting task E225089_332
09-08-2014 02:26:43 [checkpoint] result E225089_332 checkpointed Job #0  7.56
09-08-2014 03:04:03 [checkpoint] result E225089_332 checkpointed Job #1  0.62
09-08-2014 03:43:21 [checkpoint] result E225089_332 checkpointed Job #2  0.65
09-08-2014 04:29:18 [checkpoint] result E225089_332 checkpointed Job #3  0.77
09-08-2014 05:03:41 [checkpoint] result E225089_332 checkpointed Job #4  0.57
09-08-2014 05:35:13 [checkpoint] result E225089_332 checkpointed Job #5  0.53
09-08-2014 09:27:40 Computation for task E225089_332 finished Job #6     3.87
exited with RC = 0x1 Skipping Job #7
Wingman stopped after the same Job
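
For anyone who wants to derive the Hours column from their own timestamps, a minimal Python sketch follows. It assumes lines in the `dd-mm-yyyy hh:mm:ss` layout used in this post (a real BOINC event log may be formatted differently), and `job_hours` is just a name chosen for the example. The sample data is the first task listed above with the Hours column left off.

```python
from datetime import datetime

# Timestamped lines for task E225091_541, copied from the listing above
# without the Hours column.
LOG = """\
08-08-2014 18:50:47 Starting task E225091_541
09-08-2014 03:27:33 [checkpoint] result E225091_541 checkpointed Job #0
09-08-2014 04:26:06 [checkpoint] result E225091_541 checkpointed Job #1
09-08-2014 04:56:20 [checkpoint] result E225091_541 checkpointed Job #2
09-08-2014 05:26:13 Computation for task E225091_541 finished Job #3
"""


def job_hours(log_text):
    """Return the hours elapsed between consecutive timestamped lines."""
    stamps = [
        # The timestamp occupies the first 19 characters of each line.
        datetime.strptime(line[:19], "%d-%m-%Y %H:%M:%S")
        for line in log_text.splitlines()
        if line.strip()
    ]
    return [
        round((later - earlier).total_seconds() / 3600, 2)
        for earlier, later in zip(stamps, stamps[1:])
    ]


print(job_hours(LOG))  # [8.61, 0.98, 0.5, 0.5]
```

Each value is the time between one event and the next, so the first entry is the uncheckpointed stretch a restart during Job #0 would lose.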

[Aug 9, 2014 12:24:31 PM]