| World Community Grid Forums
Thread Status: Active Total posts in this thread: 15
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hi all,
I'm using BOINC 5.8.16 with the BOINC Manager. I tried this thread, but in the install under GNU/Linux there is a separate projects directory and a separate slots directory. Only 3 checkpoints have been shown in the messages window, all of them in the morning around 11:00 a.m. I exited the program in the evening in the hope that it would make a checkpoint, or give some indication, when I started it again. I even suspended the task, but while that set me back by a couple of percentage points, there was no indication of when the last checkpoint happened or when the next one will happen.
Tuesday 15 May 2007 11:08:33 PM IST||Starting BOINC client version 5.8.16 for i686-pc-linux-gnu
Tuesday 15 May 2007 11:08:33 PM IST||log flags: task, file_xfer, sched_ops, checkpoint_debug
Tuesday 15 May 2007 11:08:33 PM IST||Libraries: libcurl/7.16.0 OpenSSL/0.9.8d zlib/1.2.3
Tuesday 15 May 2007 11:08:33 PM IST||Data directory: /home/shirish/boinc/BOINC
Tuesday 15 May 2007 11:08:33 PM IST||Processor: 1 GenuineIntel Intel(R) Pentium(R) 4 CPU 1.80GHz [Family 15 Model 1 Stepping 2][fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm up]
Tuesday 15 May 2007 11:08:33 PM IST||Memory: 622.57 MB physical, 1.91 GB virtual
Tuesday 15 May 2007 11:08:33 PM IST||Disk: 60.50 GB total, 47.17 GB free
Tuesday 15 May 2007 11:08:33 PM IST|World Community Grid|URL: http://www.worldcommunitygrid.org/; Computer ID: 185430; location: (none); project prefs: default
Tuesday 15 May 2007 11:08:33 PM IST||General prefs: from World Community Grid (last modified 2007-05-12 20:37:21)
Tuesday 15 May 2007 11:08:33 PM IST||Host location: none
Tuesday 15 May 2007 11:08:33 PM IST||General prefs: using your defaults
Tuesday 15 May 2007 11:08:33 PM IST||Reading preferences override file
Tuesday 15 May 2007 11:08:35 PM IST|World Community Grid|Restarting task lb862_00071_14 using hpf2 version 519
Tuesday 15 May 2007 11:58:46 PM IST|World Community Grid|[task_debug] task_state=QUIT_PENDING for lb862_00071_14 from preempt
Tuesday 15 May 2007 11:58:48 PM IST|World Community Grid|[task_debug] Process for lb862_00071_14 exited
Tuesday 15 May 2007 11:58:48 PM IST|World Community Grid|[task_debug] task_state=UNINITIALIZED for lb862_00071_14 from handle_exited_app
Tuesday 15 May 2007 11:58:48 PM IST|World Community Grid|[task_debug] exit status 0
Tuesday 15 May 2007 11:58:57 PM IST||[task_debug] ACTIVE_TASK::start(): forked process: pid 15123
Tuesday 15 May 2007 11:58:57 PM IST|World Community Grid|[task_debug] task_state=EXECUTING for lb862_00071_14 from start
Tuesday 15 May 2007 11:58:57 PM IST|World Community Grid|Restarting task lb862_00071_14 using hpf2 version 519
The slots directory has only 2 slots. Slot 0 has an entry named wcg_hpf2.last_pdb. Is this the last checkpoint, or is it something else? The last checkpoint, which was made at 11 a.m. local time, is wcg_checkpoint_02.ckp. It's now 1:05 a.m. Thursday local time.
Content of the cc_config file:
<cc_config>
<log_flags>
<checkpoint_debug>1</checkpoint_debug>
<task_debug>1</task_debug>
</log_flags>
</cc_config>
Looking forward to any help, suggestions, or improvements on this. |
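For anyone who wants to produce the same cc_config content without typing the XML by hand, here is a minimal sketch. The function name `build_cc_config` is made up for illustration; the two flag names are the ones from the post above. It only builds and prints the text. Saving it as cc_config.xml in the BOINC data directory (the log above reports /home/shirish/boinc/BOINC) and restarting the client is left as a manual step, and that location is an assumption to check against your own install.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Return cc_config XML text with each named log flag set to 1.

    Hypothetical helper for illustration only; it is not part of BOINC.
    """
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        # Each flag becomes <name>1</name> inside <log_flags>.
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


# The two flags shown in the post above.
config_text = build_cc_config(["checkpoint_debug", "task_debug"])
print(config_text)
```

The output is the same structure as the file quoted in the post, just on a single line.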
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Each slot is used by a different work unit that has been started and is in progress. The slot stores the progress files and (over)writes a new one each time a checkpoint is saved. There are several tens of attempts packed into an HPF2 work unit, each ending with a save.
Here is a brief explanation from the Unofficial BOINC Wiki: http://boinc-wiki.ath.cx/index.php?title=Slots_Directory
Resuming a job does not tell you which checkpoint was used. It merely tells you from what progress point, in CPU time and estimated percentage, the computation picks up again. There is no prediction of a future checkpoint. If checkpoint logging is activated and the message log shows a series at, e.g., 25-minute intervals, that suggests the next one will probably also happen about 25 minutes after the last. With non-deterministic calculations, though, it's very possible you will see material fluctuation.
WCG
----------------------------------------Please help to make the Forums an enjoyable experience for All! [Edit 1 times, last edit by Sekerob at May 15, 2007 8:07:00 PM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Is it possible to have a user-defined checkpoint interval? Something like making a checkpoint every 30 minutes or every hour, depending on the user's needs? Or is it determined by the project itself? |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Hi shirish,
I think you need to (re)read the checkpoint saving post in the Start Here forum. We can NOT influence when a checkpoint save is made.
Sekerob
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello shirish,
I wish we could checkpoint at regular intervals. But an application program will normally be working with arrays holding tens or even hundreds of megabytes of intermediate computations. The checkpoint code is inserted in the application program at points where a (relatively) small amount of data can capture the progress made. This varies from project to project. I believe that Genome Comparison offers many such points, so it checkpoints regularly; it is just a matter of checking the clock when a potential checkpoint is reached. But most application programs are much more difficult and always checkpoint when they reach a suitable point, which may take a good deal of time to reach after the last one. There is some discussion of this in Start Here at http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=11332
Lawrence
Added: I originally wrote that GC checkpoints at 10-minute intervals. Then I noticed that Sekerob said 20 minutes in his post. A search found a comment by Didactylos that said 10 minutes, but did not find the original information that we were working from. Well, I will stick with 10 minutes. But I may be wrong.
Added even later: 20-minute checkpoints for GC. Sekerob has proved his claim.
Even later: Maybe GC checkpoints every 20 minutes in Italy but every 10 minutes elsewhere? Something peculiar is going on here.
[Edit 3 times, last edit by Former Member at May 16, 2007 5:51:22 PM] |
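To make the "checking the clock when a potential checkpoint is reached" idea concrete, here is a rough sketch. It is not the actual code of any WCG application and does not use the BOINC API; `run_work`, the step lists, and the 600-second interval are all invented for illustration, and the clock is simulated so the example runs instantly.

```python
def run_work(steps, min_interval):
    """Simulate a job and return the times (in seconds) of its checkpoints.

    steps is a list of (duration_seconds, is_safe_point) tuples. Both the
    function and the data layout are hypothetical, for illustration only.
    """
    clock = 0.0
    last_checkpoint = 0.0
    checkpoints = []
    for duration, is_safe_point in steps:
        clock += duration
        # A checkpoint is only possible where the state is small enough to
        # save, and is only taken if enough time has passed since the last one.
        if is_safe_point and clock - last_checkpoint >= min_interval:
            checkpoints.append(clock)
            last_checkpoint = clock
    return checkpoints


# Many safe points: checkpoints land close to the requested interval.
frequent = [(60, True)] * 30
# One safe point in 30 minutes: the gap is set by the work, not by the clock.
sparse = [(60, False)] * 24 + [(60, True)] + [(60, False)] * 5

print(run_work(frequent, 600))
print(run_work(sparse, 600))
```

With frequent safe points the checkpoints fall at regular intervals; with a single safe point the only checkpoint comes when the work reaches it, however long that takes. That is the difference described above between Genome Comparison and the harder applications.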
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
My information was empirical. It was true at the time, but Genome Comparison has had at least one upgrade since then, so things may have changed.
|
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
I'm keeping mum, except to say that this is what I see (WISIWYGT) when activating the checkpoint logging feature from v 5.8.16:
2007-04-27 12:12:59 [World Community Grid] [checkpoint_debug] result 10000530-10001728_2 checkpointed
2007-04-27 12:33:26 [World Community Grid] [checkpoint_debug] result 10000530-10001728_2 checkpointed
2007-04-27 12:53:40 [World Community Grid] [checkpoint_debug] result 10000530-10001728_2 checkpointed
2007-04-27 13:13:45 [World Community Grid] [checkpoint_debug] result 10000530-10001728_2 checkpointed
2007-04-27 13:34:05 [World Community Grid] [checkpoint_debug] result 10000530-10001728_2 checkpointed
2007-04-27 13:54:27 [World Community Grid] [checkpoint_debug] result 10000530-10001728_2 checkpointed
2007-04-27 14:14:39 [World Community Grid] [checkpoint_debug] result 10000530-10001728_2 checkpointed
2007-04-27 14:34:44 [World Community Grid] [checkpoint_debug] result 10000530-10001728_2 checkpointed
WCG
Please help to make the Forums an enjoyable experience for All! |
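For anyone who would rather compute the spacing than eyeball it, here is a small sketch that takes the eight timestamps from the log above and works out the gap between consecutive checkpoints. The function name `checkpoint_intervals` is made up for illustration, and the timestamp format is assumed to match this particular log; other client versions and platforms print dates differently.

```python
from datetime import datetime

# Timestamps copied from the checkpoint_debug lines above.
timestamps = [
    "2007-04-27 12:12:59",
    "2007-04-27 12:33:26",
    "2007-04-27 12:53:40",
    "2007-04-27 13:13:45",
    "2007-04-27 13:34:05",
    "2007-04-27 13:54:27",
    "2007-04-27 14:14:39",
    "2007-04-27 14:34:44",
]


def checkpoint_intervals(stamps):
    """Return the gaps between consecutive timestamps, in minutes."""
    times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in stamps]
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]


intervals = checkpoint_intervals(timestamps)
mean_minutes = sum(intervals) / len(intervals)

print([round(m, 1) for m in intervals])
print(round(mean_minutes, 2))
```

For this log every gap comes out between 20 and 21 minutes, so a rough guess for the next checkpoint is the last timestamp plus about 20 minutes, with the usual caveat that the spacing can fluctuate.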
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
That sure looks like a 20-minute checkpoint interval to me. I guess it changed during one of the upgrades. I'll try to remember that. |
||
|
|
retsof
Former Community Advisor USA Joined: Jul 31, 2005 Post Count: 6824 Status: Offline Project Badges:
|
It still depends on the computer. This one's doing about every 10.
5/16/2007 10:34:23 AM|World Community Grid|[checkpoint_debug] result 10000344-10001872_0 checkpointed
5/16/2007 10:44:23 AM|World Community Grid|[checkpoint_debug] result 10000344-10001872_0 checkpointed
5/16/2007 10:54:25 AM|World Community Grid|[checkpoint_debug] result 10000344-10001872_0 checkpointed
5/16/2007 11:04:27 AM|World Community Grid|[checkpoint_debug] result 10000344-10001872_0 checkpointed
5/16/2007 11:14:29 AM|World Community Grid|[checkpoint_debug] result 10000344-10001872_0 checkpointed
5/16/2007 11:24:30 AM|World Community Grid|[checkpoint_debug] result 10000344-10001872_0 checkpointed
5/16/2007 11:34:38 AM|World Community Grid|[checkpoint_debug] result 10000344-10001872_0 checkpointed
5/16/2007 11:44:43 AM|World Community Grid|[checkpoint_debug] result 10000344-10001872_0 checkpointed
SUPPORT ADVISOR
----------------------------------------Work+GPU i7 8700 12threads School i7 4770 8threads Default+GPU Ryzen 7 3700X 16threads Ryzen 7 3800X 16 threads Ryzen 9 3900X 24threads Home i7 3540M 4threads50% [Edit 1 times, last edit by retsof at May 16, 2007 5:22:24 PM] |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Now I'm curious. I seem to remember that the perpetual writing to disk was moved into RAM, with the price being more widely spaced checkpoints. Could it be driven by RAM size? I run 1 thread on 1.5 GB, but I'm seeing 20 minutes on both a P4 and a C2D.
This is a question maybe Viktors could answer.
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
|