| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 30
|
|
| Author |
|
|
[VENETO] boboviz
Senior Cruncher Joined: Aug 17, 2008 Post Count: 184 Status: Offline Project Badges:
|
I had to restart my machine twice, and both times I lost my calculation and the task restarted from 0%.
Not so good. |
||
|
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1403 Status: Offline Project Badges:
|
Checkpoints for the African Rainfall Project are set at every 12.5% of progress due to limitations in the application.
Although we were told that 48 hours of data are processed in one task, it is in fact 8 main meteorological hours: 0000UTC, 0600UTC, 1200UTC and 1800UTC, each twice. Once the calculation of one main hour of data has finished, the application is able to write a checkpoint. |
||
|
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12594 Status: Offline Project Badges:
|
It might take up to 3 hours per checkpoint. Did you need to restart your machine? Could you have hibernated it instead? That way it keeps what has been done.
Mike |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Some censored thoughts, but you can set the <checkpoint_debug>1</checkpoint_debug> flag
in cc_config.xml to see the checkpoints printed in the event log. Every 12.5% of progress equals 6 hours of weather simulation. |
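For anyone who has not edited cc_config.xml before, here is a minimal sketch of where that flag would go. It assumes a standard BOINC client with cc_config.xml in the BOINC data directory and no other options set; if the file already exists, only the checkpoint_debug line needs to be added inside the existing log_flags section.

```xml
<!-- cc_config.xml: minimal sketch, assuming no other options are already set -->
<cc_config>
  <log_flags>
    <!-- log a message to the event log each time a task writes a checkpoint -->
    <checkpoint_debug>1</checkpoint_debug>
  </log_flags>
</cc_config>
```

The client has to re-read the file before the flag takes effect, either by restarting the client or by running boinccmd --read_cc_config.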
||
|
|
Michael Goetz
Cruncher United States Joined: Dec 11, 2017 Post Count: 35 Status: Offline Project Badges:
|
The infrequent checkpoints can be very inconvenient, but there's a workaround available if you want to do a little one-time setup work first. Sometimes you do have to reboot, for whatever reason. Stuff happens.
What I sometimes do with apps like this (hey, there are apps that are NEVER able to checkpoint) is run them inside a VM (virtual machine). If you installed the VBOX version of BOINC, you already have VBOX installed on your computer and you can use that. If not, you can install either VBOX or VMware Player separately. Both of them are free. If you need to reboot your (real) computer, just "save" the VM before rebooting. This is the equivalent of "hibernate" -- when you start the VM back up, it picks up exactly where it left off. |
||
|
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12594 Status: Offline Project Badges:
|
My Intel Core i5 M520 @ 2.4 GHz has processed 8% in 3 hours, which equates to 4.5 hours per checkpoint.
Mike |
||
|
|
Mike.Gibson
Ace Cruncher England Joined: Aug 23, 2007 Post Count: 12594 Status: Offline Project Badges:
|
My first checkpoint was actually after 3 hours 55 minutes, so it may be speeding up.
Mike |
||
|
|
[VENETO] boboviz
Senior Cruncher Joined: Aug 17, 2008 Post Count: 184 Status: Offline Project Badges:
|
"It might take up to 3 hours per checkpoint. Did you need to restart your machine? Could you have hibernated it instead? That way it keeps what has been done."
My first restart was after 6 hours. So, no, the checkpoint is not reliable. |
||
|
|
[VENETO] boboviz
Senior Cruncher Joined: Aug 17, 2008 Post Count: 184 Status: Offline Project Badges:
|
"What I sometimes do with apps like this (hey, there's apps that NEVER are able to checkpoint) is run them inside a VM (virtual machine). If you installed the VBOX version of BOINC, you already have VBOX installed on your computer and you can use that."
I'm crunching nanoHUB and LHC, whose apps use VirtualBox automatically to solve this problem. So I don't want to create a VM for WCG. |
||
|
|
rbotterb
Senior Cruncher United States Joined: Jul 21, 2005 Post Count: 401 Status: Offline Project Badges:
|
Well, I restarted my laptop after four hours to see what happened to the APU WU I was running. It went back to zero. So I guess I'll have to plan on running these WUs only M-F, when I happen to be on for longer periods of time; otherwise there is no way I'll get any work done. Such is life. The WU I'm running is already a _1, so I'm thinking there are more than a few of these WUs which will not run to completion in a 7-day period. Time will tell if enough reruns of work get done before some extra programming gets done to tighten up the checkpoints. I realize some of this is a resource issue for programming and possibly a financial one for the project, so we'll all just have to deal with it as best we can going forward.
|
||
|
|
|