Thread Status: Active
Total posts in this thread: 30
[VENETO] boboviz
Senior Cruncher
Joined: Aug 17, 2008
Post Count: 184
Status: Offline
Checkpoint? No checkpoint

I had to restart my machine twice, and both times I lost my calculation and the task restarted from 0%.
Not so good.
[Nov 3, 2019 8:39:46 AM]
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1403
Status: Offline
Re: Checkpoint? No checkpoint

Checkpoints for the African Rainfall Project are set at every 12.5% of progress due to limitations in the application.
Although we were told that 48 hours of data are processed in one task, in fact a task covers 8 main meteorological hours: 0000 UTC, 0600 UTC, 1200 UTC and 1800 UTC, each of them twice.
After the calculation of one main hour of data has finished, the application is able to write a checkpoint.
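
As a rough illustration of the schedule described above (a sketch under assumptions, not the project's actual code): if a checkpoint is written after each of the 8 main hours, then a task that is interrupted falls back to the last completed multiple of 12.5%. The function names are made up for this example.

```python
import math

CHECKPOINTS_PER_TASK = 8                         # 8 main meteorological hours per task
CHECKPOINT_STEP = 100.0 / CHECKPOINTS_PER_TASK   # 12.5% of progress per checkpoint


def checkpoint_marks():
    """Progress values (in %) at which a checkpoint is expected."""
    return [CHECKPOINT_STEP * i for i in range(1, CHECKPOINTS_PER_TASK + 1)]


def resume_progress(progress_percent):
    """Progress (in %) a task would fall back to if it were interrupted now.

    Assumption: the only saved state is the most recent completed checkpoint.
    """
    completed = math.floor(progress_percent / CHECKPOINT_STEP)
    return completed * CHECKPOINT_STEP


print(checkpoint_marks())
print(resume_progress(8.0))    # interrupted before the first checkpoint
print(resume_progress(30.0))   # interrupted between the 2nd and 3rd checkpoint
```

Under these assumptions a task stopped at 8% starts again from 0%, while one stopped at 30% resumes from 25%.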
[Nov 3, 2019 9:50:31 AM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12594
Status: Offline
Re: Checkpoint? No checkpoint

It might take up to 3 hours per checkpoint. Did you need to restart your machine? Could you have hibernated it instead? That way it holds what has been done.
Mike
[Nov 3, 2019 11:22:02 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Checkpoint? No checkpoint

Some censored thoughts, but you may set the <checkpoint_debug>1</checkpoint_debug> flag
in cc_config.xml to see the checkpoints printed in the event log. Every 12.5% of progress is equal to 6 hours of weather simulation.
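
For anyone unsure where that flag goes: in BOINC's cc_config.xml, logging flags sit inside a <log_flags> element under the <cc_config> root. The following is a minimal sketch that only builds and prints such a file; the function name is made up for the example, and it is an assumption that you have no other settings in cc_config.xml that need to be kept.

```python
import xml.etree.ElementTree as ET


def build_cc_config(checkpoint_debug=True):
    """Return the text of a minimal cc_config.xml with checkpoint_debug set."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    flag = ET.SubElement(log_flags, "checkpoint_debug")
    flag.text = "1" if checkpoint_debug else "0"
    # Pretty-print where available (ET.indent was added in Python 3.9).
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Print only; copying the result into the BOINC data directory is left to you,
# because overwriting an existing cc_config.xml would discard its other settings.
print(build_cc_config())
```

If you already have a cc_config.xml, it is safer to add the flag by hand inside the existing <log_flags> section. The client then has to re-read its config files (or be restarted) before the checkpoint messages show up in the event log.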
[Nov 3, 2019 11:35:54 AM]
Michael Goetz
Cruncher
United States
Joined: Dec 11, 2017
Post Count: 35
Status: Offline
Re: Checkpoint? No checkpoint

The infrequent checkpoints can be very inconvenient, but there's a workaround available if you want to do a little one-time setup work first. Sometimes you do have to reboot, for whatever reason. Stuff happens.

What I sometimes do with apps like this (hey, there are apps that are NEVER able to checkpoint) is run them inside a VM (virtual machine). If you installed the VBOX version of BOINC, you already have VBOX installed on your computer and you can use that. If not, you can install either VBOX or VMware Player separately. Both of them are free.

If you need to reboot your (real) computer, just "save" the VM before rebooting. This is the equivalent of "hibernate" -- when you start the VM back up, it picks up exactly where it left off.
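
With VirtualBox, the "save" step can also be scripted through the VBoxManage command-line tool. The following is a minimal sketch, assuming VirtualBox is installed, VBoxManage is on the PATH, and the VM is called "boinc-cruncher" (a hypothetical name). VMware Player works differently and is not covered here.

```python
import subprocess

VM_NAME = "boinc-cruncher"  # hypothetical VM name; replace with your own


def savestate_command(vm_name):
    """Command that saves the running VM's state to disk and stops it."""
    return ["VBoxManage", "controlvm", vm_name, "savestate"]


def start_command(vm_name):
    """Command that starts the VM again without a window, resuming a saved state."""
    return ["VBoxManage", "startvm", vm_name, "--type", "headless"]


def run(command):
    """Run a VBoxManage command and raise an error if it fails."""
    subprocess.run(command, check=True)


# The commands are only printed here; call run(...) to actually execute them.
print(" ".join(savestate_command(VM_NAME)))
print(" ".join(start_command(VM_NAME)))
```

Run the savestate command before rebooting the host and the start command afterwards.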
[Nov 3, 2019 11:45:06 AM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12594
Status: Offline
Re: Checkpoint? No checkpoint

My Intel Core i5 M520 @ 2.4 GHz has processed 8% in 3 hours, which equates to 4.5 hours per checkpoint.
Mike
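
The estimate above is a simple proportion, and it can be sketched as follows. This assumes progress is linear in time, which may not hold exactly, and the function names are made up for the example. Taking the displayed 8% literally gives slightly more than 4.5 hours; the difference is presumably down to rounding in the displayed progress.

```python
CHECKPOINT_STEP = 12.5  # percent of progress between checkpoints


def hours_per_checkpoint(progress_percent, elapsed_hours):
    """Estimated hours between checkpoints, assuming linear progress."""
    if progress_percent <= 0:
        raise ValueError("progress must be positive")
    return elapsed_hours * CHECKPOINT_STEP / progress_percent


def total_hours(progress_percent, elapsed_hours):
    """Estimated hours for the whole task, under the same assumption."""
    if progress_percent <= 0:
        raise ValueError("progress must be positive")
    return elapsed_hours * 100.0 / progress_percent


print(hours_per_checkpoint(8.0, 3.0))  # 8% done after 3 hours
print(total_hours(8.0, 3.0))
```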
[Nov 3, 2019 12:37:26 PM]
Mike.Gibson
Ace Cruncher
England
Joined: Aug 23, 2007
Post Count: 12594
Status: Offline
Re: Checkpoint? No checkpoint

My first checkpoint was actually after 3 hours 55 minutes, so it may be speeding up.

Mike
[Nov 3, 2019 2:53:33 PM]
[VENETO] boboviz
Senior Cruncher
Joined: Aug 17, 2008
Post Count: 184
Status: Offline
Re: Checkpoint? No checkpoint

"It might take up to 3 hours per checkpoint. Did you need to restart your machine? Could you have hibernated it instead? That way it holds what has been done?"

My first restart was after 6 hours.
So, no, the checkpoint is not reliable.
[Nov 3, 2019 4:15:07 PM]
[VENETO] boboviz
Senior Cruncher
Joined: Aug 17, 2008
Post Count: 184
Status: Offline
Re: Checkpoint? No checkpoint

"What I sometimes do with apps like this (hey, there's apps that NEVER are able to checkpoint) is run them inside a VM (virtual machine). If you installed the VBOX version of BOINC, you already have VBOX installed on your computer and you can use that."

I'm crunching nanohub and lhc, which use VirtualBox automatically (with their apps) to solve this problem.
So I don't want to create a VM for WCG.
[Nov 3, 2019 4:19:01 PM]
rbotterb
Senior Cruncher
United States
Joined: Jul 21, 2005
Post Count: 401
Status: Offline
Re: Checkpoint? No checkpoint

Well, I restarted my laptop after four hours to see what happened to the APU WU I was running. It went back to zero. So I guess I'll have to plan on running these WUs only on M-F, when I happen to be on for longer periods of time; otherwise there is no way I'll get any work done. Such is life. The WU I'm running is already a _1, so I'm thinking there are more than a few of these WUs which will not run to completion in a 7 day period. Time will tell if enough reruns of work get done before some extra programming gets done to tighten up the checkpoints. I realize some of this is a resource issue for programming and possibly a financial one for the project, so we'll all just have to deal with it as best we can going forward.
[Nov 3, 2019 7:08:24 PM]