World Community Grid Forums
Category: Retired Forums Forum: UD Windows Agent Support [Read Only] Thread: Isn't there a way of increasing the frequency of project saves by the user? |
Thread Status: Active Total posts in this thread: 13
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Whenever I shut down my PC I lose project time. For three days I was always starting over at 40% or so, for example, because the UD client simply does not save the project's work often enough. Isn't there a way of increasing the save frequency? It has become counterproductive to continue with WCG in this way.
BZ. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Perhaps until this problem is fixed, you might change your project preferences to only include the HPF project. HPF uses Rosetta, and I understand it checkpoints much, much more frequently than AutoDock.
That should fix the problem until (if) it is possible to increase the AutoDock checkpointing frequency. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Checkpoints are part of the science application, not the client. AutoDock has relatively few checkpoints. It checkpoints when the green line in the Application View reaches the edge of the screen and starts redrawing. HPF checkpoints whenever the Progress indicator advances by several tenths of a percent.
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
mycroft...
I am tempted to recommend that bzag0 use the UDMonitor software, but I am not sure whether it would help more than it would hurt, given the new problems it might introduce. For one, I am not aware of any way in UDMonitor to force saves at user-specified intervals. A user may have to settle for whatever the save interval turns out to be and then use UDMonitor to do the shutdown: right-click the UDMonitor icon in the system tray and choose After next save > Shutdown. As for the save intervals seen with UDMonitor, I have noticed that, broadly, the longer the completed runtime, the shorter the save interval. A work unit that completes after 10 hours typically saves every 6 minutes or so, while one that completes after 4 hours typically saves every hour or so. Let me know what you think, and perhaps we can then recommend this to bzag0 as an interim solution. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hi andzgrid,
The problem is that the application program is normally in a state with tens of megabytes of floating point arrays in use. Every once in a while it converges to a solution, and then a small checkpoint can be saved. Then it starts another round of computations. Unlike Rosetta, AutoDock just does not seem to have many natural checkpoints. Rick Alther looked into this in November and gave us an updated version of AutoDock in January, but he was not able to solve the dearth of checkpoints. |
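As a rough illustration of the "natural checkpoint" idea described above (not AutoDock's actual code), a job made of independent rounds can write a small file each time a round converges, and on restart skip the rounds already recorded. The names `solve_round`, `run_job`, and the JSON checkpoint file are hypothetical, and the `stop_after` parameter only simulates a shutdown.

```python
import json
import os
import tempfile


def solve_round(index):
    """Stand-in for one expensive round that converges to a small result."""
    total = 0.0
    for k in range(1, 10_000):
        total += ((index + 1) * k) % 7 / k
    return round(total, 6)


def load_checkpoint(path):
    """Return the list of results saved so far, or an empty list."""
    if not os.path.exists(path):
        return []
    with open(path) as fh:
        return json.load(fh)


def save_checkpoint(path, results):
    """Write to a temporary file, then rename it over the old checkpoint,
    so a crash mid-write cannot leave a half-written file behind."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        json.dump(results, fh)
    os.replace(tmp, path)


def run_job(path, rounds, stop_after=None):
    """Run the remaining rounds; stop_after simulates a shutdown."""
    results = load_checkpoint(path)
    done_this_session = 0
    for index in range(len(results), rounds):
        if stop_after is not None and done_this_session >= stop_after:
            break
        results.append(solve_round(index))
        # The only state worth saving here is the short list of results.
        save_checkpoint(path, results)
        done_this_session += 1
    return results
```

With this layout, a shutdown costs at most the round in progress. The catch the thread describes is that the rounds themselves are long, so the gaps between such cheap save points are long too.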
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Perhaps... with a little more development effort, it might be possible to save state (at a large diskspace cost) when the application is closed down normally. Obviously, it's not a hugely high priority issue. Is AutoDock open source?
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
lawrencehardin...
I could not have said it better myself... just kidding. Seriously now, I don't know, but is there something we can learn from how Windows does hibernation? My gut feeling is that there may be something there we can start from, or some variation of it. And how does Rosetta come to have comparatively more 'natural' checkpoints than AutoDock? Dearth of checkpoints... Hmmm... This is a wild shot, but here goes: artificial checkpoints? |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Rosetta has many points where the current state can be saved in a fraction of a megabyte. AutoDock has a few points like that. I suspect that AutoDock would normally need to write 40-70 MB to save its state. Rick Alther looked at the program and could not find more points to make into small check points.
|
||
|
Alther
Former World Community Grid Tech United States of America Joined: Sep 30, 2004 Post Count: 414 Status: Offline Project Badges: |
The problem with checkpointing isn't identifying and writing all the data to disk. The hard part is the restoration process: figuring out how to read in the data and set up the data structures at the proper points so that the typically stochastic algorithms resume precisely where they left off. These are very complex calculations with millions of data points, where data in one part of the calculation affects other parts later on, and so forth.
We choose a location (or locations) in the program where the state is easily known and, more importantly, is easy to resume from should we need to do so. This is on our plate to look into, but I have a five-course meal of tasks sitting in front of me.
To answer your question about using something similar to Windows hibernation: yes, there are grid systems that use a similar technique. In some grid systems the client application runs inside a virtual machine (think VMware). In that case, there is no need for checkpointing, since the grid agent can simply take a snapshot of the virtual machine state and resume right from there. The downside is that this requires virtual machine software on every client, and running a virtual machine adds to the RAM/VM overhead. While this is currently impractical for a volunteer grid of devices, it may not be so in the future.
Rick Alther
Former World Community Grid Developer |
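To illustrate the restoration point made above, here is a minimal sketch, assuming a toy random search rather than anything AutoDock does. It shows that a stochastic loop resumes exactly where it left off only if the random generator's state is saved along with the data. The function `random_search` and the snapshot layout are hypothetical. `random.Random.getstate()` and `setstate()` are standard Python.

```python
import random


def random_search(steps, seed=0, resume_from=None, stop_at=None):
    """Toy stochastic minimiser of f(x) = (x - 3)^2.

    Returns (best_x, snapshot). The snapshot holds everything needed
    to resume: step counter, current best, and the generator state.
    """
    rng = random.Random(seed)
    step, best_x = 0, 0.0
    if resume_from is not None:
        step = resume_from["step"]
        best_x = resume_from["best_x"]
        rng.setstate(resume_from["rng_state"])  # restore the random stream

    end = steps if stop_at is None else min(stop_at, steps)
    while step < end:
        candidate = best_x + rng.uniform(-1.0, 1.0)
        if (candidate - 3.0) ** 2 < (best_x - 3.0) ** 2:
            best_x = candidate
        step += 1

    snapshot = {"step": step, "best_x": best_x, "rng_state": rng.getstate()}
    return best_x, snapshot


# One uninterrupted run versus a run stopped at step 400 and resumed.
uninterrupted, _ = random_search(1000)
_, saved = random_search(1000, stop_at=400)
resumed, _ = random_search(1000, resume_from=saved)
```

Here the state is three small values, so saving and restoring is trivial. In a real docking code the equivalent state is spread across many large data structures, which is where the difficulty described in the post comes from.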
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Alther...
"The downside to this is it requires virtual machine software on every client and running a virtual machine adds to the RAM/VM overhead."
Not to overlook performance as another item in the overhead. That is probably the killer here, since memory has little impact now that gigabyte-sized RAM and near-terabyte hard disks are approaching mainstream. So it looks like we will have to make do with whatever checkpoint opportunities come our way, and hope to be handed a 'lucky protein' that affords many checkpoints. I'll be around, crunching. |
||
|
|