| World Community Grid Forums
Thread Status: Active | Total posts in this thread: 23
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I've just started this project; I used to do others.
But are you saying that there isn't a set length of time between save points, and that there could be a 5 hour gap between them? That's stupid. It should be like the others, with set times or set percentages. There was even one you could set to save itself every 3 minutes for slow, unreliable PCs. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello matthewkpitts,
Traditionally, a check point does not consist of a snapshot of the process (many hundreds of megabytes), but just a small amount of data to be reloaded with an address to start at. The vast explosion of storage space may change that, but for the moment we have to look for a place in the program where only a few variables and arrays need to be stored. But the programs that have been handed to us to board for grid projects have very few such points. Lawrence |
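
To illustrate the idea of a checkpoint as "a small amount of data to be reloaded with an address to start at", here is a minimal sketch. It is not the WCG or AutoDock code; the function name `run_with_checkpoints`, the JSON file layout, and the idea that each item is an independent unit of work are all assumptions made for the example.

```python
import json
import os

def run_with_checkpoints(items, work, path):
    """Process items in order, saving a small checkpoint after each one.

    Hypothetical helper: the checkpoint holds only the results so far
    and the index to resume at, not a snapshot of the whole process.
    """
    # Resume from the checkpoint if one exists; otherwise start fresh.
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)
    else:
        state = {"next_index": 0, "results": []}

    for i in range(state["next_index"], len(items)):
        state["results"].append(work(items[i]))
        state["next_index"] = i + 1
        # Write to a temporary file and rename it, so a crash in the
        # middle of a write never leaves a half-written checkpoint.
        tmp = path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(state, f)
        os.replace(tmp, path)
    return state["results"]
```

In this sketch the only safe place to save is between items. If one item takes five hours, there is a five hour gap between save points, which is the situation described in this thread.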
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
I'm not sure it's a trend, but following my post a bit further up, I got it again, as has also been reported by others.
I'm seeing the same now, by coincidence, on the 'i' screen of a FAAH WU on UD. It progresses slowly from 47.7 to 50.0%, then skips back to 47.7. It has now cycled through that 4 times in the last 2 hours. Unless I get a quick reply, I'm sending this one to the eternal hunting grounds, in the prescribed manner. Device is 160465, FAAH v 4.0.3.4, WU download approx. 01:21am UTC. This time I killed it at 01:41 UTC on 7.20.06. This afternoon (7.19.06) I properly 'snoozed' this WU and properly closed the UD Agent by right-clicking and choosing exit, so that I could boot. When it resumed, it started off right where the initial docking was done, about 20/25 minutes into the crunch, so I lost 5 hours there, plus it has been going around since, from about 4.5 hours through 12.5 hours of clock time. The UD program dir still has a copy of that original crunch file of 29 MB, if interested? Thx
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hi Sekerob,
Well, I finally got one of those FAAH units on my UD client. The check point was at 50% and it took over an hour of failed attempts to get from 47% to 50%. The progress and graph kept falling back when the program realized that an attempt was failing. So we are working through a group of work units that are unusually tough. But AutoDock did it without any errors or endless loops, unlike HPF2. So, while it is frustrating not to have lots of check points, just try to let the computer run as long as possible. And set the throttle as close to 100% as you dare. Lawrence |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Hi Lawrence,
Can you confirm, then, that cycling from 47.7 to 50.0% and back and up eventually made your copy get past that 50.0 hurdle, but in 1 hour or so, and not the 8 hours my copy used, or worse, as has been reported in the posts above? So how often should it try, given that it takes about 20 minutes to do this 2.3% cycle on my machine? I don't see the line graph rescaling, I just see the green line retreating; the initial descent on the left-hand side of the curve remains static. Meantime it's sodding evil that, as also reported by others, the WU returns to this post-init 0.0% state on proper pausing/closing. It should come with a "do not interrupt, only good for 24/7 zero-boot machines" warning. Wikkel reported 4 hours before passing the 50.0 hurdle, so what's the maximum wait? Science can't be pushed, but how often does one inject the patient before it dies? It's curious that it's on the UD agent; I never observed it on BOINC. UD credits in clock time, irrespective of how long the WU was 'allowed' to run with accumulated checkpoints, which are seemingly not stored on this particular hefty batch.
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Sekerob,
We can see it happening on UD because the graphics give us a better idea of what is going on. The same work unit on BOINC would also reset to 0 at exit, but the user would likely shrug or even be unaware. Rick Alther posted about this situation months ago. Unlike HPF2, patience tends to be rewarded by FAAH. But it is an unusual situation. Lawrence |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Because of having practically no save points I've lost over 10 hours of work. I can't see why they don't just put checkpoints in, like every other project like this that I have done. Most people have over 10 GB of HDD space, so space isn't a factor. It really annoys me to think that all my work is wasted, and it almost makes me want to look for another project.
Advice: make checkpoints every 15 minutes so not a lot of work is wasted. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
So far I haven't had a save, and my PC was processing for over 7 hours in that one section.
|
||
|
|
Alther
Former World Community Grid Tech United States of America Joined: Sep 30, 2004 Post Count: 414 Status: Offline Project Badges:
|
Quote:
Hi Sekerob, Well, I finally got one of those FAAH units on my UD client. The check point was at 50% and it took over an hour of failed attempts to get from 47% to 50%. The progress and graph kept falling back when the program realized that an attempt was failing. So we are working through a group of work units that are unusually tough. But AutoDock did it without any errors or endless loops, unlike HPF2. So, while it is frustrating not to have lots of check points, just try to let the computer run as long as possible. And set the throttle as close to 100% as you dare. Lawrence

Yes, this is an unusual workunit. I myself received it last night and watched it oscillate for a while. The docking eventually completed and continued on to the next docking. There is nothing wrong with the workunit other than it's very difficult. But that's the whole point of the grid: to use this power to solve the difficult problems. Not everything is simple and predictable.

The oscillation of the percent is explained by the fact that we make a guess as to how long a docking will take. If the docking is not finished as predicted, we recalculate the docking length, and so on. Every recalculation causes the percent to drop a little bit until it passes (which it will eventually). The opposite is also true: many times a docking finishes before the estimate and the percent jumps up... but you don't hear people complaining about that.

It's good to be vigilant for strange behavior, but people also need to be a bit more patient.
Rick Alther
Former World Community Grid Developer |
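
The falling-back percentage described above can be reproduced with a toy model. This is only a sketch of the general idea (progress shown as elapsed time over an estimated length, with the estimate stretched whenever it is missed). The function name `simulate_docking`, the growth factor of 1.5, and the one-minute time step are assumptions for illustration, not values from the WCG agent.

```python
def simulate_docking(total, actual, first_guess, growth=1.5):
    """Return the percent shown at each minute of the first docking.

    total       -- number of dockings in the work unit
    actual      -- true length of this docking in minutes
    first_guess -- initial estimate of that length (made-up value)
    growth      -- factor applied to the estimate when it is missed
                   (made-up value, must be greater than 1)
    """
    estimate = float(first_guess)
    shown = []
    for minute in range(1, actual + 1):
        if minute == actual:
            # Docking finished: it now counts as one full docking.
            fraction = 1.0
        else:
            # Prediction missed: stretch the estimate, which makes
            # the displayed percent drop back.
            while minute >= estimate:
                estimate *= growth
            fraction = minute / estimate
        shown.append(round(100.0 * fraction / total, 1))
    return shown
```

Calling `simulate_docking(2, 10, 4)` gives a sequence that climbs toward 50%, drops back each time the estimate is revised, and only reaches 50.0 when the docking actually completes. The calculation itself never goes backwards in this model; only the estimate changes.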
||
|
|
Alther
Former World Community Grid Tech United States of America Joined: Sep 30, 2004 Post Count: 414 Status: Offline Project Badges:
|
Quote:
Because of having practically no save points I've lost over 10 hours of work. I can't see why they don't just put checkpoints in, like every other project like this that I have done. Most people have over 10 GB of HDD space, so space isn't a factor. It really annoys me to think that all my work is wasted, and it almost makes me want to look for another project. Advice: make checkpoints every 15 minutes so not a lot of work is wasted.

Easier said than done. It's not a matter of hard disk space. The difficulty lies in restoring the calculation to exactly the same state it was in when it last checkpointed; otherwise the result will differ. You can't just checkpoint any place/time you want. We can only checkpoint in a few areas where restoring the calculation properly can be done. Every project is different, and each algorithm lends itself to checkpointing at different intervals.

This WU checkpoints at the same spot as every other FA@H WU. It's just that this one is much more difficult. No one said every workunit would be easy. We must crunch through the difficult ones as well as the easy ones.
Rick Alther
Former World Community Grid Developer |
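
As a rough illustration of why a checkpoint must restore "exactly the same state", consider a toy randomized search. The thread does not describe AutoDock's internals, so everything here is an assumption for the example: the functions `search`, `make_checkpoint`, and `resume` are hypothetical, and the "calculation" is just keeping the lowest random number drawn.

```python
import random

def search(steps, rng, best=None):
    """Toy stochastic search: keep the lowest random draw seen so far."""
    for _ in range(steps):
        candidate = rng.random()
        if best is None or candidate < best:
            best = candidate
    return best

def make_checkpoint(rng, best, steps_done):
    # Everything the remaining steps depend on, including the
    # internal state of the random number generator.
    return {"rng_state": rng.getstate(), "best": best, "steps_done": steps_done}

def resume(checkpoint, total_steps):
    # Rebuild the generator exactly as it was at the checkpoint.
    rng = random.Random()
    rng.setstate(checkpoint["rng_state"])
    remaining = total_steps - checkpoint["steps_done"]
    return search(remaining, rng, checkpoint["best"])

# Usage: an uninterrupted run and a checkpointed run give the same answer.
full = search(1000, random.Random(42))

rng = random.Random(42)
partial = search(400, rng)
resumed = resume(make_checkpoint(rng, partial, 400), 1000)
assert resumed == full
```

If the generator state were left out of the checkpoint, the resumed run would draw different numbers and could finish with a different result. A real program has far more state than this toy, which may be one reason safe checkpoint locations are limited to a few places.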
||
|
|
|