World Community Grid Forums
Category: Beta Testing Forum: Beta Test Support Forum Thread: OpenPandemics - COVID 19 Beta Test April 20, 2020 [ Issues Thread ] |
Thread Status: Active Total posts in this thread: 303
BladeD
Ace Cruncher USA Joined: Nov 17, 2004 Post Count: 28976 Status: Offline Project Badges: |
uplinger wrote: "There is beta in the feeder now. Thanks, -Uplinger"
I would say so! Six running on my main PC at the moment! |
||
|
Grumpy Swede
Master Cruncher Svíþjóð Joined: Apr 10, 2020 Post Count: 2068 Status: Offline Project Badges: |
I got 9 of them, one already done. No problems that I can see.
|
||
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline Project Badges: |
Just a little bit over half a batch has been sent out so far. Should be 4 total batches. So plenty to grab if you're hunting for them :)
Remember to try to force them to restart and restore from a checkpoint. Thanks again to everyone helping test! -Uplinger |
||
|
Speedy51
Veteran Cruncher New Zealand Joined: Nov 4, 2005 Post Count: 1259 Status: Offline Project Badges: |
Thanks uplinger, I gather it is normal behaviour for all of the checkpoint files to have the same timestamp when a new checkpoint is written? I gather if a docking has completed within less than 10 minutes (what my preferences are set to) a checkpoint will not be written until after 10 minutes has passed?
||
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline Project Badges: |
Speedy51 wrote: "Thanks uplinger, I gather it is normal behaviour for all of the checkpoint files to have the same timestamp when a new checkpoint is written? I gather if a docking has completed within less than 10 minutes (what my preferences are set to) a checkpoint will not be written until after 10 minutes has passed?"
We have a set of things that need to be saved to properly restore from a checkpoint. These are temporary files and should all be written at about the same time during a checkpoint. Most of the files you have checkpointed are the map files generated during simulation.
The way BOINC handles a checkpoint is that when we get to a certain point in the code (in this case the end of a docking), we check with BOINC whether it's OK to checkpoint. If it is, then we write the files for the checkpoint. A more detailed response for this is here: https://boinc.berkeley.edu/trac/wiki/BasicApi#checkpointing
Thanks, -Uplinger |
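The checkpoint flow described above can be sketched roughly as follows. This is only an illustration in Python, not the project's code: the real BOINC API is C/C++, where boinc_time_to_checkpoint() and boinc_checkpoint_completed() play the roles shown here. The class CheckpointGate, the function run_dockings, and the docking durations are all hypothetical stand-ins. The 600-second interval mirrors the 10-minute preference mentioned in the question.

```python
class CheckpointGate:
    """Hypothetical stand-in for the BOINC client's checkpoint throttle."""

    def __init__(self, min_interval_s):
        self.min_interval_s = min_interval_s
        self.last_checkpoint_s = 0.0

    def time_to_checkpoint(self, now_s):
        # Plays the role of boinc_time_to_checkpoint() in the real C/C++ API.
        return now_s - self.last_checkpoint_s >= self.min_interval_s

    def checkpoint_completed(self, now_s):
        # Plays the role of boinc_checkpoint_completed().
        self.last_checkpoint_s = now_s


def run_dockings(docking_durations_s, gate):
    """Return (docking_index, elapsed_s) for each checkpoint that gets written."""
    written = []
    elapsed = 0.0
    for index, duration in enumerate(docking_durations_s, start=1):
        elapsed += duration  # one docking finishes
        if gate.time_to_checkpoint(elapsed):
            # All checkpoint files would be written here in one go,
            # which is why they share roughly the same timestamp.
            written.append((index, elapsed))
            gate.checkpoint_completed(elapsed)
    return written


# Five dockings of 4 minutes each, with a 10-minute checkpoint preference.
checkpoints = run_dockings([240] * 5, CheckpointGate(min_interval_s=600))
print(checkpoints)
```

In this sketch the first two dockings finish before 10 minutes have passed, so nothing is written until the end of the third docking, which matches the behaviour asked about.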
||
|
Aurum
Master Cruncher The Great Basin Joined: Dec 24, 2017 Post Count: 2384 Status: Offline Project Badges: |
I suspended and restarted the 3 betas I've gotten so far and they all restarted very close to the elapsed time at which they were suspended. Exactly what should we do to "force them to restart"? Exit and relaunch BOINC, or will suspending and resuming the task suffice?
----------------------------------------...KRI please cancel all shadow-banning |
||
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline Project Badges: |
There are 3 ways to fully force a restart:
1. Have BOINC remove the application from memory when it is suspended via the BOINC client. This is a setting you can set on the website or in the client. (You should be able to see it removed from your task manager or ps.)
2. Exit BOINC. This means you need to check that the application boinc.exe isn't still running on your computer. BOINC consists of two main parts: the BOINC client (boinc.exe) and the manager (boincmgr.exe) that gives you the GUI. Just closing the manager doesn't close the client that controls the science applications.
3. A surefire way to test is to *gasps* reboot the machine running BOINC. This will make sure everything is fully out of memory.
Thank you for the help in testing! -Uplinger
[edit rewritten to make more sense -Uplinger]
----------------------------------------
[Edit 1 times, last edit by uplinger at May 6, 2020 11:29:49 PM] |
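For option 2, the check that the client really has stopped can be sketched like this. It is a minimal illustration, assuming you already have a list of process names copied from Task Manager or ps. The function names boinc_status and client_stopped are hypothetical, and the names without ".exe" are an assumption for non-Windows systems.

```python
# Process names from the post; the variants without ".exe" are an assumption.
CLIENT_NAMES = {"boinc.exe", "boinc"}
MANAGER_NAMES = {"boincmgr.exe", "boincmgr"}


def boinc_status(process_names):
    """Report which of the two BOINC parts appear in a process listing."""
    names = {name.strip().lower() for name in process_names}
    return {
        "client_running": bool(names & CLIENT_NAMES),
        "manager_running": bool(names & MANAGER_NAMES),
    }


def client_stopped(process_names):
    """True only when the client (not just the manager) is gone."""
    return not boinc_status(process_names)["client_running"]


# Manager closed, but the client is still there: not a forced restart yet.
sample = ["explorer.exe", "boinc.exe", "chrome.exe"]
print(boinc_status(sample))
print(client_stopped(sample))
```

The point of keeping the two name sets separate is the one made in the post: closing the manager alone leaves the client, and therefore the science application, running.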
||
|
Aurum
Master Cruncher The Great Basin Joined: Dec 24, 2017 Post Count: 2384 Status: Offline Project Badges: |
Thx for the quick reply. I had the "leave task in memory when suspended" checked on all my computers. How many minutes should I let them run before this test?
And I'll gladly reboot without even gasping :-)
----------------------------------------...KRI please cancel all shadow-banning
[Edit 1 times, last edit by Aurum420 at May 6, 2020 11:25:30 PM] |
||
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline Project Badges: |
Aurum wrote: "Thx for the quick reply. I had the "leave task in memory when suspended" checked on all my computers. How many minutes should I let them run before this test? And I'll gladly reboot without even gasping :-)"
Made my post above a little bit more readable for others following the thread.
A majority of the results I have seen by manually inspecting them have at least 20 dockings. (Some have quite a bit more; I saw some with over 5000 for a 3-hour work unit.) With 20 being the low end, you should be safe to force a restart after 30 minutes. This would be past the maximum of BOINC's checkpoint throttling (17 minutes), as mentioned previously by lavaflow.
Also, a single forced restart is plenty, but if you want to give our validation tools more work, then forcing a restart more than once in a run is great too.
Thanks, -Uplinger |
||
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline Project Badges: |
Aurum420,
A few more things. I normally recommend leaving the application in memory. This way, if the work unit is suspended for a few minutes, you don't lose computational time, and it starts back up without reading the checkpoint files because everything is already loaded in memory.
We have fully sent out batch 333. Batch 334 is starting to go out now.
Thanks, -Uplinger |
||
|
|