| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 26
|
|
| Author |
|
|
niyar
Cruncher Joined: Mar 18, 2015 Post Count: 12 Status: Offline Project Badges:
|
The same problem happened last night. My tasks reset to 0% this morning, and my CPU time and CPU time since checkpoint last night were the same. Below is the log this morning for one of my tasks. Any further ideas? Thanks.
Application: Microbiome Immunity Project 7.16
Name: MIP1_00286685_7562
State: Running
Received: 2020-03-29 8:31:38 PM
Report deadline: 2020-04-08 8:31:37 PM
Estimated computation size: 20,905 GFLOPs
CPU time: 00:00:01
CPU time since checkpoint: 00:00:01
Elapsed time: 00:00:11
Estimated time remaining: 03:16:20
Fraction done: 0.000%
Virtual memory size: 42.01 MB
Working set size: 40.20 MB
Directory: slots/2
Process ID: 13720
Executable: wcgrid_mip1_rosetta_7.16_windows_intelx86 |
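A quick way to read the two CPU time values in the properties above: if "CPU time since checkpoint" equals "CPU time", nothing has been saved yet. Here is a minimal Python sketch of that comparison. The function names are made up for illustration, and it assumes the times are always shown as HH:MM:SS.

```python
def parse_hms(text: str) -> int:
    """Convert an HH:MM:SS string, as shown in the task properties, to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def checkpoint_written(cpu_time: str, cpu_time_since_checkpoint: str) -> bool:
    """Return True if some CPU time is already covered by a checkpoint.

    Assumption: "CPU time since checkpoint" counts only the work done after
    the last checkpoint, so a value smaller than "CPU time" means that at
    least one checkpoint exists.
    """
    return parse_hms(cpu_time_since_checkpoint) < parse_hms(cpu_time)


# Values taken from the task properties quoted above.
print(checkpoint_written("00:00:01", "00:00:01"))
# Hypothetical values for a task that checkpointed 90 seconds ago.
print(checkpoint_written("01:30:00", "00:01:30"))
```

With the values from the log the check comes out negative, which matches the task starting again from 0%.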
||
|
|
ca05065
Senior Cruncher Joined: Dec 4, 2007 Post Count: 328 Status: Offline Project Badges:
|
Have you tried hibernating your PC instead of shutting it down? BOINC should then restart without even going back to the previous checkpoint. As this is Windows, you can only do this for a few days before Windows ties itself in knots.
If you look in the stderr file in the slot folder, there is a parameter nstruct, which I believe is the number of structures within the work unit and hence the number of checkpoints that will be taken.
----------------------------------------
[Edit 1 times, last edit by ca05065 at Apr 1, 2020 7:09:31 PM] |
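If searching the stderr file by hand is tedious, a small Python sketch like the one below could pull out the nstruct value. The exact layout of the stderr file is not shown in this thread, so the pattern is an assumption: it looks for the word "nstruct" followed by an integer, with any separator in between. The sample line is hypothetical.

```python
import re
from typing import Optional

# Assumed pattern: "nstruct", then optional separator characters
# (space, "=", ":" and so on), then an integer.
NSTRUCT_PATTERN = re.compile(r"nstruct\W*(\d+)")


def find_nstruct(stderr_text: str) -> Optional[int]:
    """Return the first nstruct value found in the text, or None if absent."""
    match = NSTRUCT_PATTERN.search(stderr_text)
    return int(match.group(1)) if match else None


# Hypothetical line; the real stderr content may look different.
sample = "options: -nstruct 5 -run:other_flag"
print(find_nstruct(sample))
```

To use it on a real task, read the contents of the stderr file from the slot folder into a string and pass that string to `find_nstruct`.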
||
|
|
niyar
Cruncher Joined: Mar 18, 2015 Post Count: 12 Status: Offline Project Badges:
|
Yes, I hibernated my computer last night instead of shutting it off, and the tasks resumed from where they had left off. Thanks for this idea. But is there a way around the problems I have been experiencing, other than doing this?
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Use BOINCTasks to suspend tasks when they reach a checkpoint, then set Windows 10 to shut down or hibernate after it has been idle for N minutes.
https://www.faqforge.com/windows/windows-10/h...0-after-pc-has-been-idle/ |
||
|
|
Falconet
Master Cruncher Portugal Joined: Mar 9, 2009 Post Count: 3315 Status: Offline Project Badges:
|
I wonder if a project reset would be worth trying?
----------------------------------------
- AMD Ryzen 5 1600AF 6C/12T 3.2 GHz - 85W
- AMD Ryzen 5 2500U 4C/8T 2.0 GHz - 28W
- AMD Ryzen 7 7730U 8C/16T 3.0 GHz |
||
|
|
ca05065
Senior Cruncher Joined: Dec 4, 2007 Post Count: 328 Status: Offline Project Badges:
|
@niyar
It is quite rare for MIP to process only one structure in a work unit. It used to process 5 or 6, and I can remember up to 15. However, I have not run MIP for a few months, so it could have changed. If you did not find the 'nstruct' parameter in the stderr file within the corresponding slot folder, the contents of the stderr file can be seen on the WCG website for about 2 days after completion (if you are lucky). To find it, go to: main screen -> My Contribution -> Results Status -> Status column, and select the status (valid/pending validation/error) to see what was written to stderr during execution. |
||
|
|
niyar
Cruncher Joined: Mar 18, 2015 Post Count: 12 Status: Offline Project Badges:
|
Thanks. All of my 'Results Status' on the WCG webpage show my results over the past week as either "Valid" or "In progress." Nothing appears as 'pending' or 'error'. Also, in my Computing Preferences settings on my WCG software, the setting has always been set to "Request tasks to checkpoint at most every 60 seconds." So, even when I shut down my computer for the night, I don't see why the tasks would reset to 0% if they are supposed to checkpoint every 60 seconds.
Thanks. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"AT MOST!" So if there is no checkpoint, the clock just starts ticking again and again. You can NOT force a checkpoint if the program has not finished a simulation step.
So, to repeat the suggestion: let your computer run until you've ascertained that a checkpoint has been written, i.e. until the CPU time since the last checkpoint is less than the total CPU time. How to do this automatically, and then suspend the tasks and shut down the computer, is described in an earlier post.
----------------------------------------
[Edit 2 times, last edit by Former Member at Apr 3, 2020 5:14:03 PM] |
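To illustrate the "at most" point, here is a rough Python sketch of a simplified model. It is not the actual BOINC or Rosetta logic. It assumes a checkpoint can only be written at the moment a simulation step finishes, and only if at least the requested interval has passed since the previous checkpoint (or since the start). The function names and step times are hypothetical.

```python
from typing import List


def checkpoint_times(step_end_times: List[int], min_interval: int) -> List[int]:
    """Return the times (in seconds) at which a checkpoint is written.

    Simplified model: the application may checkpoint only when a step
    finishes, and only if min_interval seconds have elapsed since the
    last checkpoint or since the start of the task.
    """
    written = []
    last = 0
    for end_time in step_end_times:
        if end_time - last >= min_interval:
            written.append(end_time)
            last = end_time
    return written


def saved_progress(shutdown_time: int, step_end_times: List[int], min_interval: int) -> int:
    """Return the time of the latest checkpoint at or before shutdown, or 0 if none."""
    earlier = [t for t in checkpoint_times(step_end_times, min_interval) if t <= shutdown_time]
    return earlier[-1] if earlier else 0


# Hypothetical task whose steps finish after 40, 80 and 120 minutes.
steps = [2400, 4800, 7200]
print(checkpoint_times(steps, 60))      # a checkpoint at each step end
print(saved_progress(2000, steps, 60))  # shut down before the first step ends
```

In this model a 60-second setting never produces a checkpoint before the first step completes, so shutting down earlier than that loses all progress and the task restarts from 0%.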
||
|
|
ca05065
Senior Cruncher Joined: Dec 4, 2007 Post Count: 328 Status: Offline Project Badges:
|
If you click on the word 'valid' in the Status column the contents of the stderr file will be displayed and you should be able to see the nstruct value.
|
||
|
|
TOMinAZ
Cruncher United States Joined: Feb 11, 2007 Post Count: 40 Status: Offline Project Badges:
|
I'm seeing the same thing for the past few days.
----------------------------------------
[Edit 1 times, last edit by TOMinAZ at Apr 4, 2020 9:33:05 PM] |
||
|
|
|