| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 99
|
|
| Author |
|
|
vepaul
Senior Cruncher Belgium Joined: Nov 17, 2004 Post Count: 261 Status: Offline Project Badges:
|
Mine are mostly OK:
BETA_ OET1_ 0000298_ xEBGP-FA_ rig_ 1531_ 1-- Bureau2-HP Pending Validation 7/01/15 23:11:46 8/01/15 03:21:42 1,34 / 1,34 37,1 / 0,0
BETA_ OET1_ 0000298_ xEBGP-FA_ rig_ 0023_ 1-- Bureau2-HP Valid 7/01/15 23:09:37 8/01/15 03:21:42 1,86 / 1,87 51,6 / 58,2
BETA_ OET1_ 0000297_ xZAGP_ 0138_ 0-- Bureau2-HP Valid 7/01/15 22:56:29 8/01/15 03:21:42 1,81 / 1,82 50,3 / 39,2
BETA_ OET1_ 0000297_ xZAGP_ 0839_ 0-- Bureau2-HP Valid 7/01/15 22:51:21 8/01/15 03:21:42 2,29 / 2,30 63,5 / 49,4
BETA_ OET1_ 0000296_ xZAGP_ 0548_ 0-- Bureau2-HP Pending Validation 7/01/15 22:44:14 8/01/15 03:21:42 2,01 / 2,02 55,9 / 0,0
BETA_ OET1_ 0000296_ xZAGP_ 1106_ 1-- paul-HP2 Valid 7/01/15 22:41:27 8/01/15 02:41:50 2,16 / 2,19 64,8 / 60,2
BETA_ OET1_ 0000296_ xZAGP_ 0019_ 1-- paul-HP2 Pending Validation 7/01/15 22:39:19 8/01/15 02:41:50 1,95 / 1,97 58,4 / 0,0
BETA_ OET1_ 0000295_ xZAGP_ 0023_ 1-- paul-HP2 Valid 7/01/15 22:09:50 8/01/15 02:41:50 2,32 / 2,34 69,5 / 67,8 |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
One last task from batch 296 was at 1:23 hours of CPU time, with the properties indicating that the previous checkpoint was 5 minutes earlier. After suspending and resuming, the CPU time fell back to 1:18, which is consistent with that checkpoint, and it then accumulated time properly.
|
||
|
|
Falconet
Master Cruncher Portugal Joined: Mar 9, 2009 Post Count: 3315 Status: Offline Project Badges:
|
My 298 task kept running and the progress percentage had increased to about 23% the last time I looked. Now it has gone back to 20.000% and it hasn't checkpointed in over 45 minutes of CPU time.
----------------------------------------
- AMD Ryzen 5 1600AF 6C/12T 3.2 GHz - 85W
- AMD Ryzen 5 2500U 4C/8T 2.0 GHz - 28W
- AMD Ryzen 7 7730U 8C/16T 3.0 GHz |
||
|
|
Falconet
Master Cruncher Portugal Joined: Mar 9, 2009 Post Count: 3315 Status: Offline Project Badges:
|
Okay, it made a checkpoint at 45 minutes and 31 seconds of CPU time.
Percentage is still the same. I suspended it without LAIM and the percentage is the same, but the CPU time resumed from the checkpoint, so I guess it could be a good sign. |
||
|
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline Project Badges:
|
I had one 00298 work unit. After the sequence LAIM off, suspend, "removed from memory" message, resume, running, I noticed the following: Properties showed CPU time at last checkpoint as 1 hour 48 minutes, but the stderr file shows zero CPU time at restart:

Result Log
Result Name: BETA_ OET1_ 0000298_ xEBGP-FA_ rig_ 1327_ 1--
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[23:41:20] Number of tasks = 1
[23:41:20] Starting task 0,CPU time is 0.000000
[23:41:20] ./ZINC13130211_1.pdbqt size = 24 4 ../../projects/www.worldcommunitygrid.org/beta20.xEBGP-FA_rig.pdbqt size = 2451 0
[00:18:01] Number of tasks = 1
[00:18:01] Starting task 0,CPU time is 0.000000
[00:18:01] ./ZINC13130211_1.pdbqt size = 24 4 ../../projects/www.worldcommunitygrid.org/beta20.xEBGP-FA_rig.pdbqt size = 2451 0
[10:34:12] Number of tasks = 1
[10:34:12] Starting task 0,CPU time is 0.000000
[10:34:12] ./ZINC13130211_1.pdbqt size = 24 4 ../../projects/www.worldcommunitygrid.org/beta20.xEBGP-FA_rig.pdbqt size = 2451 0
[11:01:36] Finished task #0 cpu time used 8109.087677
11:01:36 (192716): called boinc_finish

Note that the CPU time changes from 0 to 8109 seconds in 27 minutes (1620 seconds).

Basically this looks like a case of bad information in the stderr. The "Starting task" line always reports 0.000000 for task 0. But as you can see, it did report the proper final CPU time.

Thanks,
-Uplinger |
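For anyone who wants to repeat the arithmetic in the note above, here is a minimal sketch in Python. It only uses the two timestamps and the final CPU time from the quoted log; the helper name `wall_seconds` is made up for illustration and is not part of BOINC or the WCG tools. It assumes the two timestamps are less than 24 hours apart, since the log carries no dates.

```python
from datetime import datetime, timedelta

def wall_seconds(start, end):
    """Wall-clock seconds between two HH:MM:SS stamps (hypothetical helper).

    Assumes the stamps are less than 24 hours apart; a negative
    difference is treated as a rollover past midnight.
    """
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    if delta < timedelta(0):
        delta += timedelta(days=1)
    return int(delta.total_seconds())

# Values copied from the quoted Result Log.
last_restart = "10:34:12"   # last "Starting task 0,CPU time is 0.000000" line
finished = "11:01:36"       # "Finished task #0" line
final_cpu = 8109.087677     # CPU seconds reported at finish

elapsed = wall_seconds(last_restart, finished)
print(elapsed)              # 1644
print(final_cpu > elapsed)  # True
```

The final CPU time is far larger than the wall-clock time since the last restart, so a single task could not have accumulated it from zero in that window. That fits the explanation that the 0.000000 on the "Starting task" line is a reporting quirk rather than lost work.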
||
|
|
Yarensc
Advanced Cruncher USA Joined: Sep 24, 2011 Post Count: 136 Status: Offline Project Badges:
|
I got 4 rigid ones from batch 296 split between two machines. They all checkpointed frequently and resumed correctly after suspending with LAIM off, although looking at the log afterwards (through the results status page) there wasn't any indication that a rollback happened.
|
||
|
|
Falconet
Master Cruncher Portugal Joined: Mar 9, 2009 Post Count: 3315 Status: Offline Project Badges:
|
It checkpointed at 1 hour and 31 minutes and increased to 30% at 1 hour and 38 or so minutes.
Checkpoints are just too far apart.
[Edit 2 times, last edit by Falconet at Jan 8, 2015 5:14:15 PM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
... they all checkpointed frequently and resumed correctly after suspending with LAIM off. Although looking at the log afterwards (through the results status page) there wasn't any indication that a rollback happened.

Yarensc, there should be an indication of a restart, but it's not easy to spot. Here's an excerpt from one of my Result Logs. See the 2 instances of "Starting task 12,CPU time is..." and the additional "Number of tasks = ..." line; they are the key.

[22:51:31] Finished task #11 cpu time used 311.643198
[22:51:31] Starting task 12,CPU time is 2246.976004
[22:51:31] ./ZINC11534746_1.pdbqt size = 31 7 ../../projects/www.worldcommunitygrid.org/beta20.xZAGP.pdbqt size = 2321 0
[22:52:52] Number of tasks = 38
[22:52:52] Starting task 12,CPU time is 2246.976004
[22:52:52] ./ZINC11534746_1.pdbqt size = 31 7 ../../projects/www.worldcommunitygrid.org/beta20.xZAGP.pdbqt size = 2321 0
[22:54:48] Finished task #12 cpu time used 147.014904 |
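Since the restart marker is easy to miss by eye, here is a rough Python sketch of the check described above: flag any task number whose "Starting task" line appears more than once. The regular expression is based only on the log lines quoted in this thread, and the function name `find_restarts` is hypothetical, so treat it as an illustration rather than an official tool.

```python
import re

# Matches lines such as "[22:51:31] Starting task 12,CPU time is 2246.976004".
# The pattern is inferred from the excerpts in this thread (an assumption).
START_RE = re.compile(
    r"\[(\d\d:\d\d:\d\d)\] Starting task (\d+),CPU time is ([\d.]+)"
)

def find_restarts(log_text):
    """Return (task, timestamp, cpu_time) for each repeated 'Starting task' line."""
    seen = set()
    restarts = []
    for stamp, task, cpu in START_RE.findall(log_text):
        task_id = int(task)
        if task_id in seen:
            # Same task started again: the work unit resumed from a checkpoint.
            restarts.append((task_id, stamp, float(cpu)))
        seen.add(task_id)
    return restarts

# Abbreviated copy of the excerpt above (the .pdbqt lines are left out).
SAMPLE = """\
[22:51:31] Finished task #11 cpu time used 311.643198
[22:51:31] Starting task 12,CPU time is 2246.976004
[22:52:52] Number of tasks = 38
[22:52:52] Starting task 12,CPU time is 2246.976004
[22:54:48] Finished task #12 cpu time used 147.014904
"""

print(find_restarts(SAMPLE))  # [(12, '22:52:52', 2246.976004)]
```

Pasting a full Result Log into `SAMPLE` should list each restart with the time it happened and the CPU time the task resumed from.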
||
|
|
Yarensc
Advanced Cruncher USA Joined: Sep 24, 2011 Post Count: 136 Status: Offline Project Badges:
|
Ahh, thanks Tony, I see that now. I was looking for something like 'restarting from x time'.
|
||
|
|
deltavee
Ace Cruncher Texas Hill Country Joined: Nov 17, 2004 Post Count: 4894 Status: Offline Project Badges:
|
I am seeing more beta workunits.
Edit: these are series 299 and 308.
[Edit 1 times, last edit by deltavee at Jan 8, 2015 10:50:09 PM] |
||
|
|
|