Thread Status: Active
Total posts in this thread: 99
This topic has been viewed 354305 times and has 98 replies
vepaul
Senior Cruncher
Belgium
Joined: Nov 17, 2004
Post Count: 261
Status: Offline
Re: Beta Test - Outsmart Ebola Together - v7.14 - Jan 7, 2015 [ Issues Thread ]

Mine are mostly OK:
BETA_ OET1_ 0000298_ xEBGP-FA_ rig_ 1531_ 1-- Bureau2-HP Pending Validation 7/01/15 23:11:46 8/01/15 03:21:42 1,34 / 1,34 37,1 / 0,0
BETA_ OET1_ 0000298_ xEBGP-FA_ rig_ 0023_ 1-- Bureau2-HP Valid 7/01/15 23:09:37 8/01/15 03:21:42 1,86 / 1,87 51,6 / 58,2
BETA_ OET1_ 0000297_ xZAGP_ 0138_ 0-- Bureau2-HP Valid 7/01/15 22:56:29 8/01/15 03:21:42 1,81 / 1,82 50,3 / 39,2
BETA_ OET1_ 0000297_ xZAGP_ 0839_ 0-- Bureau2-HP Valid 7/01/15 22:51:21 8/01/15 03:21:42 2,29 / 2,30 63,5 / 49,4
BETA_ OET1_ 0000296_ xZAGP_ 0548_ 0-- Bureau2-HP Pending Validation 7/01/15 22:44:14 8/01/15 03:21:42 2,01 / 2,02 55,9 / 0,0
BETA_ OET1_ 0000296_ xZAGP_ 1106_ 1-- paul-HP2 Valid 7/01/15 22:41:27 8/01/15 02:41:50 2,16 / 2,19 64,8 / 60,2
BETA_ OET1_ 0000296_ xZAGP_ 0019_ 1-- paul-HP2 Pending Validation 7/01/15 22:39:19 8/01/15 02:41:50 1,95 / 1,97 58,4 / 0,0
BETA_ OET1_ 0000295_ xZAGP_ 0023_ 1-- paul-HP2 Valid 7/01/15 22:09:50 8/01/15 02:41:50 2,32 / 2,34 69,5 / 67,8
[Jan 8, 2015 2:38:56 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Beta Test - Outsmart Ebola Together - v7.14 - Jan 7, 2015 [ Issues Thread ]

My last 296 task was at 1:23 hours of CPU time, with the previous checkpoint shown as 5 minutes earlier. After suspending and resuming, the CPU time fell back to 1:18, which is correct, and it then accumulated time properly.
[Jan 8, 2015 3:16:58 PM]
Falconet
Master Cruncher
Portugal
Joined: Mar 9, 2009
Post Count: 3315
Status: Offline
Re: Beta Test - Outsmart Ebola Together - v7.14 - Jan 7, 2015 [ Issues Thread ]

My 298 task kept running and the progress percentage had climbed to about 23% the last time I looked. Now it has gone back to 20.000% and it hasn't checkpointed in over 45 minutes of CPU time.
----------------------------------------


- AMD Ryzen 5 1600AF 6C/12T 3.2 GHz - 85W
- AMD Ryzen 5 2500U 4C/8T 2.0 GHz - 28W
- AMD Ryzen 7 7730U 8C/16T 3.0 GHz
[Jan 8, 2015 4:02:47 PM]
Falconet
Master Cruncher
Portugal
Joined: Mar 9, 2009
Post Count: 3315
Status: Offline
Re: Beta Test - Outsmart Ebola Together - v7.14 - Jan 7, 2015 [ Issues Thread ]

Okay it made a checkpoint at 45 minutes and 31 seconds CPU time.
Percentage is still the same.

I suspended it with LAIM (Leave Applications In Memory) off. The percentage is the same, but the CPU time resumed from the checkpoint, so I guess that could be a good sign.
[Jan 8, 2015 4:11:33 PM]
uplinger
Former World Community Grid Tech
Joined: May 23, 2005
Post Count: 3952
Status: Offline
Re: Beta Test - Outsmart Ebola Together - v7.14 - Jan 7, 2015 [ Issues Thread ]

I had one 00298 work unit.

After the sequence "LAIM off, suspend, 'removed from memory' message, resume, running", I noticed the following:
Properties showed the CPU time at last checkpoint as 1 hour 48 minutes, but the stderr file shows zero CPU time at restart:

Result Log

Result Name: BETA_ OET1_ 0000298_ xEBGP-FA_ rig_ 1327_ 1--
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[23:41:20] Number of tasks = 1
[23:41:20] Starting task 0,CPU time is 0.000000
[23:41:20] ./ZINC13130211_1.pdbqt size = 24 4 ../../projects/www.worldcommunitygrid.org/beta20.xEBGP-FA_rig.pdbqt size = 2451 0
[00:18:01] Number of tasks = 1
[00:18:01] Starting task 0,CPU time is 0.000000
[00:18:01] ./ZINC13130211_1.pdbqt size = 24 4 ../../projects/www.worldcommunitygrid.org/beta20.xEBGP-FA_rig.pdbqt size = 2451 0
[10:34:12] Number of tasks = 1
[10:34:12] Starting task 0,CPU time is 0.000000
[10:34:12] ./ZINC13130211_1.pdbqt size = 24 4 ../../projects/www.worldcommunitygrid.org/beta20.xEBGP-FA_rig.pdbqt size = 2451 0
[11:01:36] Finished task #0 cpu time used 8109.087677
11:01:36 (192716): called boinc_finish

Note that the CPU time changes from 0 to 8109 seconds in about 27 minutes of wall-clock time (10:34:12 to 11:01:36, 1644 seconds).


Basically this looks like a case of bad information on the stderr. The Starting task always starts with 0.00000 on task 0. But as you can see it did report the proper final cpu time.

Thanks,
-Uplinger
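The mismatch in the log above can be checked with a little arithmetic: the wall-clock gap between the last "Starting task 0" line and the "Finished task #0" line is far shorter than the CPU time reported at the end, so the 0.000000 at restart cannot be the real starting value. The following is only a minimal Python sketch. The helper names `wall_seconds` and `fmt_hms` are made up for illustration, and it assumes both timestamps fall on the same day.

```python
from datetime import datetime

def wall_seconds(start: str, end: str) -> int:
    """Seconds between two HH:MM:SS stamps, assuming the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())

def fmt_hms(seconds: float) -> str:
    """Format a number of seconds as H:MM:SS (fractional part dropped)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"

# Timestamps and CPU time copied from the stderr excerpt above.
elapsed = wall_seconds("10:34:12", "11:01:36")
print(elapsed)               # 1644
print(fmt_hms(8109.087677))  # 2:15:09
```

About 2 hours 15 minutes of CPU time cannot be accumulated in roughly 27 minutes of wall-clock time on one core, which fits the explanation that the final figure is right and the restart line is simply printing a bad value.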
[Jan 8, 2015 4:27:39 PM]
Yarensc
Advanced Cruncher
USA
Joined: Sep 24, 2011
Post Count: 136
Status: Offline
Re: Beta Test - Outsmart Ebola Together - v7.14 - Jan 7, 2015 [ Issues Thread ]

I got 4 rigid ones from batch 296, split between two machines. They all checkpointed frequently and resumed correctly after suspending with LAIM off. However, looking at the log afterwards (through the results status page), there wasn't any indication that a rollback happened.
[Jan 8, 2015 4:47:03 PM]
Falconet
Master Cruncher
Portugal
Joined: Mar 9, 2009
Post Count: 3315
Status: Offline
Re: Beta Test - Outsmart Ebola Together - v7.14 - Jan 7, 2015 [ Issues Thread ]

It checkpointed at 1 hour and 31 minutes, and the progress increased to 30% at about 1 hour and 38 minutes.

Checkpoints are just too far apart.
----------------------------------------
[Edit 2 times, last edit by Falconet at Jan 8, 2015 5:14:15 PM]
[Jan 8, 2015 5:07:40 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Beta Test - Outsmart Ebola Together - v7.14 - Jan 7, 2015 [ Issues Thread ]

Yarensc wrote: "... they all checkpointed frequently and resumed correctly after suspending with LAIM off. However, looking at the log afterwards (through the results status page), there wasn't any indication that a rollback happened."
Yarensc, there should be an indication of a restart, but it's not easy to spot. Here's an excerpt from one of my Result Logs. See the 2 instances of "Starting task 12,CPU time is..." and the additional "Number of tasks = ..." - they are the key.

[22:51:31] Finished task #11 cpu time used 311.643198
[22:51:31] Starting task 12,CPU time is 2246.976004
[22:51:31] ./ZINC11534746_1.pdbqt size = 31 7 ../../projects/www.worldcommunitygrid.org/beta20.xZAGP.pdbqt size = 2321 0
[22:52:52] Number of tasks = 38
[22:52:52] Starting task 12,CPU time is 2246.976004
[22:52:52] ./ZINC11534746_1.pdbqt size = 31 7 ../../projects/www.worldcommunitygrid.org/beta20.xZAGP.pdbqt size = 2321 0
[22:54:48] Finished task #12 cpu time used 147.014904
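For anyone who wants to spot these restarts without reading the whole log, the pattern described above (the same "Starting task N" line appearing more than once) can be searched for automatically. This is only a rough Python sketch: the function name `find_restarts` and the regular expression are my own assumptions based on the line format in the excerpt, not anything provided by BOINC or World Community Grid.

```python
import re

# Matches lines like "[22:51:31] Starting task 12,CPU time is 2246.976004"
START_RE = re.compile(
    r"\[(\d\d:\d\d:\d\d)\] Starting task (\d+),CPU time is ([\d.]+)"
)

def find_restarts(log_text):
    """Return (timestamp, task, cpu_time) for each repeated 'Starting task N'."""
    seen = set()
    restarts = []
    for line in log_text.splitlines():
        m = START_RE.search(line)
        if not m:
            continue
        stamp, task, cpu = m.group(1), int(m.group(2)), float(m.group(3))
        if task in seen:
            # The same task was started again, so the work unit resumed here.
            restarts.append((stamp, task, cpu))
        seen.add(task)
    return restarts

# Shortened copy of the excerpt above.
sample = """\
[22:51:31] Finished task #11 cpu time used 311.643198
[22:51:31] Starting task 12,CPU time is 2246.976004
[22:52:52] Number of tasks = 38
[22:52:52] Starting task 12,CPU time is 2246.976004
[22:54:48] Finished task #12 cpu time used 147.014904
"""
print(find_restarts(sample))  # [('22:52:52', 12, 2246.976004)]
```

This heuristic would miss a restart that lands exactly on a task boundary, since no "Starting task" line would repeat. Counting the extra "Number of tasks = ..." lines would be one way to catch that case.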

[Jan 8, 2015 5:37:15 PM]
Yarensc
Advanced Cruncher
USA
Joined: Sep 24, 2011
Post Count: 136
Status: Offline
Re: Beta Test - Outsmart Ebola Together - v7.14 - Jan 7, 2015 [ Issues Thread ]

Ahh, thanks Tony, I see that now. I was looking for something like 'restarting from x time'.
[Jan 8, 2015 10:36:03 PM]
deltavee
Ace Cruncher
Texas Hill Country
Joined: Nov 17, 2004
Post Count: 4894
Status: Offline
Re: Beta Test - Outsmart Ebola Together - v7.14 - Jan 7, 2015 [ Issues Thread ]

I am seeing more beta workunits.

Edit: these are series 299 and 308.
----------------------------------------
[Edit 1 times, last edit by deltavee at Jan 8, 2015 10:50:09 PM]
[Jan 8, 2015 10:38:40 PM]