Thread Status: Active
Total posts in this thread: 36
This topic has been viewed 10812 times and has 35 replies
KWSN-A Shrubbery
Senior Cruncher
Joined: Jan 8, 2006
Post Count: 476
Re: Beta Test - Outsmart Ebola Together - v7.19 - Feb 6, 2015 [ Issues Thread ]

Managed to catch this run on lunch. I was promised punch and pie, or at least early next week.

Glad to get beta as always, hopefully we're gonna see production soon.
----------------------------------------

[Feb 6, 2015 9:44:16 PM]
tmedve
Senior Cruncher
USA
Joined: Nov 16, 2004
Post Count: 191

The time remaining on batch 322 behaves strangely. It decreases fairly rapidly from about 27 minutes down to 3 or 4 minutes at about 15% - 20% progress and then starts climbing slowly. Units are still checkpointing about every 10 minutes. Based on the percentages, these look like they may run for about 2 hours.
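
For anyone who wants to repeat the "based on the percentages" guess, here is a minimal sketch of a straight-line extrapolation from elapsed time and progress. The function name and the sample reading (24 minutes at 20%) are made up for illustration, and this is not necessarily how the BOINC client computes its own remaining-time figure.

```python
def estimate_total_minutes(elapsed_minutes, percent_done):
    """Linearly extrapolate total run time from elapsed time and progress."""
    if not 0 < percent_done <= 100:
        raise ValueError("percent_done must be in (0, 100]")
    return elapsed_minutes * 100 / percent_done


# Hypothetical reading, not taken from a real work unit:
elapsed = 24.0   # minutes elapsed so far
progress = 20    # percent complete shown by the client

total = estimate_total_minutes(elapsed, progress)
remaining = total - elapsed
print(f"estimated total: {total:.0f} min, remaining: {remaining:.0f} min")
```

The estimate is only as good as the assumption that progress advances at a constant rate, which the jumpy remaining-time display suggests may not hold for these units.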
----------------------------------------

----------------------------------------
[Edit 2 times, last edit by tmedve at Feb 6, 2015 10:02:06 PM]
[Feb 6, 2015 9:58:23 PM]
KWSN-A Shrubbery
Senior Cruncher
Joined: Jan 8, 2006
Post Count: 476

Couple units from batch 322 are showing no remaining time at 68%. CPU counter is still incrementing.
----------------------------------------

[Feb 6, 2015 10:00:13 PM]
Mamajuanauk
Master Cruncher
United Kingdom
Joined: Dec 15, 2012
Post Count: 1900

BETA_ OET1_ 0000322_ xEBGP-FA_ rig_ 11014_ 0--

This work unit has done nearly 23 minutes elapsed, with about the same CPU time, no checkpoint, and 0.00% progress.

It's not the only one I'm seeing.

Ubuntu 12.05 LTS server version with quad socket/processor
----------------------------------------
Mamajuanauk is the Name! Crunching is the Game!



[Feb 6, 2015 10:11:48 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0

I've suspended and restarted a few units with LAIM (Leave Applications In Memory) off, and one of those has turned Valid. Checkpoints occurred about every 6 minutes. However, the Result Log still shows the CPU time at restart as 0. Is that intended?

Result Name: BETA_ OET1_ 0000319_ xSDGP-S_ rig_ 1684_ 0--
<core_client_version>7.2.47</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[20:27:08] Number of tasks = 1
[20:27:08] Running task 0,CPU time at start of task 0 was 0.000000
[20:27:08] ./ZINC18056732_1.pdbqt size = 23 5 ../../projects/www.worldcommunitygrid.org/beta20.xSDGP-S_rig.pdbqt size = 2428 0
[21:15:25] Number of tasks = 1
[21:15:25] Running task 0,CPU time at start of task 0 was 0.000000
[21:15:25] ./ZINC18056732_1.pdbqt size = 23 5 ../../projects/www.worldcommunitygrid.org/beta20.xSDGP-S_rig.pdbqt size = 2428 0
[21:21:39] Finished task #0 cpu time used 2785.663246
21:21:39 (137968): called boinc_finish(0)

The checkpoint prior to suspend/restart:
06/02/2015 21:09:40 | World Community Grid | [checkpoint] result BETA_OET1_0000319_xSDGP-S_rig_1684_0 checkpointed

Workunit Status:
BETA_ OET1_ 0000319_ xSDGP-S_ rig_ 1684_ 1-- 719 Valid 06/02/15 20:27:17 06/02/15 21:24:07 0.68 20.2 / 23.6
BETA_ OET1_ 0000319_ xSDGP-S_ rig_ 1684_ 0-- 719 Valid 06/02/15 20:26:51 06/02/15 21:21:51 0.77 27.0 / 23.6
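
To make the question concrete, here is a rough Python sketch that pulls the relevant numbers out of the stderr lines quoted above. The regular expressions are my own guess at the line format, based only on this one log, and the helper name `summarize` is made up for illustration.

```python
import re

# Excerpt of the stderr lines quoted in this post.
LOG = """\
[20:27:08] Number of tasks = 1
[20:27:08] Running task 0,CPU time at start of task 0 was 0.000000
[21:15:25] Number of tasks = 1
[21:15:25] Running task 0,CPU time at start of task 0 was 0.000000
[21:21:39] Finished task #0 cpu time used 2785.663246
"""

# Assumed line formats, inferred from the excerpt above only.
START_RE = re.compile(r"Running task \d+,CPU time at start of task \d+ was ([\d.]+)")
FINISH_RE = re.compile(r"Finished task #\d+ cpu time used ([\d.]+)")


def summarize(log_text):
    """Return (CPU seconds reported at each start, final CPU seconds or None)."""
    starts = [float(m.group(1)) for m in START_RE.finditer(log_text)]
    finish = FINISH_RE.search(log_text)
    final_cpu = float(finish.group(1)) if finish else None
    return starts, final_cpu


starts, final_cpu = summarize(LOG)
print("CPU time reported at each start:", starts)
print("final CPU time in hours:", round(final_cpu / 3600, 2))
```

Both starts report 0 CPU seconds, including the one after the restart, yet the final figure of 2785.66 s works out to the 0.77 hours shown for the _0 result under Workunit Status. So the total looks right even though the restart line shows 0.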
[Feb 6, 2015 10:15:05 PM]
genhos
Veteran Cruncher
UK
Joined: Apr 26, 2009
Post Count: 1108

[Quote from Mamajuanauk]
BETA_ OET1_ 0000322_ xEBGP-FA_ rig_ 11014_ 0--

This work unit has done nearly 23 minutes elapsed, with about the same CPU time, no checkpoint, and 0.00% progress.

It's not the only one I'm seeing.

Ubuntu 12.05 LTS server version with quad socket/processor
[End of quote]

My 322 unit (BETA_OET1_0000322_xEBGP-FA_rig_0488_0) has done its first checkpoint after 40 minutes of CPU time, but at least the % complete kept increasing up to the checkpoint and is still increasing now. Win7 64-bit with an i7.
I also had a 318 unit, which I suspended with LAIM off; it restarted fine, with regular checkpoints every few minutes.
----------------------------------------
[Feb 6, 2015 11:43:59 PM]
TPCBF
Master Cruncher
USA
Joined: Jan 2, 2011
Post Count: 2173

Got 50 WUs across all hosts so far. Those that have already finished (1 valid, 7 pending validation) range from 0.30 h to 1.4 h of runtime...

Ralf
[Feb 7, 2015 12:51:25 AM]
Yarensc
Advanced Cruncher
USA
Joined: Sep 24, 2011
Post Count: 136

EDIT: Never mind, I remembered this is default BOINC behavior.

I had one from batch 322 that had not checkpointed yet at around 8 min CPU time. I suspended/resumed it with LAIM off and for some reason the progress % and CPU time didn't go back down to 0. The only other beta I have on this machine had already checkpointed, but it did correctly revert back to the last checkpoint.
----------------------------------------
[Edit 1 times, last edit by Yarensc at Feb 7, 2015 2:01:42 AM]
[Feb 7, 2015 1:50:41 AM]
NorthernRaider
Cruncher
Canada
Joined: Dec 10, 2008
Post Count: 12

Done very fast and valid, as you can see:
Result Log

Result Name: BETA_ OET1_ 0000320_ xZAGP-FW_ rig_ 1455_ 0--
<core_client_version>7.4.23</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[14:34:52] Number of tasks = 1
[14:34:52] Running task 0,CPU time at start of task 0 was 0.000000
[14:34:52] ./ZINC13281671.pdbqt size = 16 1 ../../projects/www.worldcommunitygrid.org/beta20.xZAGP-FW_rig.pdbqt size = 2296 0
[14:38:52] Finished task #0 cpu time used 165.748000
14:38:52 (27149): called boinc_finish

</stderr_txt>
]]>

The second one is also valid; its CPU time is in the log below.

Result Log

Result Name: BETA_ OET1_ 0000322_ xEBGP-FA_ rig_ 9486_ 0--
<core_client_version>7.4.23</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[14:34:52] Number of tasks = 1
[14:34:52] Running task 0,CPU time at start of task 0 was 0.000000
[14:34:52] ./ZINC24981830.pdbqt size = 20 6 ../../projects/www.worldcommunitygrid.org/beta20.xEBGP-FA_rig.pdbqt size = 2451 0
[15:53:29] Finished task #0 cpu time used 3278.844000
15:53:29 (27150): called boinc_finish

</stderr_txt>
]]>
==================================
Completed these WUs and got 5 more WUs from batch 323.

BETA_ OET1_ 0000323_ xEBGP-L_ rig_ 2551_ 0-- Jabbah Pending Validation 2/7/15 00:12:57 2/7/15 02:01:19 0.48 / 0.49 12.6 / 0.0
BETA_ OET1_ 0000322_ xEBGP-FA_ rig_ 9479_ 0-- Jabbah Valid 2/6/15 22:03:13 2/7/15 02:22:11 1.23 / 1.24 32.0 / 33.2
BETA_ OET1_ 0000322_ xEBGP-FA_ rig_ 9486_ 0-- Jabbah Valid 2/6/15 22:03:13 2/7/15 02:01:19 0.91 / 0.92 23.7 / 24.5
BETA_ OET1_ 0000320_ xZAGP-FW_ rig_ 1455_ 0-- Jabbah Valid 2/6/15 20:47:45 2/7/15 00:41:20 0.05 / 0.05 1.2 / 1
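
As a quick sanity check, the "cpu time used" seconds from the two result logs above can be converted to hours and compared with the first CPU figure in the status rows. This is just a sketch: the helper name `cpu_hours` is made up, and the hour values are copied by hand from the rows above.

```python
def cpu_hours(seconds):
    """Convert CPU seconds to hours, rounded to two decimals as in the status rows."""
    return round(seconds / 3600, 2)


# (CPU seconds from the stderr log, CPU hours from the status row)
results = {
    "BETA_OET1_0000320_xZAGP-FW_rig_1455_0": (165.748, 0.05),
    "BETA_OET1_0000322_xEBGP-FA_rig_9486_0": (3278.844, 0.91),
}

for name, (log_seconds, listed_hours) in results.items():
    computed = cpu_hours(log_seconds)
    verdict = "matches" if computed == listed_hours else "differs"
    print(f"{name}: {log_seconds} s = {computed} h ({verdict} listed {listed_hours} h)")
```

Both results agree with the listed hours, so the short run really was under three minutes of CPU time and not a reporting glitch.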
----------------------------------------


----------------------------------------
[Edit 2 times, last edit by DutchRaider at Feb 7, 2015 2:41:26 AM]
[Feb 7, 2015 2:30:27 AM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7844

Got several on both Windows and Linux. Suspend and resume are problem free. They start just where they left off. Run times vary very widely, from 6 minutes on a Q6600 under Windows to what looks like a couple of hours on a Xeon 5405 under Linux. Even within the same batch (322) on an AMD 9150e, they vary from about 2 hours to what looks to be around 9 hours.

Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Feb 7, 2015 2:36:21 AM]