Thread Status: Active
Total posts in this thread: 21
This topic has been viewed 2915 times and has 20 replies
KerSamson
Master Cruncher
Switzerland
Joined: Jan 29, 2007
Post Count: 1671
Status: Offline
Re: New Beta Test - March 17, 2016 [ Issues Thread ]

Wrong beta thread!

Yes, I did notice it this morning (working at night is not always a good idea ;-) ).
Yves
----------------------------------------
[Mar 18, 2016 7:33:05 AM]
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1317
Status: Offline
Re: New Beta Test - March 17, 2016 [ Issues Thread ]

Using task BETA_HST1_000002_000137_AC0013_T300_F00037_S00001_0 on Win7, I tested a few times
for the suspend (LAIM off)/resume problem seen with the previous beta batch.

The problem is solved. Now normal checkpointing and progress after the resumes.
Endless running therefore will also be OK, but can't report yet.
Progress 20% after 4 hours runtime and several CEP BETA's running High Priority.
I got more of those BETA_E's than available threads.
----------------------------------------

[Mar 18, 2016 8:45:19 AM]
SekeRob
Master Cruncher
Joined: Jan 7, 2013
Post Count: 2741
Status: Offline
Re: New Beta Test - March 17, 2016 [ Issues Thread ]

[quote]I got more of those BETA_E's than available threads.[/quote]

[ot]There's a dirty trick for the connoisseurs, but there's also an oddball something, coupled to how the CEP2 settings are configured in the device profile, that causes more CEP2 betas to arrive than the standard rule of max = 1 Beta task * threads allows. I set my profile to 16 CEP2 allowed, but still only 8 came to the octo, followed by 'no work available for', so I haven't figured out what the exact oddball override settings are. As yours seem to work, don't fix what isn't broken, if you have no problem with it ;O)[/ot]
[Mar 18, 2016 9:19:08 AM]
Eric_Kaiser
Veteran Cruncher
Germany (Hessen)
Joined: May 7, 2013
Post Count: 1047
Status: Offline
Re: New Beta Test - March 17, 2016 [ Issues Thread ]

Caught a bunch of the new beta.
The short-running ones start with BETA_AC* and the long-running ones with BETA_HST1*.
Checkpointing works for both types of beta WU. Stopping and (re)starting is OK.
The short-running ones finished in ~2.3 hrs. The long-running ones are still running. My personal estimate of the runtime is around 15 hrs per WU. On my Windows/Linux hosts they show 20% progress after 3 hrs.
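
The 15-hour figure is a straight-line extrapolation from 3 hrs at 20% progress. A minimal sketch of that arithmetic, assuming progress advances linearly with runtime (nothing in this thread confirms that it does); `estimate_total_hours` is a hypothetical helper name, not part of BOINC:

```python
def estimate_total_hours(elapsed_hours: float, progress_percent: float) -> float:
    """Extrapolate total runtime, ASSUMING progress is linear in time."""
    if not 0 < progress_percent <= 100:
        raise ValueError("progress_percent must be in (0, 100]")
    return elapsed_hours * 100.0 / progress_percent


# Figures reported in this thread
print(estimate_total_hours(3, 20))  # 15.0 (3 hrs at 20% progress)
print(estimate_total_hours(4, 20))  # 20.0 (4 hrs at 20%, with other betas at high priority)
```

If the secondary simulation runs at a different speed than the initial one, the real runtime could differ noticeably from this estimate.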
----------------------------------------

[Mar 18, 2016 9:40:19 AM]
Jason1478963
Senior Cruncher
United States
Joined: Sep 18, 2005
Post Count: 295
Status: Offline
Re: New Beta Test - March 17, 2016 [ Issues Thread ]

Not sure why but 11 (24 NOW) of the AC0002 wu's from 5 different machines came up invalid after what seemed like a smooth run.

Result Name: BETA_ AC0002_ T000_ F00098_ S00001am_ 1--


<core_client_version>7.6.9</core_client_version>
<![CDATA[
<stderr_txt>
simulation
[04:26:33] INFO: Completed step 3982000 of initial simulation
[04:26:35] INFO: Completed step 3983000 of initial simulation
[04:26:37] INFO: Completed step 3984000 of initial simulation
[04:26:39] INFO: Completed step 3985000 of initial simulation
[04:26:41] INFO: Completed step 3986000 of initial simulation
~~~~~~~~~~~~~~~~~~~~~
Shortened up
~~~~~~~~~~~~~~~~~~~~~
[05:02:06] INFO: Completed step 4994000 of initial simulation
[05:02:08] INFO: Completed step 4995000 of initial simulation
[05:02:10] INFO: Completed step 4996000 of initial simulation
[05:02:12] INFO: Completed step 4997000 of initial simulation
[05:02:14] INFO: Completed step 4998000 of initial simulation
[05:02:16] INFO: Completed step 4999000 of initial simulation
[05:02:18] INFO: Completed step 5000000 of initial simulation
Writing checkpoint at step 5000000.
[05:02:19] INFO: Finished initial simulation.
[05:02:19] INFO: Running secondary simulation
[05:02:21] INFO: Run complete, CPU time: 10232.593750
05:02:21 (6256): called boinc_finish(0)

</stderr_txt>
]]>
----------------------------------------
[Edit 1 times, last edit by Jason1478963 at Mar 21, 2016 2:02:37 AM]
[Mar 19, 2016 2:49:22 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: New Beta Test - March 17, 2016 [ Issues Thread ]

The BETA_ AC0002 units still have Result Log info written every second and truncated at the start:

Result Name: BETA_ AC0002_ T000_ F00029_ S00001ab_ 1--
<core_client_version>7.4.42</core_client_version>
<![CDATA[
<stderr_txt>
ted step 3982000 of initial simulation
[03:16:49] INFO: Completed step 3983000 of initial simulation
[03:16:50] INFO: Completed step 3984000 of initial simulation
[03:16:51] INFO: Completed step 3985000 of initial simulation
[03:16:52] INFO: Completed step 3986000 of initial simulation
etc.
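
To put a number on how often these lines are written, the timestamps can be compared directly. A rough sketch, assuming every progress line has the `[HH:MM:SS] INFO: Completed step N of initial simulation` shape seen in the excerpt and that the excerpt does not cross midnight; `log_cadence` is a hypothetical helper, not part of BOINC or the science application:

```python
import re
from datetime import datetime

# Matches complete progress lines only; the truncated first line will not match.
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\] INFO: Completed step (\d+) of")


def log_cadence(stderr_text: str):
    """Return (seconds, steps) deltas between consecutive progress lines."""
    entries = []
    for line in stderr_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:  # truncated or unrelated lines are skipped
            stamp = datetime.strptime(m.group(1), "%H:%M:%S")
            entries.append((stamp, int(m.group(2))))
    return [
        ((t2 - t1).total_seconds(), s2 - s1)
        for (t1, s1), (t2, s2) in zip(entries, entries[1:])
    ]


sample = """ted step 3982000 of initial simulation
[03:16:49] INFO: Completed step 3983000 of initial simulation
[03:16:50] INFO: Completed step 3984000 of initial simulation
[03:16:51] INFO: Completed step 3985000 of initial simulation
[03:16:52] INFO: Completed step 3986000 of initial simulation"""

print(log_cadence(sample))  # [(1.0, 1000), (1.0, 1000), (1.0, 1000)]
```

On this excerpt that works out to one line per second, each covering 1000 steps. The log in the earlier post shows the same 1000-step lines arriving every two seconds.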
[Mar 19, 2016 8:44:59 AM]
SekeRob
Master Cruncher
Joined: Jan 7, 2013
Post Count: 2741
Status: Offline
Re: New Beta Test - March 17, 2016 [ Issues Thread ]

Could this be intentional, to get extra log data during beta, and is it writing to storage every second? Anything hitting storage in production with output every second is bad, bad. Picture what happens when you have 8, 16, 24 or more cores running these concurrently.
----------------------------------------
[Edit 1 times, last edit by SekeRob* at Mar 19, 2016 9:14:23 AM]
[Mar 19, 2016 9:13:15 AM]
Jason1478963
Senior Cruncher
United States
Joined: Sep 18, 2005
Post Count: 295
Status: Offline
Re: New Beta Test - March 17, 2016 [ Issues Thread ]

[quote]Could this be intentional, to get extra log data during beta, and is it writing to storage every second? Anything hitting storage in production with output every second is bad, bad. Picture what happens when you have 8, 16, 24 or more cores running these concurrently.[/quote]


It certainly isn't good. I lost an HDD to the CEP2 beta run, which of course caused the creation of another device ID, so I was starting from scratch again. :(
----------------------------------------

[Mar 19, 2016 1:37:59 PM]
nanoprobe
Master Cruncher
Classified
Joined: Aug 29, 2008
Post Count: 2998
Status: Offline
Re: New Beta Test - March 17, 2016 [ Issues Thread ]



[quote]It certainly isn't good. I lost an HDD to the CEP2 beta run, which of course caused the creation of another device ID, so I was starting from scratch again. :([/quote]

I feel your pain. Same thing happened to me during the CEP1 BETA. I recommend doing backups at least every 2 weeks and more often when BETAs are/have been running. A must to avoid your issue.

[quote]Could this be intentional, to get extra log data during beta, and is it writing to storage every second? Anything hitting storage in production with output every second is bad, bad. Picture what happens when you have 8, 16, 24 or more cores running these concurrently.[/quote]

Makes for a flashing green and red light show. :D
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.


----------------------------------------
[Edit 1 times, last edit by nanoprobe at Mar 19, 2016 3:56:07 PM]
[Mar 19, 2016 3:51:26 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: New Beta Test - March 17, 2016 [ Issues Thread ]

One Invalid and 2 Valid from this unit. I can't see any significant difference between the Result Logs.

BETA_ HST1_ 000002_ 000022_ AC0012_ T300_ F00022_ S00001_ 2-- | Microsoft Windows 7 Professional x64 Edition, Service Pack 1, (06.01.7601.00) | 713 | Valid | 19/03/16 18:31:11 | 20/03/16 04:16:10 | 9.37 | 320.9 / 327.8
BETA_ HST1_ 000002_ 000022_ AC0012_ T300_ F00022_ S00001_ 1-- | Microsoft Windows 8.1 Professional x64 Edition, (06.03.9600.00) | 713 | Valid | 17/03/16 21:55:50 | 19/03/16 06:15:07 | 11.02 | 334.7 / 327.8
BETA_ HST1_ 000002_ 000022_ AC0012_ T300_ F00022_ S00001_ 0-- | Microsoft Windows 7 Professional x64 Edition, Service Pack 1, (06.01.7601.00) | 713 | Invalid | 17/03/16 21:55:39 | 19/03/16 18:31:03 | 17.39 | 348.4 / 327.8
[Mar 20, 2016 8:27:25 AM]