World Community Grid Forums
Category: Beta Testing Forum: Beta Test Support Forum Thread: Beta Testing - HCMD Phase 2, v 6.12 May 21, 2009 |
Thread Status: Active Total posts in this thread: 8
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
So I went for a late walkabout and left 2 clients with a cache greater than 2 days, in anticipation after a comment on size testing. On return, I found 20 Betas queued with run times of 5.5 to 7 hours, the first one nearly finished at 1:25. All have a deadline of 2 days.
Thank you. PS: on the side I got 3 CEP jobs too, though with rush deadlines of 3 days. Edit: Oops, 27, as 7 were already finished.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Edit 2 times, last edit by Sekerob at May 21, 2009 9:43:30 PM] |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
We had better get used to a new, yet not alarming, message in the Result Log:
<core_client_version>6.6.28</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
Finishing early because max runtime has been exceeded.0 called boinc_finish
</stderr_txt>
]]>
and
<core_client_version>6.6.24</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
Finishing early because max runtime has been exceeded.1601933466 called boinc_finish
</stderr_txt>
]]>
It is anyone's guess why one has 0 seconds (?) left on termination.
[Edit 1 times, last edit by Sekerob at May 21, 2009 9:43:56 PM] |
||
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
I will be very curious as to how the validation is going to work if one computer in a quorum manages to run the job to the end and the other is cut short for exceeding the maximum run time. In the sample below, the top result was cut short, while the other 2 have a normal log without the 'exceeded' message, seemingly passing the > 60% completion test rule [read somewhere].
BETA_ CMD2_ 0001-GPDAA.clustersOccur-KIF3AA.clustersOccur_ 46_ 2-- 612 Pending Validation 21-5-09 17:38:13 21-5-09 21:13:29 1.00 6.8 / 0.0
BETA_ CMD2_ 0001-GPDAA.clustersOccur-KIF3AA.clustersOccur_ 46_ 1-- 612 Pending Validation 21-5-09 17:37:53 21-5-09 21:30:16 1.50 21.8 / 0.0
BETA_ CMD2_ 0001-GPDAA.clustersOccur-KIF3AA.clustersOccur_ 46_ 0-- 612 Pending Validation 21-5-09 17:37:44 21-5-09 21:08:18 1.37 23.4 / 0.0
[Edit 1 times, last edit by Sekerob at May 21, 2009 9:46:55 PM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Result Log
<core_client_version>6.2.28</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
No heartbeat from core client for 30 sec - exiting
Finishing early because max runtime has been exceeded.1284009679 called boinc_finish
</stderr_txt>
]]>
But the result is valid. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I came home from work yesterday and found all four quads packed with betas. I had a cache of five days set and had CEP selected so I could finish off the year I usually do for each project.
Little did I know that CEP is sending out very few units, and the ones it did send were not running because the betas were taking higher priority. Polished off 5 days of runtime in beta units just last night. Each quad computer has 40-50 betas in it, running approx. 1.5 hours each. The dual has about 6 betas in the queue at any one time. There are a lot of beta units floating around today. I have not had any of them error out or become inconclusive. A couple have been project aborted; I assume they were no longer needed by the server.
k.t.
[Edit 1 times, last edit by Former Member at May 22, 2009 1:47:49 PM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"I will be very curious as to how the validation is going to work if one computer in a quorum manages to run the job to the end and the other is cut short exceeding the maximum run time."
This is exactly the question I had in mind as well. We will always have different speeds in a quorum. In the easiest case this only results in different runtimes until the end of the WU. But it gets difficult when one runs to the end and another gets cut off. And even when all (in production = both?) get cut off, it will surely be at different numbers of positions calculated. Will all the slower machines cause chopping of WUs that faster machines could do alone? My slowest machine needs 3x more time than my fastest, and even my fastest is slow compared to what others have here.
One way to get rid of the validation problem would be to run with zero redundancy, but this does not necessarily prevent the chopping caused by very slow machines. It would be very interesting to know how this is solved. Perhaps knreed could provide more insight.
Greetings
Thorsten |
||
|
nasher
Veteran Cruncher USA Joined: Dec 2, 2005 Post Count: 1422 Status: Offline |
Yes, it will be interesting to see how these play out..
But personally I'm just happy to get beta work... I've got a total of 5 days built up and I want at least my bronze badge... sometime.. please.... |
||
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline |
We are simply validating the number of positions that are computed in common. The validated positions are then archived and child workunits are created in order to complete the workunit.
We are awarding credit based on the total number of positions computed. Thus two results in the same quorum could wind up with different credit awarded. |
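To make the scheme described above concrete, here is a minimal Python sketch of one way it could work. It is an illustration only, not the actual World Community Grid validator: the `Result` class, the `validate_quorum` function, the `credit_per_position` rate, and the assumption that every host works through positions in the same fixed order are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """One returned result in a quorum (hypothetical structure)."""
    host: str
    positions_done: int  # positions computed before finishing or being cut off


def validate_quorum(results, total_positions, credit_per_position=1.0):
    """Validate the positions computed in common and plan follow-up work.

    Assumption: every host works through positions 0, 1, 2, ... in the same
    order, so the positions in common are the first min(positions_done).
    """
    common = min(r.positions_done for r in results)
    # Credit follows each host's own total, not the validated overlap.
    credit = {r.host: r.positions_done * credit_per_position for r in results}
    # Positions not validated in common go into a child workunit,
    # expressed here as a half-open range (start, end).
    child = (common, total_positions) if common < total_positions else None
    return common, credit, child


quorum = [Result("fast", 1000), Result("slow", 400)]
common, credit, child = validate_quorum(quorum, total_positions=1000)
print(common)  # 400
print(credit)  # {'fast': 1000.0, 'slow': 400.0}
print(child)   # (400, 1000)
```

In this sketch the fast host finishes all 1000 positions while the slow one is cut off at 400, so only the first 400 are validated, a child workunit covers the remainder, and the two results in the same quorum receive different credit.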
||
|
|