World Community Grid Forums
Thread Status: Active Total posts in this thread: 14598
Author |
|
genhos
Veteran Cruncher UK Joined: Apr 26, 2009 Post Count: 1103 Status: Offline |
Watch the BETA units, guys.
I've just had to suspend 2 that went for nearly 2 hours with no progress in %. Time left decreased to 0 but the unit just kept on crunching. The stderr.txt file had only a single line in it: "Unable to open checkpoint file starting from 0". |
||
|
jonnieb-uk
Ace Cruncher England Joined: Nov 30, 2011 Post Count: 6105 Status: Offline |
We will be sending out about 4500 work units with a quorum of 2. These will have similar runtimes to the last beta test.
ugm1_00010 - 30 minutes
ugm1_00011 - 4 hours
ugm1_00012 - 1.5 hours

I picked up 4 x 00011. All crunching happily but no checkpointing! Although the guide says 4 hrs completion, I estimate 4h30m; 2h55m; 1h48m; 1h12m. All ~93.5% efficiency on a "used" machine.
Other than no checkpoints, the BETA thread seems to indicate that the problems are in the 00012 WUs.
One comment, "After 4h CPU time, mine are showing >99.7% progress but those figures are changing very slowly", seems to be happening on my WUs. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Mortnin..............
|
||
|
Mamajuanauk
Master Cruncher United Kingdom Joined: Dec 15, 2012 Post Count: 1900 Status: Offline |
Morning All...
----------------------------------------
Mamajuanauk is the Name! Crunching is the Game!
|
||
|
jonnieb-uk
Ace Cruncher England Joined: Nov 30, 2011 Post Count: 6105 Status: Offline |
Congratulations to the following UK team members on achieving a Personal Milestone in yesterday's crunching:
Gilarm 51 days RunTime for the UK team
Mamajuanauk moves above 0,600 in the RunTime Rankings to #0,600
Mamajuanauk moves into the UK top 8 RunTime Rankings at #7
Edstar passes 200,000 Points for the UK team
Labinopper moves above 15,000 in the Points Rankings to #14,882
Darkmatter.NI 6,000 Results
Mamajuanauk passes 90,000 Results for the UK team
Gilarm passes 200 Results for the UK team

Congratulations also to the following UK team members on setting new PBs:
Gilarm Runtime Points 23,581 Results

UK team Comparison of Daily RunTime, Points, Results (columns: Hours, Points, Results)
Average Daily Crunching Comparison (columns: RunTime, Points, Results)
Milestone Targets for the UK team (columns: Target, Current, To Do, 7day Avg., Estimate)
No. Of Members Active Yesterday (columns: Day, out of, Tot.) |
||
|
jonnieb-uk
Ace Cruncher England Joined: Nov 30, 2011 Post Count: 6105 Status: Offline |
Daily Global5000
RT(days) # Points # Results #
The Daily Global5000 accounted for 72.9% of yesterday's RunTime of 405.7 years.
The UK Team accounted for 0.47%.
----------------------------------------
[Edit 1 times, last edit by jonnieb-uk at Sep 19, 2014 9:04:52 AM] |
||
|
jonnieb-uk
Ace Cruncher England Joined: Nov 30, 2011 Post Count: 6105 Status: Offline |
The UK Team has met its Waterloo (or at least the University of) and advanced to #54 in the RunTime Rankings with the prospect of gaining another place early next week:
[Targets] |
||
|
jonnieb-uk
Ace Cruncher England Joined: Nov 30, 2011 Post Count: 6105 Status: Offline |
We will be sending out about 4500 work units with a quorum of 2. These will have similar runtimes to the last beta test.
ugm1_00010 - 30 minutes
ugm1_00011 - 4 hours
ugm1_00012 - 1.5 hours

I picked up 4 x 00011. All crunching happily but no checkpointing! Although the guide says 4 hrs completion, I estimate 4h30m; 2h55m; 1h48m; 1h12m. All ~93.5% efficiency on a "used" machine.
Other than no checkpoints, the BETA thread seems to indicate that the problems are in the 00012 WUs.
One comment, "After 4h CPU time, mine are showing >99.7% progress but those figures are changing very slowly", seems to be happening on my WUs.

This has to be one of the worst BETA tests in recent times!
The thread has a few isolated reports of checkpointing and WUs completing, but the vast majority report WUs reaching >99% but not finishing, like trying to compute the umpteenth fraction to find the perfect pi. Significant numbers of WUs are being suspended or aborted! No official comment as yet.
All 4 of my BETAs reached >99% overnight before I suspended them. I note that none of my wingmen have returned completed work, and suggest everyone should suspend BETA work until official guidance is issued. |
||
|
Thargor
Veteran Cruncher UK Joined: Feb 3, 2012 Post Count: 1291 Status: Offline |
Watch the BETA units guys. I've just had to suspend 2 that went for nearly 2 hours with no progress in %. Time left decreased down to 0 but the unit just kept on crunching. The stderr.txt file had only a single line in it "Unable to open checkpoint file starting from 0".

Yep, I have 7 of them running on my 24-thread server, will see what happens if/when they get down to 0:00:00 remaining time. |
||
|
jonnieb-uk
Ace Cruncher England Joined: Nov 30, 2011 Post Count: 6105 Status: Offline |
Words of wisdom from Keith Uplinger
----------------------------------------
Re: New BETA test - Sept 18, 2014 [ Issues Thread ]

Sorry for the long delay on a response, but we have figured out the root cause for the work units hanging. It is a work unit build problem with some of the input files being improperly formed. These input files were manually changed outside of the build script to change a special character. When more than one special character was encountered in the manual update of them, it changed the length of the line that was expected. Thus it has caused the application to appear stalled. It was technically still working, just on data that was a lot longer (1000000x) than normal.

We are going to set all the work units currently out there to report as being completed (server_abort). I have disabled the assimilator and validator for the time being. This will allow for the results to stay in the database longer than normal. I will be reviewing the data that members have returned on Monday for these batches and grant credit if someone hits the resource limit (cpu timeout). I will also see about those that manually aborted them, to see if some partial credit for time spent can be given.

We changed the build script so that manual intervention on removing the special character is not needed. After we clean up from this current beta, we will be sending out proper work units, no time table on that yet.

Thanks,
-Uplinger

If I understand correctly: take no action (or suspend); a server abort is imminent! |
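For anyone wondering how a hand edit to an input file can make a work unit look stalled, here is a rough sketch of that kind of failure. Everything in it is an assumption for illustration only: the fixed-width layout, the `#` special character, the field widths and the function names are all made up, and the real UGM input format isn't described in Uplinger's post. The idea is just that a fix which re-pads the line correctly for one special character leaves the line one column short when there are two, so a size field read from fixed columns picks up an extra digit.

```python
# Hypothetical fixed-width record: 12-column label, 8-column count, then a tail.
# None of these names or widths come from the real application.
LABEL_WIDTH = 12
COUNT_WIDTH = 8
SPECIAL = "#"  # stand-in for the "special character" in the post


def build_line(label: str, count: int, tail: str) -> str:
    """Lay out one record in fixed columns."""
    return f"{label:<{LABEL_WIDTH}}{count:>{COUNT_WIDTH}}{tail}"


def parse_count(line: str) -> int:
    """Read the count from its expected fixed column range."""
    return int(line[LABEL_WIDTH:LABEL_WIDTH + COUNT_WIDTH])


def manual_fix(line: str) -> str:
    """Strip special characters from the label, adding back ONE space.

    This keeps the columns aligned only if exactly one special
    character was present in the label.
    """
    label_field, rest = line[:LABEL_WIDTH], line[LABEL_WIDTH:]
    return label_field.replace(SPECIAL, "") + " " + rest


one_special = manual_fix(build_line("mol#A", 42, "000000"))
two_special = manual_fix(build_line("mol#A#B", 42, "000000"))

print(parse_count(one_special))  # columns still line up
print(parse_count(two_special))  # line is one column short, count picks up a tail digit
```

In this toy layout each extra unpadded character shifts the line by one more column, so the count read by the parser grows by roughly a factor of ten each time. That is one plausible way an application could end up "still working, just on data that was a lot longer than normal".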
||
|
|