World Community Grid Forums
Thread Status: Active Total posts in this thread: 19
Mumak
Senior Cruncher Joined: Dec 7, 2012 Post Count: 477 Status: Offline |
I have a few WUs which, after 15 hours, reported "Maximum elapsed time exceeded" and then crashed:
OET1_0000333_xMBGP-OM_rig_9319_0--
OET1_0000333_xMBGP-OM_rig_10958_0--
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
That means your machine took 40 times longer to complete the task than was originally estimated, and yes, many of the 333 batches are known to run long. Since the task failed after 15 hours, the original estimate must have been no more than 22.5 minutes. The project average over the last few days has been well over one hour, so it's hard to tell how this could have happened.
Does the device have a variable CPU speed? If the task was issued when the client benchmark was, for instance, 15000 but it then really ran at a benchmark speed of 7500, this could occur sooner. At any rate, the enormous variability may be a reason for the technicians to raise the <rsc_fpops_bound> to a factor of 50.
[Edit 1 times, last edit by Former Member at Feb 28, 2015 3:50:03 PM] |
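For anyone who wants to check the arithmetic in this post, here is a minimal sketch. It assumes, as described above, that the elapsed-time limit is simply the original runtime estimate multiplied by a bound factor (40 today, 50 as suggested). The function names are made up for illustration and are not part of the BOINC client.

```python
def implied_estimate_minutes(limit_hours: float, bound_factor: float = 40.0) -> float:
    """Runtime estimate (minutes) implied by the elapsed time at which a task was aborted.

    Assumption: limit = estimate * bound_factor.
    """
    return limit_hours * 60.0 / bound_factor


def max_elapsed_hours(estimate_minutes: float, bound_factor: float = 40.0) -> float:
    """Elapsed-time limit (hours) for a given runtime estimate, under the same assumption."""
    return estimate_minutes * bound_factor / 60.0


# A task aborted after 15 hours with a factor of 40 implies this estimate:
print(implied_estimate_minutes(15))   # 22.5
# The same 22.5-minute estimate with a factor of 50 would allow:
print(max_elapsed_hours(22.5, 50))    # 18.75
```

Under this simplified model, raising the factor from 40 to 50 would give such a task 18.75 hours instead of 15 before it is aborted.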
||
|
Mumak
Senior Cruncher Joined: Dec 7, 2012 Post Count: 477 Status: Offline |
That machine has been running stably for a few years and it's not a slow one (Core i5-750). It received several 333 and 490 units with an estimated runtime of ~24 minutes.
One issue is why the estimated runtime is so low; the other is that I believe it should not raise an exception. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The explanation is in my previous post... the exceptions are covered by the upper bound factor of 40. Other projects use a factor of 5 to 10. The unit must have been originally assigned right when a larger set of short tasks was pulling the average run time for new work that low. This does not show in the daily averages, of which the lowest was 0.69 hours on Feb. 23, hence the suggestion made to the technicians.
Of course the client benchmark could be heavily optimistic. What are the Whetstone/Dhrystone values, and was it Linux or Windows? If a device claims to be a Ferrari but really runs like a VW, the allowed maximum runtime starts to play a role. BTW, I've seen tasks that exceeded the maximum being credited; I just don't know if this is policy. |
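To put a rough number on the Ferrari/VW point, here is a small sketch of a simplified model, not the client's actual code. It assumes the limit is computed from the speed the client claims, while the task really progresses at a lower speed. The function name and the speed figures (the 15000 and 7500 mentioned earlier in the thread) are only illustrative.

```python
def tolerated_work_overrun(bound_factor: float, claimed_speed: float, real_speed: float) -> float:
    """How many times more work than estimated a task can need before hitting the limit.

    Simplified model (an assumption, not taken from the BOINC source):
      limit   = bound_factor * estimated_work / claimed_speed
      elapsed = actual_work / real_speed
    Setting elapsed equal to limit and solving for actual_work / estimated_work
    gives the ratio returned here.
    """
    return bound_factor * real_speed / claimed_speed


# Benchmark claims 15000 but the device really runs at 7500:
print(tolerated_work_overrun(40, 15000, 7500))   # 20.0
# Accurate benchmark: the full factor of 40 is available.
print(tolerated_work_overrun(40, 15000, 15000))  # 40.0
```

In this model a device running at half its benchmarked speed only tolerates a task that needs 20 times the estimated work, rather than 40 times.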
||
|
Mumak
Senior Cruncher Joined: Dec 7, 2012 Post Count: 477 Status: Offline |
It's a WinXP machine. I don't have results of the past benchmark, but a recent re-run gives:
2923 floating point MIPS (Whetstone) per CPU
7117 integer MIPS (Dhrystone) per CPU |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Of course, the other question, seeing that it's WinXP: which client version?
|
||
|
Mumak
Senior Cruncher Joined: Dec 7, 2012 Post Count: 477 Status: Offline |
7.2.42
||
|
seippel
Former World Community Grid Tech Joined: Apr 16, 2009 Post Count: 392 Status: Offline |
The OET1_*-OM batches run quite a bit longer than other OET1 batches, which causes problems for the estimates. A few weeks ago we decided to hold the OET1_*-OM batches until the other batches have run (and OET1_*-OM batches won't be run on Android). A few of the OET1_*-OM batches had already been sent out, though. The vast majority of those completed without problems, but a few hit the 'maximum elapsed time exceeded' error. Those will just be re-run once we've completed the other batches and are only running OET1_*-OM batches.
Seippel |
||
|
KLiK
Master Cruncher Croatia Joined: Nov 13, 2006 Post Count: 3108 Status: Offline |
Thnx for the heads up! ;)
||
|
Eric_Kaiser
Veteran Cruncher Germany (Hessen) Joined: May 7, 2013 Post Count: 1047 Status: Offline |
(and OET1_*-OM batches won't be run on android)
Does this mean that Android devices will be kicked out of this project in the near future? |
||