| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 23
|
|
| Author |
|
|
mikey
Veteran Cruncher Joined: May 10, 2009 Post Count: 824 Status: Offline Project Badges:
|
Had several with very long TTC times, then two-thirds of the way in the remaining time started declining rapidly. At any rate, the project chart hints that the worst of the step up is over. Now it's at 7.91 hours mean, up from 5.4 some 4 days ago. With CW and CMD generating larger numbers of shorties to offset, the WCG mean has barely moved (light blue line at bottom): http://bit.ly/WCGALL . I think the techs still consider it too low in prepping for when those GPU race machines start entering the arena, sometime, anytime... the scheduler could be doing overtime. :D --//--
Didn't they say over in that thread, though, that they can handle any server load increase if/when the GPU units come online? Increasing the other units' run times hardly seems fair or equitable. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
It's an intermediate step, I suspect, since the WIP server upgrade apparently brings additional processes with it too [source knreed]. Better safe than sorry, rather than have the whole thing fall over or have to suspend/slow down projects. Once that storm blows over, as in the past, the techs have sized down again if they could... maybe CW T6 is double runtime again to relieve things, and by the time GPU is there the shorter average running HCMD2 is likely gone. Most like moderate-length, fairly even, short check-pointing tasks. The 24/7 runners care to a lesser extent... I don't so much either, as long as I can control the Write to Disk frequency.
And, if there is a hint, maybe we [me] see things flying, and maybe we will see a decaying trend similar to what that light blue line shows on the WCGFAM chart between Nov. 23 and Dec. 13. Anyway, the techs do a 10-12 ball juggling act... we can't help them with that and have to see what time brings. --//-- |
||
|
|
Dark Angel
Veteran Cruncher Australia Joined: Nov 11, 2005 Post Count: 728 Status: Offline Project Badges:
|
They work, they validate, I'm getting fair credit for them, I'm happy. :) That's as good as it gets.
----------------------------------------
Currently being moderated under false pretences |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
There's the anticipated tail-off. This morning's mean run times are insignificantly over those of Sunday. Here's the 6-day list of run times in hours:
14th 5,39594
15th 5,71376
16th 6,73718
17th 7,50167
18th 7,97670
19th 8,07213 (prelim)
Let's see if the slow decline will now happen... a pattern ensues. --//--
P.S. In my world, decimals are depicted by a comma. |
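The tail-off in the figures above can be checked with a few lines of Python. This is only a sketch for readers who want to redo the arithmetic: the names `posted`, `parse_hours` and `daily_changes` are made up for illustration, and the only data used are the six mean run times quoted in the post (comma decimals are converted to dots before parsing).

```python
# Mean run times in hours, exactly as posted (comma as decimal separator).
posted = {
    "14th": "5,39594",
    "15th": "5,71376",
    "16th": "6,73718",
    "17th": "7,50167",
    "18th": "7,97670",
    "19th": "8,07213",  # preliminary figure
}


def parse_hours(text):
    """Convert a comma-decimal string such as '5,39594' to a float."""
    return float(text.replace(",", "."))


def daily_changes(values):
    """Return the change between each consecutive pair of values."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


hours = [parse_hours(v) for v in posted.values()]
changes = daily_changes(hours)

# Print the change for each day relative to the day before.
for day, change in zip(list(posted)[1:], changes):
    print(f"{day}: {change:+.5f} h")
```

The largest single step is from the 15th to the 16th; each later increase is smaller than the one before it, which is the flattening described in the post.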
||
|
|
E. Frijters
Senior Cruncher The Netherlands Joined: Apr 26, 2007 Post Count: 228 Status: Offline Project Badges:
|
I now have one malaria WU that has been running for 48 hours straight and needs 104 (!!!) hours more...
One leishmaniasis WU has been running for 25 hours and needs 47 hours more... the rest of the WUs run normal schedules... Both WUs do not generate errors, they just seem to be larger... I will investigate later whether these WUs are physically larger on disk as well...
[update]: two processes were running slower than all the others. After a reboot everything is fine again.
----------------------------------------
Former grid.org slave
[Edit 1 times, last edit by E. Frijters at Dec 28, 2011 1:13:07 PM] |
||
|
|
KerSamson
Master Cruncher Switzerland Joined: Jan 29, 2007 Post Count: 1684 Status: Offline Project Badges:
|
Until now, I have not experienced any duration troubles with GFAM WUs.
I currently have around 18 cores crunching for GFAM without any significant problems (only an invalid WU from time to time). To me, the reported durations look strange, except if the hosts involved have very poor performance (e.g. PII or PIII or some old Athlon).
Cheers, Yves
---
PS: I noticed for GFAM that the period between checkpoints could be around 8 minutes long, even if the checkpoint setting is "every minute". |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello KerSamson,
PS: I noticed for GFAM that the period between checkpoints could be around 8 minutes long, even if the checkpoint setting is "every minute".
Checkpoints occur whenever the algorithm reaches a checkpoint subroutine. The checkpoint setting stops the program from actually writing a checkpoint if the set time has not passed - that is, your setting will not allow checkpoints to write twice a minute.
Lawrence |
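The behaviour described in this post can be illustrated with a small Python model. It is a simplified sketch, not the actual client or science application code: the function name `checkpoint_write_times` is hypothetical, times are in seconds, and task start is assumed to count as the reference point for the first interval.

```python
def checkpoint_write_times(opportunities, min_interval):
    """Return the times (seconds) at which a checkpoint is actually written.

    opportunities: sorted times at which the algorithm reaches a point
                   where it is able to checkpoint.
    min_interval:  the user's "write to disk at most every N seconds" value.
    """
    written = []
    last_write = 0  # assumption: the task start counts as the reference point
    for t in opportunities:
        # A write can only happen at an opportunity, and only if the
        # configured minimum interval has passed since the previous write.
        if t - last_write >= min_interval:
            written.append(t)
            last_write = t
    return written


# The algorithm can only checkpoint every 8 minutes; the setting is 1 minute.
slow_task = checkpoint_write_times([480, 960, 1440], 60)

# The algorithm can checkpoint every 20 seconds; the setting is 1 minute.
fast_task = checkpoint_write_times(list(range(20, 200, 20)), 60)

print(slow_task)
print(fast_task)
```

In this model the setting only ever reduces the number of writes. It cannot create a checkpoint between two opportunities, so a task that reaches a checkpoint subroutine every 8 minutes writes every 8 minutes regardless of a 1-minute setting.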
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Just a comment here...
...checkpoints could be around 8 minutes long, even if the checkpoint setting is "every minute"
• KerSamson [Dec 28, 2011 9:58:52 AM] post.
...that is, your setting will not allow checkpoints to write twice a minute.
• lawrencehardin [Dec 29, 2011 2:03:22 AM] post
I understood that KerSamson was expressing a concern on the 'long side' of checkpointing rather than on the 'short side' (lawrencehardin's response): too slow rather than too fast.
; |
||
|
|
KWSN - A Shrubbery
Master Cruncher Joined: Jan 8, 2006 Post Count: 1585 Status: Offline |
Just a comment here...
...checkpoints could be around 8 minutes long, even if the checkpoint setting is "every minute"
• KerSamson [Dec 28, 2011 9:58:52 AM] post.
...that is, your setting will not allow checkpoints to write twice a minute.
• lawrencehardin [Dec 29, 2011 2:03:22 AM] post
I understood that KerSamson was expressing a concern on the 'long side' of checkpointing rather than on the 'short side' (lawrencehardin's response): too slow rather than too fast.
;
Actually, Lawrence did respond to the question properly; it's in this statement: "Checkpoints occur whenever the algorithm reaches a checkpoint subroutine." If the algorithm has not reached a natural checkpoint, it will not write one until it does. In this case, "around 8 minutes long".
Distributed computing volunteer since September 27, 2000 |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The essence of the query by KerSamson can be paraphrased as: why is KerSamson's case getting 8 minutes between checkpoints despite the checkpoint setting being set to every minute?
The expectation is: 1 checkpoint every 1 minute.
The reality is: 1 checkpoint every 8 minutes.
The KerSamson query is: why 8 minutes for a setting of 1 minute?
To respond by indicating that "If the algorithm has not reached a natural checkpoint, it will not write one until it does. In this case "around 8 minutes long"" does not answer the query of KerSamson. But it does answer a question not asked: how does checkpointing work?
; |
||
|
|
|