Thread Status: Active
Total posts in this thread: 23
Posts: 23   Pages: 3   [ Previous Page | 1 2 3 | Next Page ]
This topic has been viewed 8212 times and has 22 replies
mikey
Veteran Cruncher
Joined: May 10, 2009
Post Count: 824
Status: Offline
Re: Dramatic runtime increase.

Had several with very long TTC, then about two-thirds of the way in the remaining time started declining rapidly. At any rate, the project chart hints that the worst of the step-up is behind us. The mean is now at 7.91 hours, up from 5.4 some 4 days ago. With CW and CMD generating larger numbers of shorties to offset that, the WCG mean has barely moved (light blue line at bottom): http://bit.ly/WCGALL . I think the techs still consider it too low in preparing for when those GPU race machines start entering the arena, any time now... the scheduler could be doing overtime. :D

--//--


Didn't they say over in that thread, though, that they can handle any server load increase if/when the GPU units come online? Increasing the other units' run times hardly seems fair or equitable.
----------------------------------------


[Dec 18, 2011 4:08:29 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Dramatic runtime increase.

It's an intermediate step, I suspect, for the WIP server upgrade apparently brings additional processes with it too [source: knreed]. Better safe than sorry, rather than have the whole thing fall over or have to suspend/slow down projects. Once that storm blows over, as in the past, the techs have sized down again if they could... maybe CW T6 is double runtime again to relieve things, and by the time GPU is there the shorter-on-average HCMD2 is likely gone. Most people like moderate-length, fairly even tasks with short checkpoint intervals. The 24/7 runners care to a lesser extent... I don't so much either, as long as I can control the Write to Disk frequency.

And, if there is a hint, maybe we [me] will see things flying, and we will see a decaying trend similar to what the light blue line shows on the WCGFAM chart between Nov. 23 and Dec. 13.

Anyway, the techs do a 10-12 ball juggling act... we can't help them with that and have to see what time brings.

--//--
[Dec 18, 2011 6:30:13 PM]
Dark Angel
Veteran Cruncher
Australia
Joined: Nov 11, 2005
Post Count: 728
Status: Offline
Re: Dramatic runtime increase.

They work, they validate, I'm getting fair credit for them, I'm happy. :) That's as good as it gets.
----------------------------------------

Currently being moderated under false pretences
[Dec 19, 2011 8:13:44 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Dramatic runtime increase.

There's the anticipated tail-off. This morning's mean run times are only insignificantly above those of Sunday. Here's the 6-day list of runtimes in hours:

14th 5,39594
15th 5,71376
16th 6,73718
17th 7,50167
18th 7,97670
19th 8,07213 (prelim)

Let's see if the slow decline will happen now... a pattern ensues.
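
For anyone who wants to check the tail-off against the list above, here is a minimal sketch that parses the decimal-comma values and prints the day-over-day change in the mean. The helper names (`parse_decimal_comma`, `day_over_day`) are made up for illustration; only the numbers come from the post.

```python
# Mean runtimes in hours exactly as posted (decimal commas).
# The 19th is a preliminary figure.
posted = {
    "14th": "5,39594",
    "15th": "5,71376",
    "16th": "6,73718",
    "17th": "7,50167",
    "18th": "7,97670",
    "19th": "8,07213",
}


def parse_decimal_comma(text: str) -> float:
    """Convert a decimal-comma string such as '5,39594' to a float."""
    return float(text.replace(",", "."))


def day_over_day(values):
    """Return the change from each value to the next one."""
    return [round(later - earlier, 5) for earlier, later in zip(values, values[1:])]


hours = [parse_decimal_comma(v) for v in posted.values()]
deltas = day_over_day(hours)

# Print the increase for each day relative to the day before.
for day, delta in zip(list(posted)[1:], deltas):
    print(f"{day}: {delta:+.5f} h")
```

The daily increase peaks at about 1.02 hours on the 16th and then shrinks to roughly 0.76, 0.48 and 0.10 hours, which is the flattening described above.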

--//--

P.S. In my world, decimals are depicted by a comma.
[Dec 19, 2011 5:54:41 PM]
E. Frijters
Senior Cruncher
The Netherlands
Joined: Apr 26, 2007
Post Count: 228
Status: Offline
Re: Dramatic runtime increase.

I now have one malaria wu that has been running for 48 hours straight and needs 104 (!!!) hours more...

One leishmaniasis wu has been running for 25 hours and needs 47 hours more... the rest of the wu's run normal schedules...

Both wu's do not generate errors, they just seem to be larger... I will investigate later if these wu's are physically larger on disc as well...

[update]: two processes were running slower than all the others. After a reboot everything is fine again.
----------------------------------------
Former grid.org slave


----------------------------------------
[Edit 1 times, last edit by E. Frijters at Dec 28, 2011 1:13:07 PM]
[Dec 28, 2011 9:08:11 AM]
KerSamson
Master Cruncher
Switzerland
Joined: Jan 29, 2007
Post Count: 1684
Status: Offline
Re: Dramatic runtime increase.

So far I have not experienced any duration trouble with GFAM WUs.
At this time I have around 18 cores crunching for GFAM without any significant problems (only an invalid WU from time to time).
To me, the reported durations look strange, unless the hosts involved have very poor performance (e.g. PII or PIII or some old Athlon).
Cheers,
Yves
---
PS: I noticed for GFAM that the period between checkpoints could be around 8 minutes long, even if the checkpoint setting is "every minute".
----------------------------------------
[Dec 28, 2011 9:58:52 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Dramatic runtime increase.

Hello KerSamson,
PS: I noticed for GFAM that the period between checkpoints could be around 8 minutes long, even if the checkpoint setting is "every minute".

Checkpoints occur whenever the algorithm reaches a checkpoint subroutine. The checkpoint setting stops the program from actually writing a checkpoint if the set time has not passed - that is, your setting will not allow checkpoints to write twice a minute.
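
To make that concrete, here is a small simulation of the rule as described above. It is only a sketch: the function name, the evenly spaced checkpoint opportunities, and the use of task start as the reference time are assumptions for illustration, not taken from the actual client or science application.

```python
def checkpoint_write_times(opportunity_spacing_s, min_interval_s, run_length_s):
    """Return the times (in seconds) at which a checkpoint is actually written.

    opportunity_spacing_s: how often the algorithm reaches a checkpoint
        subroutine (assumed evenly spaced here).
    min_interval_s: the user's checkpoint setting, treated as a minimum
        time between writes.
    """
    writes = []
    last_write = 0  # assumption: the clock starts at task start
    t = opportunity_spacing_s
    while t <= run_length_s:
        # A write happens only at an opportunity, and only if the
        # configured minimum time has passed since the last write.
        if t - last_write >= min_interval_s:
            writes.append(t)
            last_write = t
        t += opportunity_spacing_s
    return writes


# Opportunities every 8 minutes, setting "every minute", one hour of runtime.
sparse = checkpoint_write_times(480, 60, 3600)
# Opportunities every 20 seconds, same setting.
dense = checkpoint_write_times(20, 60, 3600)

print(sparse[:3])  # writes follow the 8-minute opportunities
print(dense[:3])   # writes are throttled to once a minute
```

In the first case the setting never comes into play, because the application only offers a checkpoint every 8 minutes. In the second case the setting is what limits the writes.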

Lawrence
[Dec 29, 2011 2:03:22 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Dramatic runtime increase.

Just a comment here...
...checkpoints could be around 8 minutes long, even if the checkpoint setting is "every minute"
•KerSamson [Dec 28, 2011 9:58:52 AM] post.

...that is, your setting will not allow checkpoints to write twice a minute.
•lawrencehardin [Dec 29, 2011 2:03:22 AM] post
I understood that KerSamson was expressing a concern on the 'long side' of checkpointing rather than on the 'short side' (lawrencehardin's response): too slow rather than too fast.
[Dec 29, 2011 2:12:01 PM]
KWSN - A Shrubbery
Master Cruncher
Joined: Jan 8, 2006
Post Count: 1585
Status: Offline
Re: Dramatic runtime increase.

Just a comment here...
...checkpoints could be around 8 minutes long, even if the checkpoint setting is "every minute"
•KerSamson [Dec 28, 2011 9:58:52 AM] post.

...that is, your setting will not allow checkpoints to write twice a minute.
•lawrencehardin [Dec 29, 2011 2:03:22 AM] post
I understood that KerSamson was expressing a concern on the 'long side' of checkpointing rather than on the 'short side' (lawrencehardin's response): too slow rather than too fast.


Actually, Lawrence did respond to the question properly; it's in this statement: "Checkpoints occur whenever the algorithm reaches a checkpoint subroutine."

If the algorithm has not reached a natural checkpoint, it will not write one until it does. In this case "around 8 minutes long".
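
Put as a rough formula: if the algorithm reaches a checkpoint subroutine every $\Delta$ and the setting is a minimum time $T_{\text{set}}$ between writes, the gap between written checkpoints works out as below. Both symbols and the assumption of evenly spaced checkpoint subroutines are mine, purely for illustration.

```latex
\[
  T_{\text{write}} = \left\lceil \frac{T_{\text{set}}}{\Delta} \right\rceil \Delta
\]
% Sparse opportunities, as in the GFAM case discussed here:
\[
  \Delta = 8\ \text{min},\quad T_{\text{set}} = 1\ \text{min}
  \;\Rightarrow\;
  T_{\text{write}} = \lceil 1/8 \rceil \cdot 8\ \text{min} = 8\ \text{min}
\]
% Frequent opportunities, where the setting is the limiting factor:
\[
  \Delta = 20\ \text{s},\quad T_{\text{set}} = 60\ \text{s}
  \;\Rightarrow\;
  T_{\text{write}} = \lceil 60/20 \rceil \cdot 20\ \text{s} = 60\ \text{s}
\]
```

So the setting can only stretch the gap between writes; it cannot make it shorter than the spacing of the natural checkpoints.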
----------------------------------------

Distributed computing volunteer since September 27, 2000
[Dec 29, 2011 5:56:32 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Dramatic runtime increase.

KerSamson's query can be paraphrased as: why are there 8 minutes between checkpoints when the checkpoint setting is every minute?

The expectation is: one checkpoint every minute.
The reality is: one checkpoint every 8 minutes.
The KerSamson query is: why 8 minutes for a setting of 1 minute?

To respond by indicating that ...
"If the algorithm has not reached a natural checkpoint, it will not write one until it does. In this case "around 8 minutes long"."
... does not answer the query of KerSamson.

But it does answer a question not asked: How does checkpointing work?
[Dec 29, 2011 7:14:50 PM]