Thread Status: Active
Total posts in this thread: 79
This topic has been viewed 9091 times and has 78 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Server maintenance??

MStenholm,

You've not given us information to opine on why your "most productive PC" is limited to 600 tasks "in progress". What we do know is that on the GPU side, the limit is about 4000 ** that a device can have at any one time and on the CPU side [non-GPU tasks] the limit is about 80 per processor [allowed to be used for BOINC]. We also know that if a device produces invalids, the maximum number permitted is cut down on an exponential curve, but at 150 an hour the "maximum allowed" is quickly restored.

Just to test the CPU side, I upped the work buffer on my Octo to 6.75 and then to 7.75 days, which of course invokes a panic state [Earliest Deadline First], and watched the client backfill with only C4CW [3-hour jobs]. It now has over 420 tasks, worth 59 CPU days on the device. This demonstrates that at the very least my client is allowed 52 CPU tasks per processor thread. (Techs tune these numbers at times based on general production failure data.)

edit: ** For a device that signals the servers it's on and computing practically 24/7. The on_frac/active_frac values in the client_state.xml file of a host indicate if it is measured to do that.
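
For anyone who wants to check those values on their own host, here is a minimal sketch. It assumes the file parses as well-formed XML and that the values sit in elements literally named on_frac and active_frac; the sample excerpt, its nesting, and the 0.95 threshold are hypothetical illustrations, not the servers' actual criterion.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt: a real client_state.xml is far larger, and the
# nesting shown here is an assumption made for illustration.
SAMPLE = """<client_state>
  <time_stats>
    <on_frac>0.998</on_frac>
    <active_frac>0.995</active_frac>
  </time_stats>
</client_state>"""

NAMES = ("on_frac", "active_frac")


def availability(xml_text, names=NAMES):
    """Return {name: float} for the first element found with each name."""
    root = ET.fromstring(xml_text)
    found = {}
    for name in names:
        elem = next(root.iter(name), None)  # search at any depth
        if elem is not None and elem.text:
            found[name] = float(elem.text)
    return found


def looks_always_on(fracs, threshold=0.95):
    """Hypothetical rule of thumb: both fractions at or above the threshold."""
    return all(fracs.get(name, 0.0) >= threshold for name in NAMES)


if __name__ == "__main__":
    fracs = availability(SAMPLE)
    print(fracs, looks_always_on(fracs))
```

To run it against a real host, read the file's text and pass it to availability() in place of SAMPLE.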
----------------------------------------
[Edit 1 times, last edit by Former Member at Jan 12, 2013 9:10:46 AM]
[Jan 12, 2013 9:04:27 AM]
MStenholm
Advanced Cruncher
Denmark
Joined: Jan 7, 2010
Post Count: 97
Status: Offline
Re: Server maintenance??

MStenholm,

You've not given us information to opine on why your "most productive PC" is limited to 600 tasks "in progress". What we do know is that on the GPU side, the limit is about 4000 ** that a device can have at any one time and on the CPU side [non-GPU tasks] the limit is about 80 per processor [allowed to be used for BOINC]. We also know that if a device produces invalids, the maximum number permitted is cut down on an exponential curve, but at 150 an hour the "maximum allowed" is quickly restored.

Just to test the CPU side, I upped the work buffer on my Octo to 6.75 and then to 7.75 days, which of course invokes a panic state [Earliest Deadline First], and watched the client backfill with only C4CW [3-hour jobs]. It now has over 420 tasks, worth 59 CPU days on the device. This demonstrates that at the very least my client is allowed 52 CPU tasks per processor thread. (Techs tune these numbers at times based on general production failure data.)

edit: ** For a device that signals the servers it's on and computing practically 24/7. The on_frac/active_frac values in the client_state.xml file of a host indicate if it is measured to do that.


I run three rigs that do only GPU work. Two are using version 7.0.42, with the minimum work buffer and max additional buffer set to 2.5 days on one rig and 4 days on the other. The 2.5 has work for around 28 hours (around 1500 WUs) and the 4 has work for 9 hours (600 WUs). The last one, the critical one, uses version 7.0.28 with 3.2 and 3.2 hours as the settings. This one also has 600 WUs in the buffer and as I wrote that is only enough for 4 hours.

Edit: forgot to mention that they are all 100% dedicated and are always on.
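
The arithmetic behind those figures can be sketched as follows. The function names are hypothetical, and the only inputs are the WU counts and hours quoted in this post.

```python
def implied_rate(cached_tasks, hours):
    """Tasks per hour implied by a cache of `cached_tasks` lasting `hours`."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return cached_tasks / hours


def hours_of_work(cached_tasks, tasks_per_hour):
    """How long a cache lasts at a given throughput."""
    if tasks_per_hour <= 0:
        raise ValueError("tasks_per_hour must be positive")
    return cached_tasks / tasks_per_hour


if __name__ == "__main__":
    # (label, WUs in buffer, hours of work) as quoted in the post
    rigs = [
        ("2.5-day rig", 1500, 28),
        ("4-day rig", 600, 9),
        ("7.0.28 rig", 600, 4),
    ]
    for label, wus, hours in rigs:
        print(f"{label}: {implied_rate(wus, hours):.1f} WUs/hour")
```

On these numbers the 7.0.28 rig works out to 150 WUs per hour, so a 600-WU cap covers only 4 hours regardless of the buffer setting.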
----------------------------------------
[Edit 1 times, last edit by MStenholm at Jan 12, 2013 11:33:56 AM]
[Jan 12, 2013 11:31:47 AM]
deltavee
Ace Cruncher
Texas Hill Country
Joined: Nov 17, 2004
Post Count: 4894
Status: Offline
Re: Server maintenance??

What we do know is that on the GPU side, the limit is about 4000 ** that a device can have at any one time and on the CPU side [non-GPU tasks] the limit is about 80 per processor [allowed to be used for BOINC]. We also know that if a device produces invalids, the maximum number permitted is cut down on an exponential curve, but at 150 an hour the "maximum allowed" is quickly restored.

edit: ** For a device that signals the servers it's on and computing practically 24/7. The on_frac/active_frac values in the client_state.xml file of a host indicate if it is measured to do that.


I hate to differ with you, but I have never been able to get a cache greater than 600 on any of my GPU-only crunchers. They all sit at around 596 consistently. Has anyone received more than 600?
[Jan 12, 2013 1:40:04 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Server maintenance??

You're free to differ with me as you please, deltavee. This is the last official word I could find on a quick search: http://www.worldcommunitygrid.org/forums/wcg/...ead,34135_offset,0#399616 . It reads as though that is the daily quota number, not the "in progress" limit, revealing my intermittent memory. I suppose the 600 is what it is at the moment.

Wonder if there's a mechanism that counts GPU threads like there is CPU thread counting. Certainly the app_info content is fed back to the server, the app_config not yet [will be in a future 7.x release.]

If knreed et al. read this, they may lend an ear to upping the permission (for high-availability, highly reliable hosts).

Edit: As I noted in my previous post, techs tune these numbers (without notice). E.g. the limit of 15 per call was changed, or is not being enforced: when I upped my buffer to a highly inflated level, the first call gave 57:

9341 World Community Grid 1/12/2013 9:51:29 AM [sched_op] Starting scheduler request
9342 World Community Grid 1/12/2013 9:51:29 AM Sending scheduler request: To fetch work.
9343 World Community Grid 1/12/2013 9:51:29 AM Requesting new tasks for CPU
9344 World Community Grid 1/12/2013 9:51:29 AM [sched_op] CPU work request: 2679587.15 seconds; 0.00 devices
9345 World Community Grid 1/12/2013 9:51:43 AM Scheduler request completed: got 57 new tasks
9346 World Community Grid 1/12/2013 9:51:43 AM [sched_op] Server version 701
9347 World Community Grid 1/12/2013 9:51:43 AM Project requested delay of 11 seconds
9348 World Community Grid 1/12/2013 9:51:43 AM [sched_op] estimated total CPU task duration: 689017 seconds

In the past, back in Nov/Dec, I even once saw 86-89 or so being given in one fetch.
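
For those who want to pull the same numbers out of their own event log, here is a rough sketch. The three log lines are copied from the excerpt above; the regular expressions and the dictionary keys are my own assumptions about the message wording and may need adjusting for other client versions.

```python
import re

# Lines copied from the log excerpt in this post.
LOG = """\
9344 World Community Grid 1/12/2013 9:51:29 AM [sched_op] CPU work request: 2679587.15 seconds; 0.00 devices
9345 World Community Grid 1/12/2013 9:51:43 AM Scheduler request completed: got 57 new tasks
9348 World Community Grid 1/12/2013 9:51:43 AM [sched_op] estimated total CPU task duration: 689017 seconds
"""

# Hypothetical key names; the patterns match the message text shown above.
PATTERNS = {
    "requested_s": re.compile(r"CPU work request: ([\d.]+) seconds"),
    "tasks": re.compile(r"got (\d+) new tasks"),
    "granted_s": re.compile(r"estimated total CPU task duration: ([\d.]+) seconds"),
}


def summarize(log_text):
    """Return the first match of each pattern as a float, keyed by name."""
    out = {}
    for key, pattern in PATTERNS.items():
        match = pattern.search(log_text)
        if match:
            out[key] = float(match.group(1))
    return out


if __name__ == "__main__":
    s = summarize(LOG)
    hours_per_task = s["granted_s"] / s["tasks"] / 3600
    print(f"tasks granted: {int(s['tasks'])}")
    print(f"requested: {s['requested_s'] / 86400:.1f} days of work")
    print(f"granted: {s['granted_s'] / 86400:.1f} days, {hours_per_task:.2f} h per task")
```

For this single fetch the estimate averages out to roughly 3.4 hours per task.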
----------------------------------------
[Edit 2 times, last edit by Former Member at Jan 12, 2013 1:56:48 PM]
[Jan 12, 2013 1:51:35 PM]
nanoprobe
Master Cruncher
Classified
Joined: Aug 29, 2008
Post Count: 2998
Status: Offline
Re: Server maintenance??

This one also has 600 WUs in the buffer and as I wrote that is only enough for 4 hours.

I'm curious to know what GPU you have that can process 150 WUs per hour.
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.


[Jan 12, 2013 2:12:28 PM]
OldChap
Veteran Cruncher
UK
Joined: Jun 5, 2009
Post Count: 978
Status: Offline
Re: Server maintenance??

deltavee:
Rigs seem to get 600 PER GPU FITTED.

If you have the capacity then I recommend you run a small card as well as any faster card in order that your cache goes to 1200.

I am running 7950's that are paired with some older 5870's that I had. When running 16 wu's and 3 wu's the times are broadly similar for each card.

If you were to purchase cards for this purpose then maybe something like an hd 6450, which requires no extra 6 pin connection, would suit?

EDIT:

SekeRob:
From observation I would say the 600 number is a pure cache limit and that limit is applied per card. Maybe because I don't have the very top end cards I have never seen any daily limit.

nanoprobe:
Running individually and independently, I have a 10-day average of 90 per hour on a tuned 7950. I wonder what the 7970 is capable of now; surely not 150?
----------------------------------------
[Edit 2 times, last edit by OldChap at Jan 12, 2013 2:32:46 PM]
[Jan 12, 2013 2:12:35 PM]
deltavee
Ace Cruncher
Texas Hill Country
Joined: Nov 17, 2004
Post Count: 4894
Status: Offline
Re: Server maintenance??

deltavee:
Rigs seem to get 600 PER GPU FITTED.

If you have the capacity then I recommend you run a small card as well as any faster card in order that your cache goes to 1200.

I am running 7950's that are paired with some older 5870's that I had. When running 16 wu's and 3 wu's the times are broadly similar for each card.

If you were to purchase cards for this purpose then maybe something like an hd 6450, which requires no extra 6 pin connection, would suit?


Thanks for the help OldChap. I have an unused 6670 I'll try this out with on a 7970 rig. I'll be shutting everything down this weekend anyway to reapportion my crunchers onto different circuit breakers. I had a breaker trip last week that idled six crunchers and my beer fridge for half a day.
[Jan 12, 2013 3:46:19 PM]
MStenholm
Advanced Cruncher
Denmark
Joined: Jan 7, 2010
Post Count: 97
Status: Offline
Re: Server maintenance??

This one also has 600 WUs in the buffer and as I wrote that is only enough for 4 hours.

I'm curious to know what GPU you have that can process 150 WUs per hour.


2 x 7970, slightly OC'ed, on a 12-thread 4.1 GHz Intel (2x12 WU). I know I could do more if I split them, but I ran out of PCs to put them in.
[Jan 12, 2013 3:52:32 PM]
MStenholm
Advanced Cruncher
Denmark
Joined: Jan 7, 2010
Post Count: 97
Status: Offline
Re: Server maintenance??

deltavee:
Rigs seem to get 600 PER GPU FITTED.


EDIT:

SekeRob:
From observation I would say the 600 number is a pure cache limit and that limit is applied per card. Maybe because I don't have the very top end cards I have never seen any daily limit.


I've got two identical cards in my trouble rig... the other rig that maxes out at 600 only has one, so unless 7.0.28 is different in that respect, the max 600 per GPU might not be correct. Anyway, I will update the rig to 7.0.4x and hope for short "maintenance time outs" in the future.
[Jan 12, 2013 3:59:54 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Server maintenance??

2 cards, and still only 600 max buffer. Is SLI/CrossFireX, or whatever the "merge cards" technology is called, enabled?
[Jan 12, 2013 4:00:54 PM]