Total posts in this thread: 9
This topic has been viewed 2487 times and has 8 replies
Simplex0
Advanced Cruncher
Sweden
Joined: Aug 14, 2008
Post Count: 83
HCC out of work again & again.

HCC seems to be out of work frequently. Time for a server upgrade?


2012-10-14 09:31:49 | World Community Grid | Sending scheduler request: To fetch work.
2012-10-14 09:31:49 | World Community Grid | Reporting 19 completed tasks, requesting new tasks for ATI
2012-10-14 09:31:52 | World Community Grid | Computation for task X0900069690073200605191426_0 finished
2012-10-14 09:31:52 | World Community Grid | Starting task X0900069690069200605191426_0 using hcc1 version 656 (ati_hcc1) in slot 2
2012-10-14 09:31:54 | World Community Grid | Started upload of X0900069690073200605191426_0_0
2012-10-14 09:31:55 | World Community Grid | Scheduler request completed: got 0 new tasks
2012-10-14 09:31:55 | World Community Grid | No tasks sent
----------------------------------------
[Edit 1 times, last edit by Simplex0 at Oct 14, 2012 8:36:55 AM]
[Oct 14, 2012 8:36:20 AM]
Robokapp
Senior Cruncher
Joined: Feb 6, 2012
Post Count: 264

Well, the tasks are so short and quick and so 'new' that I imagine everyone is just starved for them.

One of my computers pulled 50 or so files, because they're so small: 3-4 minutes each. So with a few hundred thousand computers all starting to crunch and trying to fill their lists, that's tens of millions of units requested all at the same time.

It just fell behind. It will normalize. Give it time for the queues to fill up.
[Oct 14, 2012 8:48:15 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0

No, it's the limit on the slots allocated to HCC on GPU. Set "Connect every..." (client 6) or "Minimum work buffer" (client 7) to 0.01 (the smallest value), so that work fetch requests are made as frequently as possible, *before* a thread runs dry. Maybe that helps.

edit: response to Simplex0
----------------------------------------
[Edit 1 times, last edit by Former Member at Oct 14, 2012 8:56:02 AM]
[Oct 14, 2012 8:54:15 AM]
Robokapp
Senior Cruncher
Joined: Feb 6, 2012
Post Count: 264

Rob, if it connects every 0.01 days, won't it spam the client, reach the cap, and then do nothing for 22 hours and 19 minutes?

edit: I highlighted a GPU WU at the bottom and it seems to pull a fresh one every time it computes one... maybe it just doesn't like having more than a certain number?
----------------------------------------
[Edit 1 times, last edit by Robokapp at Oct 14, 2012 9:25:28 AM]
[Oct 14, 2012 9:24:21 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0

Yes and no. It's not really liked; the preference is for a bunch of Ready to Report results to be sent up together with a single grouped work fetch, but it's better than idling client cores. There's enough work, but other sciences also need to get a chance to serve the volunteer computers. Serving work fetches is the prime directive, I think ... no cap on the quantity of those :D.

The 0.01 frequency is not a case of "I'm allowed, so I will". If the buffer [the sum of work being computed plus work ready to start] is above the 0.01 value, there won't be an attempted fetch request. That said, someone reported running into a quota cap of 164 for a device. I calculate that a single GPU card must be running tasks faster than 8.9 minutes each to reach that point in a day. With more threads per card [at your own risk], multiple cards in a device, or a fast card, that's quite possible.
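
As a rough illustration of the two points above, here is a minimal Python sketch. It assumes the simplified rule described in this post (no fetch while the buffer is above the minimum) and a plain division of the day by the quota. The function names and the `concurrent_tasks` parameter are hypothetical and are not taken from the BOINC client. A straight division of 1440 minutes by 164 gives roughly 8.8 minutes per task, close to the figure quoted above.

```python
MINUTES_PER_DAY = 24 * 60


def should_fetch(buffered_days: float, min_buffer_days: float) -> bool:
    """Simplified rule from the post: no fetch attempt while the buffer
    (work being computed plus work ready to start) is above the minimum."""
    return not buffered_days > min_buffer_days


def minutes_per_task_at_quota(daily_quota: int, concurrent_tasks: int = 1) -> float:
    """Average task duration at which a device uses up its daily quota in 24 hours.

    concurrent_tasks is an assumed knob for several threads per card
    or several cards in one device.
    """
    return MINUTES_PER_DAY * concurrent_tasks / daily_quota


if __name__ == "__main__":
    print(should_fetch(0.02, 0.01))                    # buffer above minimum
    print(should_fetch(0.005, 0.01))                   # buffer below minimum
    print(f"{minutes_per_task_at_quota(164):.1f}")     # one task at a time
    print(f"{minutes_per_task_at_quota(164, 2):.1f}")  # two tasks at a time
```

With two tasks running at once, each task only needs to finish in about 17.6 minutes to reach the same daily total, which is why multiple threads or cards make the cap easier to hit.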

Yes, the limit on "In progress" tasks is set to 10 for HCC per processor thread. I have not been told whether this is a CPU-only restriction.
----------------------------------------
[Edit 1 times, last edit by Former Member at Oct 15, 2012 9:55:27 PM]
[Oct 14, 2012 9:37:31 AM]
Simplex0
Advanced Cruncher
Sweden
Joined: Aug 14, 2008
Post Count: 83

OK, so the only way to solve the problem for the moment is to hammer the server by setting 'Minimum work buffer' to 1 or higher.
----------------------------------------
[Edit 1 times, last edit by Simplex0 at Oct 14, 2012 10:05:27 AM]
[Oct 14, 2012 10:04:21 AM]
Jim1348
Veteran Cruncher
USA
Joined: Jul 13, 2009
Post Count: 1066

Are you using two AMD cards (two Nvidia cards would produce the same result) and also using a cc_config.xml file to exclude one of them? In that case the normal work buffer is not used, for some obscure reason.
http://boinc.berkeley.edu/dev/forum_thread.php?id=7796&nowrap=true#45350

We ran into this on POEM a short while ago.
----------------------------------------
[Edit 1 times, last edit by Jim1348 at Oct 14, 2012 10:38:39 AM]
[Oct 14, 2012 10:37:18 AM]
Simplex0
Advanced Cruncher
Sweden
Joined: Aug 14, 2008
Post Count: 83

I'm using 2 AMD cards in one computer with no app_info file, and one AMD card in another computer with an app_info file, and no cc_config file at all, so no cards are excluded from projects in BOINC.
----------------------------------------
[Edit 1 times, last edit by Simplex0 at Oct 14, 2012 10:47:28 AM]
[Oct 14, 2012 10:46:49 AM]
knreed
Former World Community Grid Tech
Joined: Nov 8, 2004
Post Count: 4504

A few background details:

Work is being sent out at a rate of about 9 per second.
HCC1 has 45 slots assigned to it in our shared memory (the buffer where the server checks for work to send to fill a client's request).
It therefore takes about 5 seconds to empty the shared memory slots for the project.

Over the weekend, our scripts did their work and filled a 1.5 day buffer for work ready to send for the project. That means that there are about 1.1 million jobs ready to send out in the database for the project. As a result, when it goes to fetch the next set of jobs to put into the shared memory it is taking a little too long to load (about 6 seconds at times). This means that the shared memory is periodically going empty. We have changed how much is cached down to 1 day and we are doing a few other things to fix this.

At the moment, though, this means that you will periodically get "no work" messages.
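
The figures in this post can be checked with a short Python sketch. It uses only the numbers given above (9 jobs sent per second, 45 shared memory slots, a 1.5 day buffer, a reload of about 6 seconds). The function names are hypothetical, and the "empty gap" model is a simplifying assumption that the reload only starts once the slots have been handed out; it is not a description of the actual server code.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def drain_seconds(slots: int, send_rate_per_sec: float) -> float:
    """Time to hand out every job currently held in the shared memory slots."""
    return slots / send_rate_per_sec


def jobs_in_buffer(buffer_days: float, send_rate_per_sec: float) -> float:
    """Number of ready-to-send jobs needed to cover buffer_days at the send rate."""
    return buffer_days * SECONDS_PER_DAY * send_rate_per_sec


def empty_gap_seconds(reload_seconds: float, slots: int, send_rate_per_sec: float) -> float:
    """Assumed model: the slots sit empty for however long the reload
    takes beyond the time needed to drain them."""
    return max(0.0, reload_seconds - drain_seconds(slots, send_rate_per_sec))


if __name__ == "__main__":
    print(drain_seconds(45, 9))          # seconds to empty the 45 HCC1 slots
    print(jobs_in_buffer(1.5, 9))        # jobs in a 1.5 day buffer
    print(jobs_in_buffer(1.0, 9))        # jobs after reducing the cache to 1 day
    print(empty_gap_seconds(6, 45, 9))   # seconds with nothing to send per reload
```

Under these assumptions a 6 second reload against a 5 second drain leaves about 1 second per cycle with nothing to send, and a 1.5 day buffer works out to roughly 1.17 million jobs, in line with the "about 1.1 million" quoted above.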
----------------------------------------
[Edit 1 times, last edit by knreed at Oct 15, 2012 9:52:01 PM]
[Oct 15, 2012 9:51:38 PM]