Thread Status: Active
Total posts in this thread: 33
This topic has been viewed 2363 times and has 32 replies
keithhenry
Ace Cruncher
Senile old farts of the world ....uh.....uh..... nevermind
Joined: Nov 18, 2004
Post Count: 18665
Status: Offline
BOINC To Completion madness

For the last few days, the To Completion times I'm seeing in BOINC Manager have gone totally crazy. FAAH WUs are showing 21 hours, HDC WUs are 9.5 hours and FCG WUs are 14 hours. Needless to say, I've not gotten any new WUs in close to a couple of days now. I know that BOINC is not really set to deal with multiple projects within one BOINC project (WCG) but could that alone throw this so totally out of whack?
----------------------------------------
Join/Website/IMODB



[Jan 2, 2007 1:54:19 AM]
BobCat13
Senior Cruncher
Joined: Oct 29, 2005
Post Count: 295
Status: Offline
Re: BOINC To Completion madness

I know that BOINC is not really set to deal with multiple projects within one BOINC project (WCG) but could that alone throw this so totally out of whack?

Yes. I have had HDC units in the last day that have taken 5:29 and 5:40 to complete. Usually my DCF for WCG is 0.97 or so, but because of those two HDC units it is now 1.97, making the To Completion times double what they should be. I now have 2 FCG units in the queue, so they should bring the DCF back down some.
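
A rough illustration of why a doubled DCF doubles every estimate. This is only a sketch under an assumed, simplified formula (flop estimate divided by benchmarked speed, multiplied by the DCF); the function name, the 1 GFLOPS host and the work unit size are made up for illustration and are not taken from the BOINC source.

```python
def estimated_seconds(fpops_est: float, host_flops: float, dcf: float) -> float:
    """Uncorrected estimate (work / speed) scaled by the duration correction factor."""
    return fpops_est / host_flops * dcf


# Hypothetical numbers: a host benchmarked at 1 GFLOPS and a work unit
# sized so that the uncorrected estimate is exactly 5 hours.
HOST_FLOPS = 1.0e9
FPOPS_EST = 5 * 3600 * HOST_FLOPS

for dcf in (0.97, 1.97):
    hours = estimated_seconds(FPOPS_EST, HOST_FLOPS, dcf) / 3600
    print(f"DCF {dcf:.2f}: estimated {hours:.2f} hours")
```

Since WCG shares one DCF across its sub-projects, the same multiplier would be applied to FAAH, HDC and FCG units alike.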
[Jan 2, 2007 3:16:17 AM]
Alther
Former World Community Grid Tech
United States of America
Joined: Sep 30, 2004
Post Count: 414
Status: Offline
Re: BOINC To Completion madness

Nothing to worry about. They're just harder workunits.

You also have to remember that TC times are just estimates, and the PCT complete for some projects (like FAAH) is just an estimate too, because they are non-deterministic.
----------------------------------------
Rick Alther
Former World Community Grid Developer
[Jan 2, 2007 2:24:51 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: BOINC To Completion madness

WUs get an estimated completion time (actually a flop estimate) included when sent to the client. Thus an HDC WU should not assume an identical time to e.g. a FA@H one, and I have never observed that. The GC WUs are still suffering from a severely over-estimated completion time (flop estimate) when transmitted.... a factor of 2 to 3 over real completion times. That was reported previously.

The spread of times per project like FA@H is very wild indeed.... I have swings of 4 to 7.5 CPU hours from one WU to the next, which wreaks havoc on the work buffer algorithm and on the DCF, which is part of the drivers that make a BOINC agent call or not call for more work. Set your contact time to 1.0 days and you're very unlikely to ever run out of work, e.g. during the up to 2 hour Wednesday and Sunday backup runs at 09:00 UTC (bar disasters).
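
To make the buffer point concrete, here is a heavily simplified sketch of the kind of check involved. It assumes the agent only compares the DCF-corrected hours already queued against the buffer target; the real client also weighs deadlines, resource shares and more, and the function name and task sizes below are hypothetical.

```python
def needs_more_work(raw_estimates_hours: list[float], dcf: float, buffer_days: float) -> bool:
    """True when the DCF-corrected queue is shorter than the buffer target."""
    queued_hours = sum(raw_estimates_hours) * dcf
    return queued_hours < buffer_days * 24.0


# One hypothetical task with a 2-hour uncorrected estimate, checked against
# a 0.1 day (2.4 hour) buffer with a normal DCF and with an inflated one.
queue = [2.0]
for dcf in (1.0, 2.03):
    print(f"DCF {dcf:.2f}: ask for more work? {needs_more_work(queue, dcf, 0.1)}")
```

Under these assumptions an inflated DCF makes the same queue look twice as long, so a small buffer can appear full when it is not; a larger buffer setting leaves more headroom.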

Edit: Spellchecked ^_^
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 1 times, last edit by Sekerob at Jan 2, 2007 6:17:59 PM]
[Jan 2, 2007 5:51:22 PM]
schepers
Advanced Cruncher
Canada
Joined: Oct 11, 2006
Post Count: 85
Status: Offline
Re: BOINC To Completion madness

Set your contact time to 1.0 days and you're very unlikely to ever run out of work...


I found this one option on the WCG site for BOINC profiles to be very badly worded. On the WCG site it refers to this as how often to contact the site servers, but in the config XML file it is work_buf_min_days which doesn't really mean when to contact but only refers to buffering enough work for "x" time. The two meanings are not the same.

Does changing this option to 1 mean that jobs, when completed, won't be reported for up to 1 day as well, or does this setting not affect the reporting time?
----------------------------------------
[Edit 1 times, last edit by schepers at Jan 2, 2007 6:20:26 PM]
[Jan 2, 2007 6:08:45 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: BOINC To Completion madness

The contact x days is implicitly equal to the estimated amount of work that the Agent always tries to buffer. It is also the 'no later than' time to report the completed tasks. Manually doing an 'Update Project' will override this, report tasks and get new work to bring the buffer back up to, e.g., the 1.0 days level used as an example.

If the network available option is set to e.g. 'Always'.... the phase 1 crunched result files are immediately sent. The 'Ready to Report' phase 2 of the task reporting is held until the 'connect' time is reached, but with the network permanently available, u will observe that part 2 is done sooner, or when hitting the Update button in the project tab. It's not entirely transparent how the drivers really function as described e.g. in the unofficial BOINC wiki. It's not causing sleepless nights and should not... if it does, revert back to the default 0.1 days. Maybe the best of both worlds could be added to the BOINC wish list, but I won't, as it defeats the purpose of the Connect Time / Task reporting logic.

The name in the website profile is not the same as the one in the xml, but should we dabble in that file at all?
----------------------------------------
[Edit 1 times, last edit by Sekerob at Jan 2, 2007 6:39:38 PM]
[Jan 2, 2007 6:38:31 PM]
keithhenry
Ace Cruncher
Senile old farts of the world ....uh.....uh..... nevermind
Joined: Nov 18, 2004
Post Count: 18665
Status: Offline
Re: BOINC To Completion madness

Sek, when I saw the reference to the DCF, I remembered that you had discussed that in some of your prior posts. Did a bit of searching and reading. Checked my client_state.xml file and saw:

<duration_correction_factor>2.030488</duration_correction_factor>

Looked back thru the messages tab and found the last benchmark run:

1/2/2007 12:18:36 AM|World Community Grid|Pausing task faah1119_d143n326_x1MEU_02_2 (removed from memory)
1/2/2007 12:18:36 AM||Suspending network activity - running CPU benchmarks
1/2/2007 12:18:38 AM||Running CPU benchmarks
1/2/2007 12:19:37 AM||Benchmark results:
1/2/2007 12:19:37 AM|| Number of CPUs: 1
1/2/2007 12:19:37 AM|| 1299 floating point MIPS (Whetstone) per CPU
1/2/2007 12:19:37 AM|| 2687 integer MIPS (Dhrystone) per CPU
1/2/2007 12:19:37 AM||Finished CPU benchmarks
1/2/2007 12:19:39 AM||Resuming computation
1/2/2007 12:19:39 AM||Rescheduling CPU: Resuming computation
1/2/2007 12:19:39 AM||Resuming network activity

Forced the benchmarks to run again (yea, I'll want to do that again late tonight to get that out of the daytime) and the benchmarks and DCF changed very minimally. As I understand this, my DCF says that my machine is taking twice as long as expected, based on the benchmarks, to complete WUs? I would presume that that is CPU time and not clock time? Right now, BOINC thinks my machine is overcommitted, probably because it went three days without getting any new work.

I do recall that within the past week or so, I saw that one of the WUs I got was already marked valid for the other users. It looked like I was the fourth, but within less than a minute of the WU being sent to me, it got sent to a fifth, then got its quorum, thus showing as valid for those three and still as in progress for me. I aborted that WU as it didn't make any sense to me to crunch a completed WU. I believe that a day or two after that, I reset the project as the aborted WU wouldn't fall off the task tab in BOINC Manager. Given what I think I might understand from reading other posts on the DCF, I am starting to think I just need to give things a few days to crunch and see if things clear up and I get more realistic To Completion times.
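
For anyone who would rather not scroll through client_state.xml by hand, a minimal sketch of pulling the DCF out with Python's standard library follows. Only the duration_correction_factor tag and its value come from this thread; the surrounding element layout in SAMPLE (client_state, project, project_name) and the function name are assumptions for illustration, so check them against your own file.

```python
import xml.etree.ElementTree as ET

# Assumed layout: only <duration_correction_factor> is quoted in the thread.
SAMPLE = """
<client_state>
  <project>
    <project_name>World Community Grid</project_name>
    <duration_correction_factor>2.030488</duration_correction_factor>
  </project>
</client_state>
"""


def read_dcf(xml_text: str) -> dict[str, float]:
    """Map each project name to its duration correction factor."""
    root = ET.fromstring(xml_text.strip())
    factors = {}
    for project in root.iter("project"):
        name = project.findtext("project_name", default="(unnamed)")
        dcf = project.findtext("duration_correction_factor")
        if dcf is not None:
            factors[name] = float(dcf)
    return factors


for name, dcf in read_dcf(SAMPLE).items():
    print(f"{name}: DCF {dcf}")
```

To try it on a real file, read the file's text and pass it to read_dcf; reading a copy is safer than touching the live file while the client is running.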
[Jan 2, 2007 6:47:05 PM]
schepers
Advanced Cruncher
Canada
Joined: Oct 11, 2006
Post Count: 85
Status: Offline
Re: BOINC To Completion madness

The contact x days is implicitly equal to the estimated amount of work that the Agent always tries to buffer...


So, basically you are saying that the work units will likely be reported in less time than the config value says. That's fine. My only theoretical concern is if too many completed units get cached, and a problem happens that prevents them from going out before the 1 day is up. I still want them reported as soon as they are done.

The name in the website profile is not the same as the one in the xml, but should we dabble in that file at all?


Absolutely, given the poor support of profiles in BOINC. I do it all the time when experimenting with local configs. Also, all my research on the BOINC config variables shows that the WCG definition of this specific variable doesn't appear to jibe with what the developer intended. For example, "Minimum work queue wanted. Default is 0.1 days. Maximum is 10 days. Usage, between 0.0001 and 10." is the definition from one doc. None I saw referred to this as the minimum time between connections to the servers, which is why I am questioning the improper definition on the WCG site.
[Jan 2, 2007 6:53:52 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: BOINC To Completion madness

No, the 1 day is just the amount of work to buffer / contacting time. The due date, as reported in the tasks tab, is for WCG 7x24 hours, i.e. u will not miss the deadline unless u fail to complete/return within 7 days. Off the top of my head, the title on the website profile is a BOINC standard, and looking at the SIMAP one it says
"Connect to network about every 1.0 days"
(determines size of work cache; maximum 10 days)

If dabbling, please only do this in the override files, not the ones controlled/updated from the website! The override files are read in last for exactly that purpose.

Regarding KH's DCF discussion.... the number is adjusted continuously and u need to leave your machine alone, truly alone, for it does take weeks for BOINC to find its balance. WCG has only one DCF for all its projects and therefore has an inherent issue, whereas each DC has its own DCF, e.g. my Tanpaku DCF is 1.05, the WCG is 0.8, the SIMAP is 0.85 and Rosetta 1.01.....guess who's doing the best flop estimations ;>)
----------------------------------------
[Edit 1 times, last edit by Sekerob at Jan 2, 2007 7:13:54 PM]
[Jan 2, 2007 7:13:20 PM]
schepers
Advanced Cruncher
Canada
Joined: Oct 11, 2006
Post Count: 85
Status: Offline
Re: BOINC To Completion madness

Off the top of my head, the title on the website profile is a BOINC standard, and looking at the SIMAP one it says
"Connect to network about every 1.0 days" (determines size of work cache; maximum 10 days)


It seems other sources for explaining the prefs file use different definitions. I will let this one alone.

If dabbling, please only do this in the override files, not the ones controlled/updated from the website! The override files are read in last for exactly that purpose.


Always! I keep an override file handy in case I need to config something very specific.

Regarding KH's DCF discussion.... the number is adjusted continuously and u need to leave your machine alone, truly alone, for it does take weeks for BOINC to find its balance. WCG has only one DCF for all its projects and therefore has an inherent issue, whereas each DC has its own DCF, e.g. my Tanpaku DCF is 1.05, the WCG is 0.8, the SIMAP is 0.85 and Rosetta 1.01.....guess who's doing the best flop estimations.


Took me a bit to find out what DCF was! Mine is just below 1 (0.949502). This doesn't relate to my question about work buffering, does it?

Now, what's a DC (from your paragraph above)?
----------------------------------------
[Edit 3 times, last edit by schepers at Jan 2, 2007 7:58:20 PM]
[Jan 2, 2007 7:32:41 PM]