Total posts in this thread: 16
This topic has been viewed 1595 times and has 15 replies
BobCat13
Senior Cruncher
Joined: Oct 29, 2005
Post Count: 295
Status: Offline
Deferring communication for longer than normal

The last 3 times one of my boxen has requested more work, it has received the following:

6/22/2007 11:14:03 AM|World Community Grid|Sending scheduler request: To fetch work
6/22/2007 11:14:03 AM|World Community Grid|Requesting 42140 seconds of new work, and reporting 6 completed tasks
6/22/2007 11:14:18 AM|World Community Grid|Scheduler RPC succeeded [server version 509]
6/22/2007 11:14:18 AM|World Community Grid|Deferring communication for 9 hr 12 min 49 sec
6/22/2007 11:14:18 AM|World Community Grid|Reason: requested by project

6/22/2007 8:29:41 PM|World Community Grid|Sending scheduler request: To fetch work
6/22/2007 8:29:41 PM|World Community Grid|Requesting 68880 seconds of new work
6/22/2007 8:29:46 PM|World Community Grid|Scheduler RPC succeeded [server version 509]
6/22/2007 8:29:46 PM|World Community Grid|Deferring communication for 8 hr 28 min 39 sec
6/22/2007 8:29:46 PM|World Community Grid|Reason: requested by project

6/23/2007 4:58:26 AM|World Community Grid|Sending scheduler request: To fetch work
6/23/2007 4:58:26 AM|World Community Grid|Requesting 11302 seconds of new work, and reporting 6 completed tasks
6/23/2007 4:58:36 AM|World Community Grid|Scheduler RPC succeeded [server version 509]
6/23/2007 4:58:36 AM|World Community Grid|Deferring communication for 10 hr 1 min 14 sec
6/23/2007 4:58:36 AM|World Community Grid|Reason: requested by project

This really isn't a problem as I have Connect to set at 2.0 days, and it should be easier on the database than reporting/requesting 1 task at a time. I was just wondering if anyone else is getting these longer defers instead of the usual 5 minutes 3 seconds.
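
For anyone wanting to compare these deferrals with the usual 5 minutes 3 seconds, here is a minimal sketch that converts the "Deferring communication for ..." text into seconds. The function name and the regular expression are illustrative assumptions based only on the message format shown in the logs above; this is not BOINC's own code.

```python
import re

# Pattern for the duration in a "Deferring communication for ..." log line.
# Assumption: the hr, min, and sec parts are each optional, as in "5 min 3 sec".
DEFER_RE = re.compile(
    r"Deferring communication for"
    r"(?:\s+(?P<hr>\d+)\s+hr)?"
    r"(?:\s+(?P<min>\d+)\s+min)?"
    r"(?:\s+(?P<sec>\d+)\s+sec)?"
)


def defer_seconds(line):
    """Return the deferral in seconds, or None if the line has no deferral text."""
    match = DEFER_RE.search(line)
    if match is None:
        return None
    hours = int(match.group("hr") or 0)
    minutes = int(match.group("min") or 0)
    seconds = int(match.group("sec") or 0)
    return hours * 3600 + minutes * 60 + seconds


line = ("6/22/2007 11:14:18 AM|World Community Grid|"
        "Deferring communication for 9 hr 12 min 49 sec")
print(defer_seconds(line))  # 33169
```

By the same arithmetic the usual 5 min 3 sec deferral is 303 seconds, so the first deferral above is roughly 109 times longer.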
[Jun 23, 2007 2:32:50 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Deferring communication for longer than normal

Mulling this over.... mulling .... still mulling.... ad interim: One of the reasons for the increased time is that, e.g., work is not available, but more probably one of your jobs messed up the fractions, so you request work, but since BOINC thinks you're not crunching hard enough, it postpones the fetch. Each time the client does that, the server tells the client to back off with an increased time span.

Places to check:

client_state.xml. What do the time fractions look like? (Copy/Paste in post)
client_state.xml. What is the WCG section's specific Duration Correction Factor (DCF)? (Copy/Paste in post)

What BOINC version are you on, and if 5.10.x (still not fit for production, methinks), have you been visiting the Advanced, Local Preferences screens?

Or, after more mulling, your machine might be in panic mode and will want to work off the task buffer queue before letting more work come across.

The summary answer.... I need to see those client_state.xml file bits to form a better opinion, and if you have a global_prefs_override.xml, I'd like to see that too.

And of course not to forget: What jobs is it crunching? GC, FA@H, HPF2, or a few BETA's?

ciao
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 2 times, last edit by Sekerob at Jun 24, 2007 6:05:28 PM]
[Jun 23, 2007 4:24:32 PM]
Diana G.
Master Cruncher
Joined: Apr 6, 2005
Post Count: 3003
Status: Offline
Re: Deferring communication for longer than normal

Sek, my machine has done the same thing

6/22/2007 8:32:41 PM|World Community Grid|Sending scheduler request: To fetch work
6/22/2007 8:32:41 PM|World Community Grid|Requesting 16 seconds of new work, and reporting 1 completed tasks
6/22/2007 8:32:46 PM|World Community Grid|Scheduler RPC succeeded [server version 509]
6/22/2007 8:32:46 PM|World Community Grid|Deferring communication for 11 hr 59 min 55 sec
6/22/2007 8:32:46 PM|World Community Grid|Reason: requested by project

and

6/23/2007 6:26:07 AM|World Community Grid|Sending scheduler request: Requested by user
6/23/2007 6:26:07 AM|World Community Grid|Requesting 55466 seconds of new work, and reporting 1 completed tasks
6/23/2007 6:26:11 AM|World Community Grid|Scheduler RPC succeeded [server version 509]
6/23/2007 6:26:11 AM|World Community Grid|Deferring communication for 8 hr 55 min 7 sec
6/23/2007 6:26:11 AM|World Community Grid|Reason: requested by project


My BOINC is 5.8.16 for windows.

I crunch everything but HDC and HCMD. Haven't seen any panic modes or any beta Fa@h, I did opt in on that, but never crunched one yet.

client_state.xml:

<on_frac>0.983991</on_frac>
<connected_frac>0.989003</connected_frac>
<active_frac>0.998700</active_frac>
<cpu_efficiency>0.907524</cpu_efficiency>
<last_update>1182617242.953125</last_update>

Let me know if you need more...gotta get back to work LOL

:-)

Diana G.
----------------------------------------

[Jun 23, 2007 5:09:13 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Deferring communication for longer than normal

Fractions look very healthy Diana G.... no jab required for those, leaves the DCF to check ?!?

Meantime I've been naughty on 1 machine and will be watching if I can deceive the servers to give me the same treatment.
[Jun 23, 2007 5:41:52 PM]
BobCat13
Senior Cruncher
Joined: Oct 29, 2005
Post Count: 295
Status: Offline
Re: Deferring communication for longer than normal

Here you go, Sek:

From client_state

<time_stats>
<on_frac>0.998409</on_frac>
<connected_frac>-1.000000</connected_frac>
<active_frac>0.999711</active_frac>
<cpu_efficiency>1.001006</cpu_efficiency>
<last_update>1182629854.021124</last_update>
</time_stats>

<duration_correction_factor>1.487712</duration_correction_factor>

From prefs_override

<global_preferences>
<run_if_user_active>1</run_if_user_active>
<leave_apps_in_memory>1</leave_apps_in_memory>
<ram_max_used_busy_pct>100.000000</ram_max_used_busy_pct>
<ram_max_used_idle_pct>100.000000</ram_max_used_idle_pct>
</global_preferences>
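
A quick way to sanity-check a pasted time_stats block is to parse it and list any fraction that falls outside 0 to 1. The sketch below does that for the block quoted above. The 0-to-1 range is an assumption drawn from these values being fractions, the function name is hypothetical, and the thread does not say what a connected_frac of -1 means, so a flagged value is something to look at rather than proof of a fault.

```python
import xml.etree.ElementTree as ET

# The <time_stats> block exactly as quoted in this post.
TIME_STATS = """<time_stats>
<on_frac>0.998409</on_frac>
<connected_frac>-1.000000</connected_frac>
<active_frac>0.999711</active_frac>
<cpu_efficiency>1.001006</cpu_efficiency>
<last_update>1182629854.021124</last_update>
</time_stats>"""


def out_of_range_fractions(xml_text):
    """Return {tag: value} for every entry outside the assumed 0..1 range."""
    root = ET.fromstring(xml_text)
    flagged = {}
    for child in root:
        if child.tag == "last_update":
            # A timestamp, not a fraction, so it is skipped.
            continue
        value = float(child.text)
        if not 0.0 <= value <= 1.0:
            flagged[child.tag] = value
    return flagged


print(out_of_range_fractions(TIME_STATS))
# {'connected_frac': -1.0, 'cpu_efficiency': 1.001006}
```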

Running 5.10.6 (Windows XP) but I've been running that version since the day it was posted to download page. The long deferrals started yesterday. I don't even use BOINC Manager as I use BoincView to monitor all machines, so I haven't altered any settings in the override or cc_config files.

Panic mode? It may be, but I doubt it. BV shows work buffer of 52:39:01 and WCG is the only project running on this PC. And the sub-projects are FAAH & GC only.

Looks like it happened again while I was at work:

6/23/2007 2:59:51 PM|World Community Grid|Sending scheduler request: To fetch work
6/23/2007 2:59:51 PM|World Community Grid|Requesting 45720 seconds of new work, and reporting 6 completed tasks
6/23/2007 3:00:01 PM|World Community Grid|Scheduler RPC succeeded [server version 509]
6/23/2007 3:00:01 PM|World Community Grid|Deferring communication for 9 hr 2 min 23 sec
6/23/2007 3:00:01 PM|World Community Grid|Reason: requested by project
[Jun 23, 2007 8:40:18 PM]
retsof
Former Community Advisor
USA
Joined: Jul 31, 2005
Post Count: 6824
Status: Offline
Re: Deferring communication for longer than normal

This really isn't a problem as I have Connect to set at 2.0 days, and it should be easier on the database than reporting/requesting 1 task at a time. I was just wondering if anyone else is getting these longer defers instead of the usual 5 minutes 3 seconds.

What does your queue look like? Is it full? Do you have enough to work on? After you get one of these and do a manual project update request, does it merely say the same thing again or download some work?

I have seen an extended situation on projects that may be out of work or never had any to begin with:
6/23/2007 11:47:01 AM|gerasim@home|Sending scheduler request: To fetch work
6/23/2007 11:47:01 AM|gerasim@home|Requesting 23039 seconds of new work
6/23/2007 11:47:06 AM|gerasim@home|Scheduler RPC succeeded [server version 707]
6/23/2007 11:47:06 AM|gerasim@home|Message from server: No work from project.
6/23/2007 11:47:06 AM|gerasim@home|Deferring communication for 22 min 24 sec
6/23/2007 11:47:06 AM|gerasim@home|Reason: no work from project


I have also seen it for CPU cores that have finished 50 workunits in a day. That reason will be in the message. The delay will be computed until 0000 UTC in that case, from what I've seen here.
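
To illustrate the "until 0000 UTC" case, here is a small sketch of how long such a delay would be from a given moment. It only models the observation described above and is not the scheduler's actual code; the function name is hypothetical, and the sample time is made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def seconds_until_utc_midnight(now):
    """Seconds from `now` (a timezone-aware datetime) to the next 00:00 UTC."""
    now_utc = now.astimezone(timezone.utc)
    next_midnight = (now_utc + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now_utc).total_seconds()


# Illustrative moment only: 15:47:01 UTC on 23 June 2007.
sample = datetime(2007, 6, 23, 15, 47, 1, tzinfo=timezone.utc)
print(seconds_until_utc_midnight(sample))  # 29579.0, i.e. 8 hr 12 min 59 sec
```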

Yours doesn't give a reason. That's what is a bit odd about it.
----------------------------------------
SUPPORT ADVISOR
Work+GPU i7 8700 12threads
School i7 4770 8threads
Default+GPU Ryzen 7 3700X 16threads
Ryzen 7 3800X 16 threads
Ryzen 9 3900X 24threads
Home i7 3540M 4threads50%
----------------------------------------
[Edit 4 times, last edit by retsof at Jun 24, 2007 2:28:33 AM]
[Jun 24, 2007 2:17:10 AM]
Diana G.
Master Cruncher
Joined: Apr 6, 2005
Post Count: 3003
Status: Offline
Re: Deferring communication for longer than normal

6/23/2007 4:38:26 PM|World Community Grid|Sending scheduler request: To fetch work
6/23/2007 4:38:26 PM|World Community Grid|Requesting 24 seconds of new work, and reporting 1 completed tasks
6/23/2007 4:38:30 PM|World Community Grid|Scheduler RPC succeeded [server version 509]
6/23/2007 4:38:30 PM|World Community Grid|Deferring communication for 11 hr 59 min 54 sec
6/23/2007 4:38:30 PM|World Community Grid|Reason: requested by project
6/23/2007 4:38:32 PM|World Community Grid|[file_xfer] Started download of file faah1768_d212n644_x2AZ8_01_faah1768_d212n644_x2AZ8_01.dpf
6/23/2007 4:38:32 PM|World Community Grid|[file_xfer] Started download of file faah1768_d212n644_x2AZ8_01_AD4_parameters.dat
6/23/2007 4:38:34 PM|World Community Grid|[file_xfer] Finished download of file faah1768_d212n644_x2AZ8_01_faah1768_d212n644_x2AZ8_01.dpf
6/23/2007 4:38:34 PM|World Community Grid|[file_xfer] Throughput 2241 bytes/sec
6/23/2007 4:38:34 PM|World Community Grid|[file_xfer] Finished download of file faah1768_d212n644_x2AZ8_01_AD4_parameters.dat
6/23/2007 4:38:34 PM|World Community Grid|[file_xfer] Throughput 5173 bytes/sec
6/23/2007 4:38:34 PM|World Community Grid|[file_xfer] Started download of file faah1768_d212n644_x2AZ8_01_x2AZ8.pdbqt
6/23/2007 4:38:34 PM|World Community Grid|[file_xfer] Started download of file faah1768_d212n644_x2AZ8_01_d212n644_x2AZ8_01.gpf
6/23/2007 4:38:36 PM|World Community Grid|[file_xfer] Finished download of file faah1768_d212n644_x2AZ8_01_x2AZ8.pdbqt
6/23/2007 4:38:36 PM|World Community Grid|[file_xfer] Throughput 108413 bytes/sec
6/23/2007 4:38:36 PM|World Community Grid|[file_xfer] Finished download of file faah1768_d212n644_x2AZ8_01_d212n644_x2AZ8_01.gpf
6/23/2007 4:38:36 PM|World Community Grid|[file_xfer] Throughput 1230 bytes/sec
6/23/2007 4:38:36 PM|World Community Grid|[file_xfer] Started download of file faah1768_d212n644_x2AZ8_01_d212n644.pdbqt
6/23/2007 4:38:38 PM|World Community Grid|[file_xfer] Finished download of file faah1768_d212n644_x2AZ8_01_d212n644.pdbqt
6/23/2007 4:38:38 PM|World Community Grid|[file_xfer] Throughput 4666 bytes/sec
6/23/2007 9:09:01 PM|World Community Grid|Computation for task faah1762_d200n978_x2AZ8_00_1 finished
6/23/2007 9:09:01 PM|World Community Grid|Starting faah1763_d202n458_x2AZ8_00_0
6/23/2007 9:09:02 PM|World Community Grid|Starting task faah1763_d202n458_x2AZ8_00_0 using faah version 528
6/23/2007 9:09:04 PM|World Community Grid|[file_xfer] Started upload of file faah1762_d200n978_x2AZ8_00_1_0
6/23/2007 9:09:04 PM|World Community Grid|[file_xfer] Started upload of file faah1762_d200n978_x2AZ8_00_1_1
6/23/2007 9:09:07 PM|World Community Grid|[file_xfer] Finished upload of file faah1762_d200n978_x2AZ8_00_1_0
6/23/2007 9:09:07 PM|World Community Grid|[file_xfer] Throughput 19017 bytes/sec
6/23/2007 9:09:07 PM|World Community Grid|[file_xfer] Finished upload of file faah1762_d200n978_x2AZ8_00_1_1
6/23/2007 9:09:07 PM|World Community Grid|[file_xfer] Throughput 62927 bytes/sec
6/23/2007 11:25:36 PM|World Community Grid|Sending scheduler request: Requested by user
6/23/2007 11:25:36 PM|World Community Grid|Reporting 1 tasks
6/23/2007 11:25:41 PM|World Community Grid|Scheduler RPC succeeded [server version 509]
6/23/2007 11:25:41 PM|World Community Grid|Deferring communication for 5 min 3 sec
6/23/2007 11:25:41 PM|World Community Grid|Reason: requested by project
6/23/2007 11:32:47 PM|World Community Grid|Sending scheduler request: To fetch work
6/23/2007 11:32:47 PM|World Community Grid|Requesting 145 seconds of new work
6/23/2007 11:32:52 PM|World Community Grid|Scheduler RPC succeeded [server version 509]
6/23/2007 11:32:52 PM|World Community Grid|Deferring communication for 11 hr 59 min 30 sec
6/23/2007 11:32:52 PM|World Community Grid|Reason: requested by project
6/23/2007 11:32:54 PM|World Community Grid|[file_xfer] Started download of file fcg1.15001160.faa
6/23/2007 11:32:54 PM|World Community Grid|[file_xfer] Started download of file fcg1.15001317.faa
6/23/2007 11:32:57 PM|World Community Grid|[file_xfer] Finished download of file fcg1.15001160.faa
6/23/2007 11:32:57 PM|World Community Grid|[file_xfer] Throughput 188036 bytes/sec
6/23/2007 11:32:57 PM|World Community Grid|[file_xfer] Finished download of file fcg1.15001317.faa
6/23/2007 11:32:57 PM|World Community Grid|[file_xfer] Throughput 303399 bytes/sec

retsof, I highlighted the requests by scheduler and the request by myself and you can see the deferring communication differences. Hmmm. Maybe you can see something?

I was wondering if it might be normal because I checked the "Maximum Output" profile close to a month ago, and maybe the scheduler doesn't need to communicate more than every 12 hrs.?

Thanks!! :-)

Diana G.
----------------------------------------

[Jun 24, 2007 4:04:56 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Deferring communication for longer than normal

Diana G.,

Your message suggested you were not getting work, but now that we see a bit more of the log, in fact you are.... phew. Now theorize that with connect to 1.00000 (always), the client tells you when the next scheduled contact is going to be.... now that, I'd think, is efficient. The second request, at 11:25:36, was manual, which to my knowledge rarely pulls work but does report results and tasks.... that, per ROM, costs 15 US dollars ;P

On the naughty part, I've not been able to draw this message out. It just sends the result files up, and on the next result file upload also does the previous 'ready to report' task reporting.

I'd give it a park mode for now until a connection can be made with a specific situation.

@BobCat13's, CPU efficiency looks OCed. "<cpu_efficiency>1.001006</cpu_efficiency>", yet your DCF (Duration Correction Factor) of 1.48 suggests that either you're not crunching 24/7 or the benchmark is much higher than the real progress or there are a lot of processes eating a substantial portion of the CPU time. DCF 1.48 means that BOINC computes a job time of e.g. 8 hours, but it is in fact taking on current average about 12 wall-clock hours, which conflicts again with the cpu efficiency.
<cpu_efficiency>;
// The ratio between CPU time accumulated by BOINC apps
// and the wall time those apps are scheduled at the OS level.
// May be less than one if
// 1) apps page or do I/O
// 2) other CPU-intensive apps run
From the above DCF link into the Unofficial BOINC Wiki:
A Result Duration Correction Factor of 1.00 means that the computer is crunching at the rate predicted by the Benchmark.

And DCF explained in the below linked BOINC FAQ resource:
Duration Correction Factor

With this BOINC learns to estimate the "to completion time" of results more correctly, so in the end you can even download more work.
It takes on average about 2 weeks before BOINC gets in the neighborhood of correctly estimated times and even then it continues to correct times.
It works per project. Default is 1.

If your times all of a sudden sky-rocket, there's a good chance your DCF numbers are broken (very high).
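
The 8-hours-to-12-hours arithmetic above can be sketched as a simple multiplication. This assumes, as described in this post, that the DCF just scales the benchmark-based estimate; the function name is hypothetical and this is not taken from BOINC's source.

```python
def corrected_estimate_hours(raw_estimate_hours, dcf):
    """Scale a benchmark-based time estimate by the Duration Correction Factor."""
    return raw_estimate_hours * dcf


# BobCat13's posted DCF applied to an 8 hour benchmark-based estimate.
print(round(corrected_estimate_hours(8.0, 1.487712), 2))  # 11.9

# A DCF of 1.00 leaves the benchmark-based estimate unchanged.
print(corrected_estimate_hours(8.0, 1.0))  # 8.0
```

So a DCF near 1.49 turns an 8 hour estimate into just under 12 hours, which is where the "12 wall-clock hours" figure comes from.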


Sekerob

Added: (making mental note not to forget to ask for the "5 lines before and after" suspect log entries)
[Edit 2 times, last edit by Sekerob at Jul 19, 2007 5:14:06 PM]
[Jun 24, 2007 7:46:34 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Deferring communication for longer than normal

Okay Boys and Girls, in the end I forced (speak BOINC gave me) the sought after message:
24-6-2007 15:54:23|World Community Grid|Sending scheduler request: To fetch work
24-6-2007 15:54:23|World Community Grid|Requesting 587441 seconds of new work
24-6-2007 15:54:28|World Community Grid|Scheduler RPC succeeded [server version 509]
24-6-2007 15:54:28|World Community Grid|Deferring communication for 21 hr 57 min 43 sec
24-6-2007 15:54:28|World Community Grid|Reason: requested by project
My analysis and suppositions: In the past the user never knew for sure when the next scheduled contact with the servers would be, causing the urge in many to hit that Update button over and over again at the cited 15 USD virtual cost per pop. Now the scheduler tells you and, as with the familiar 5 minute defer, starts a countdown in the Projects tab. This would be the very latest that the scheduler would send Result Files and "Ready to Report" tasks up and maybe fetch more work (as long as you keep on crunching). Now anyone on a strong points diet is basically told to back off and save WCG from having to open the various databases.
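
One way to test this supposition is against the three deferrals BobCat13 posted at the top of the thread: add each deferral to the time it was logged and compare the result with the time of the next scheduler request. The sketch below does that. The timestamps and durations are copied from those logs; the function name and the date format string are my own assumptions, and three samples from one machine are a small basis for any conclusion.

```python
from datetime import datetime, timedelta

# Assumed format of the timestamps in BobCat13's log lines.
FMT = "%m/%d/%Y %I:%M:%S %p"

# (time the deferral was logged, deferral length, time of the next logged request)
OBSERVED = [
    ("6/22/2007 11:14:18 AM", timedelta(hours=9, minutes=12, seconds=49),
     "6/22/2007 8:29:41 PM"),
    ("6/22/2007 8:29:46 PM", timedelta(hours=8, minutes=28, seconds=39),
     "6/23/2007 4:58:26 AM"),
    ("6/23/2007 4:58:36 AM", timedelta(hours=10, minutes=1, seconds=14),
     "6/23/2007 2:59:51 PM"),
]


def lag_after_deferral(logged, deferral, next_request):
    """How long after the deferral expired the next request was actually sent."""
    expected = datetime.strptime(logged, FMT) + deferral
    return datetime.strptime(next_request, FMT) - expected


lags = [int(lag_after_deferral(*row).total_seconds()) for row in OBSERVED]
print(lags)  # [154, 1, 1]
```

In all three cases the next request went out within a few minutes of the deferral running out (154, 1, and 1 seconds later), which fits the idea that the deferral marks the next scheduled contact.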

Enjoy this new feature, which given the 5.8.16 and 5.10.7 appearances is most likely a server side added boon. Chapeau to the Technicians for this great new comfort zone addition.

Sekerob
(don't you love those Sunday afternoon thumb in the air stories.... one wonders how they come about... ah of course, a good glass of Vino Abruzzese, Tollo Secco Robino)
[Edit 1 times, last edit by Sekerob at Jun 24, 2007 3:00:26 PM]
[Jun 24, 2007 2:37:51 PM]
BobCat13
Senior Cruncher
Joined: Oct 29, 2005
Post Count: 295
Status: Offline
Re: Deferring communication for longer than normal

@BobCat13's, CPU efficiency looks OCed. "<cpu_efficiency>1.001006</cpu_efficiency>", yet your DCF (Duration Correction Factor) of 1.48 suggests that either you're not crunching 24/7 or the benchmark is much higher than the real progress or there are a lot of processes eating a substantial portion of the CPU time. DCF 1.48 means that BOINC computes a job time of e.g. 8 hours, but it is in fact taking on current average 12 wall-clock hours, which conflicts again with the cpu efficiency.


If all tasks on WCG were the same length, then I would agree with you about the DCF, but since task length varies, you know that DCF changes back and forth for WCG. Right now, WCG's DCF is at 1.274, but as soon as I get one of those 5 hour FAAH or 1 hour GC tasks it will most likely jump back to 1.4 or 1.5. Looking back at Completed tasks in BV shows I had a GC task of 58m a couple of hours before posting the 1.48 DCF, and most GC tasks complete in 40m or less.

The DCF for other projects on this box are:

Docking 0.6371
Riesel Sieve 1.0024
Rosetta 0.9998
Spinhenge 0.9024
Superlink 0.8814

All of those projects have tasks that don't vary in length, so it looks like this box is doing fine. Also no overclocking: Athlon X2 6000 running at stock.
[Jun 24, 2007 2:48:44 PM]