Thread Status: Active
Total posts in this thread: 16
Posts: 16   Pages: 2
This topic has been viewed 15155 times and has 15 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Maintenance on server

The site is still hanging here. Is there still work going on on the servers now?
I had a lot of problems loading the stats...
[Jul 14, 2010 7:17:02 AM]
rwremote
Cruncher
U.K.
Joined: Aug 27, 2009
Post Count: 36
Status: Offline
Re: Maintenance on server

rwremote,
After your post I also noticed my cache didn't have any extra WU's waiting to work so I temporarily increased it from 0.2 to 0.7 and got the extra work. If you are still within your window of communication, maybe you should temporarily up your cache to a day to get you through the dry spell then put it back to your usual number.


STARBASEn
Many thanks for the feedback. I upped the cache initially to 0.7 but still got no new WUs, so I upped it again to 1.1 days and got ONE extra WU for each core (one being crunched and one waiting). As the time to completion is 3 hours for the single core and 4 hours for the dual core (both running different sciences), I might just get through the communication window blackout.

However something is not correct in the allocation of WU's back in the WCG server HQ.
Thanks again for the advice.
----------------------------------------

[Jul 14, 2010 2:17:21 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Maintenance on server

I don't think there is much wrong at WCG HQ. Your client decides when to ask for work based on a number of dynamic control parameters. If the message log shows that it is not asking for work, then investigation is in order, the first value to check being the Duration Correction Factor (DCF). I've got my config set so that the DCF is printed after completion of each task, including the correction made to the factor based on the last task's actual versus estimated run time. (The flops figure in the task header forms the basis for the estimate, together with the benchmark values.)

Here's the one flag, <sched_op_debug>, set in cc_config.xml, which keeps me informed on DCF and work fetch, plus history, when perusing the stdoutdae.txt file. From an older log:

11-Jun-2010 20:46:13 [World Community Grid] Computation for task X0000070450149200606261618_0 finished
11-Jun-2010 20:46:13 [World Community Grid] [dcf] DCF: 1.053681->1.059201, raw_ratio 1.108880, adj_ratio 1.052387
11-Jun-2010 20:46:15 [World Community Grid] Started upload of X0000070450149200606261618_0_0
11-Jun-2010 20:46:20 [World Community Grid] Finished upload of X0000070450149200606261618_0_0
11-Jun-2010 20:46:24 [World Community Grid] [sched_op_debug] Starting scheduler request
11-Jun-2010 20:46:24 [World Community Grid] Sending scheduler request: To report completed tasks.
11-Jun-2010 20:46:24 [World Community Grid] Reporting 1 completed tasks, not requesting new tasks
11-Jun-2010 20:46:24 [World Community Grid] [sched_op_debug] CPU work request: 0.00 seconds; 0.00 CPUs
11-Jun-2010 20:46:26 [World Community Grid] Scheduler request completed
11-Jun-2010 20:46:26 [World Community Grid] [sched_op_debug] Server version 601
11-Jun-2010 20:46:26 [World Community Grid] Project requested delay of 11 seconds

That takes a few things out of the guessing about where the issue may be... and do keep in mind the recent uptime. That goes into the work fetch equation.
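For anyone who wants to read those lines mechanically, here is a minimal Python sketch that pulls the numbers out of the [dcf] and "CPU work request" lines copied from the log above. The two relationships it checks (adj_ratio = raw_ratio / old DCF, and the new DCF moving one tenth of the way from the old DCF toward raw_ratio) are only what this single logged line happens to be consistent with. They are an inference from the log, not a statement of how the BOINC client computes the factor in general. The function names and regular expressions are made up for illustration.

```python
import re

# Two lines copied verbatim from the log quoted in this post.
LOG = """\
11-Jun-2010 20:46:13 [World Community Grid] [dcf] DCF: 1.053681->1.059201, raw_ratio 1.108880, adj_ratio 1.052387
11-Jun-2010 20:46:24 [World Community Grid] [sched_op_debug] CPU work request: 0.00 seconds; 0.00 CPUs
"""

# Hypothetical patterns, written to match the two line shapes above only.
DCF_RE = re.compile(
    r"\[dcf\] DCF: (?P<old>[\d.]+)->(?P<new>[\d.]+), "
    r"raw_ratio (?P<raw>[\d.]+), adj_ratio (?P<adj>[\d.]+)"
)
WORK_RE = re.compile(
    r"CPU work request: (?P<secs>[\d.]+) seconds; (?P<cpus>[\d.]+) CPUs"
)


def parse_line(pattern, line):
    """Return the named groups of a matching line as floats, else None."""
    match = pattern.search(line)
    if match is None:
        return None
    return {name: float(value) for name, value in match.groupdict().items()}


def first_match(pattern, text):
    """Return the parsed values of the first line in text that matches."""
    for line in text.splitlines():
        parsed = parse_line(pattern, line)
        if parsed is not None:
            return parsed
    return None


dcf = first_match(DCF_RE, LOG)
work = first_match(WORK_RE, LOG)

# Relationships inferred from this one log line (an assumption, see above).
adj_check = dcf["raw"] / dcf["old"]
new_check = dcf["old"] + 0.1 * (dcf["raw"] - dcf["old"])

print(f"logged adj_ratio {dcf['adj']:.6f}, recomputed {adj_check:.6f}")
print(f"logged new DCF   {dcf['new']:.6f}, recomputed {new_check:.6f}")
print(f"work requested: {work['secs']:.2f} seconds for {work['cpus']:.2f} CPUs")
```

A work request of 0.00 seconds in the last line is the thing to look for when the client "is not asking for work": the scheduler contact only reported the finished task.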
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 1 times, last edit by Sekerob at Jul 14, 2010 2:45:32 PM]
[Jul 14, 2010 2:43:03 PM]
joeperry39@gmail.com
Advanced Cruncher
USA
Joined: Nov 22, 2006
Post Count: 140
Status: Offline
Re: Maintenance on server

Looks like the problem is still "active" as I received this message just a few minutes ago:

7/14/2010 12:52:55 PM World Community Grid update requested by user
7/14/2010 12:52:59 PM World Community Grid Sending scheduler request: Requested by user.
7/14/2010 12:52:59 PM World Community Grid Reporting 1 completed tasks, not requesting new tasks
7/14/2010 12:53:00 PM World Community Grid Scheduler request completed
7/14/2010 12:53:00 PM World Community Grid Message from server: Project is temporarily shut down for maintenance

I've been getting this off and on for the past couple of days. :(
----------------------------------------


"Everything in moderation, including moderation" -- Mark Twain
[Jul 14, 2010 5:01:10 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Maintenance on server

The cause of this one was discussed several weeks ago. The system automatically shuts down parts of itself (and restarts them later) when momentary loads are too high, file uploading being the first part to close. Maybe WCG will have to contemplate, after all, combining multiple HCC tasks into one, as was done with DDDT1.
[Jul 14, 2010 5:12:49 PM]
JmBoullier
Former Community Advisor
Normandy - France
Joined: Jan 26, 2007
Post Count: 3716
Status: Offline
Re: Maintenance on server

I had set my clients to a scheduled reconnect at 23:00 UTC (that's 01:00 CEST for me) and when they did, they were still told to back off and only managed to get the last CEP2 job up at 23:40. As many would not have set up scheduled networking, backoff times will have been long enough to run past 00:00 UTC; the effect translated into daily stats of 218 CPU years instead of an anticipated 290. I doubt the remaining hour would have allowed uploads of 70+ CPU years' worth of work.
Uploads could resume only about 10-15 minutes before 00:00 UTC.
And anyway, validators were paused, so all the jobs I managed to return and report in time (because I was watching) were not counted in the daily stats, since they had not been validated.
Have no regrets. :)
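As a rough back-of-the-envelope on the 218 versus 290 CPU-year figures quoted in this post, here is a small Python sketch of the size of the gap. The only inputs are those two numbers. The conversion of one CPU year to 365 days of 24 CPU hours is an assumption made for illustration, and the variable names are made up.

```python
# Daily stats figures quoted in this post, in CPU years.
ANTICIPATED_CPU_YEARS = 290
REPORTED_CPU_YEARS = 218

# Assumption for illustration: one CPU year = 365 days x 24 CPU hours.
CPU_HOURS_PER_CPU_YEAR = 365 * 24

# Size of the gap, in three different units.
shortfall_cpu_years = ANTICIPATED_CPU_YEARS - REPORTED_CPU_YEARS
shortfall_fraction = shortfall_cpu_years / ANTICIPATED_CPU_YEARS
shortfall_cpu_hours = shortfall_cpu_years * CPU_HOURS_PER_CPU_YEAR

print(f"shortfall: {shortfall_cpu_years} CPU years")
print(f"share of the anticipated total: {shortfall_fraction:.1%}")
print(f"equivalent CPU hours: {shortfall_cpu_hours}")
```

That gap is the "70+ CPU years" mentioned above, roughly a quarter of a normal day's total.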
----------------------------------------
Team--> Decrypthon -->Statistics/Join -->Thread
----------------------------------------
[Edit 1 times, last edit by JmBoullier at Jul 15, 2010 2:01:57 AM]
[Jul 15, 2010 2:00:46 AM]