World Community Grid Forums
Thread Status: Active | Total posts in this thread: 149
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
Hello Questar,
Yes, knreed will run this script again. He is taking a week-long vacation, so don't expect it immediately. If your computer is very slow, you might want to abort any work units you have not started yet.
Lawrence
bieberj
Senior Cruncher | United States | Joined: Dec 2, 2004 | Post Count: 406 | Status: Offline
I thought recent jobs had their timeouts adjusted by the server, but it appears the adjustment didn't work. I was told the adjustment was done while I was running my second long task. At that time, one of the tasks had already aborted, the second one was still running, and mine was running as a make-up job for the one that had aborted.
After my job was done, the other initial task timed out, so another computer was assigned to fill in, and that computer timed out as well, which implies that the adjustment didn't work.

Result Name               Status              Sent                 Returned or Due       CPU Hours  Claimed / Granted Credit
faah5015_1htg_1qbr_00_4   In Progress         08/09/2008 03:56:07  08/15/2008 06:20:23   0.00       0.0 / 0.0
faah5015_1htg_1qbr_00_3   Error               08/07/2008 08:31:31  08/09/2008 03:43:19   35.24      596.2 / 0.0
faah5015_1htg_1qbr_00_2   Pending Validation  08/02/2008 15:04:22  08/06/2008 01:54:39   56.87      655.7 / 0.0
faah5015_1htg_1qbr_00_0   Error               07/30/2008 07:55:47  08/07/2008 07:48:30   50.45      837.0 / 837.0
faah5015_1htg_1qbr_00_1   Error               07/30/2008 07:41:31  08/02/2008 14:49:55   0.70       3.1 / 0.0
Sekerob
Ace Cruncher | Joined: Jul 24, 2005 | Post Count: 20043 | Status: Offline
Did you follow the instructions knreed gave earlier in this thread on how to check the timeout fpops and how to increase them (while the client is not running)?
[Added: Forced a few jobs to the front; the present, shorter ones show a factor of 10 ( <rsc_fpops_bound> divided by <rsc_fpops_est> ):

<workunit>
  <name>faah4297_indazoleSO3H_MIN_xmd01130_01</name>
  <app_name>faah</app_name>
  <version_num>605</version_num>
  <rsc_fpops_est>24713681743955.000000</rsc_fpops_est>
  <rsc_fpops_bound>247136817439550.000000</rsc_fpops_bound>
  <rsc_memory_bound>125000000.000000</rsc_memory_bound>
  <rsc_disk_bound>209715200.000000</rsc_disk_bound>
</workunit> ]
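For anyone who wants to repeat this check on their own machine, here is a minimal read-only sketch (not an official BOINC tool; the file location is an assumption and varies by platform) that scans client_state.xml and prints the bound-to-estimate factor for each workunit. If you then raise <rsc_fpops_bound> by hand, do it with the client stopped, as described earlier in the thread.

```python
# A minimal sketch (not an official BOINC tool): report the
# <rsc_fpops_bound> / <rsc_fpops_est> factor for each workunit found in
# client_state.xml. The path below is an assumption -- point it at your own
# BOINC data directory. This script only reads the file; if you raise the
# bound by hand, stop the client first.
import re

CLIENT_STATE = "client_state.xml"  # assumed location; adjust as needed

with open(CLIENT_STATE) as f:
    state = f.read()

# Pull each <workunit> ... </workunit> block and read its name and fpops fields.
for block in re.findall(r"<workunit>(.*?)</workunit>", state, re.S):
    name = re.search(r"<name>(.*?)</name>", block).group(1)
    est = float(re.search(r"<rsc_fpops_est>(.*?)</rsc_fpops_est>", block).group(1))
    bound = float(re.search(r"<rsc_fpops_bound>(.*?)</rsc_fpops_bound>", block).group(1))
    print(f"{name}: bound/est factor = {bound / est:.1f}")
```

For the faah4297 workunit quoted above, this would print a factor of 10.0.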
WCG

Please help to make the Forums an enjoyable experience for All!
[Edit 1 times, last edit by Sekerob at Aug 10, 2008 2:17:50 PM]
Dan60
Senior Cruncher | Brazil | Joined: Mar 29, 2006 | Post Count: 185 | Status: Offline
I've gotten faah5011_1gnm_1hps_00_4, which has been processing for 29:00:00 and will still run for about another 7:50:50. I might not get credit for it, but credit isn't the reason I'm into FightAIDS@Home, so it will be delivered today (I hope so), one day overdue.
best regards
bieberj
Senior Cruncher | United States | Joined: Dec 2, 2004 | Post Count: 406 | Status: Offline
Yes, I did, Sekerob. Mine is the one that is pending validation.
bieberj
Senior Cruncher | United States | Joined: Dec 2, 2004 | Post Count: 406 | Status: Offline
Finally it got validated. :) 648 credits for it.
E165852
Cruncher | Joined: Jul 17, 2008 | Post Count: 4 | Status: Offline
I have caught faah5013_6upj_1izi_00. It may run about 16-17 hours. Hmm, better than the last one, which took more than 34 hours and is still in verification status. However, for the mini-monster I caught today, another user has already completed the computation, so I will eat up the remaining part.
It seems a considerable number of monsters was released around the end of July, and now the timed-out WUs are being redistributed. How long will it last....?
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
Hi E165852,
Yes, there have been a lot of posts by knreed et al. on the batch of overly large work units that was produced by a mistake in the script that sizes work units. The estimate (a week ago) was that we would cover them all by about the 18th, so at that point we could make up any error units. We'll just have to see how long that takes.
Lawrence
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
"Would it be a good thing if you could find a way to schedule these extra large WU's only to systems that are running fairly quick CPU speeds?"

Yes - there are actually a lot of advantages to doing this. We have been working with David Anderson and BOINC to get this capability added. David has done a lot of work on this already, and the folks at Superlink@Technion are the first BOINC project to put the new code into production. We will be updating our servers to utilize the new code later this year.

Once we have the code, the server will assess the 'effective power' of the computer requesting work and try to send it work that won't take it more than a day or so. Effective power is the raw power of the computer * the amount of time that BOINC is allowed to run work on the computer. Once we have tested this and feel good about it, we will modify how we create workunits so that there is a lot of variation in the size, and computers will be able to get the appropriate size of work.

This will reduce load on our servers, as we will be able to send bigger workunits to those powerful, always-on computers, and it will improve our ability to effectively use those computers that are less powerful and are only on infrequently (and thus have a hard time completing work currently). So it is a definite advantage to do this and we are anxious to get this in place.

Knreed,

Ok.. so I've been turning the algorithm around in my head for a while, and I'm wondering if the scheduler/dispatcher code will take into consideration the queue depth a particular client already has prior to the dispatch of a WU to that client.

As I understand things, one of the values used to help determine how reliable a client is perceived to be is how long (total wall-clock time) it takes for a WU to be returned from the time of dispatch. So (in my simple mind) even if a client exists that has a very high clock speed, having a queue depth of WUs pending to execute should have some kind of effect on how reliable that client is perceived to be.

Why? Well, a WU that is dispatched to a client that has a queue depth of, say, 10 days will likely return the WU sometime around the 10th day. Of course this is perfectly fine, because the WU completes and is returned within the maximum time period. Additionally, the credits are the same as if the WU was returned within, say, the total CPU time + 12 minutes. But the client that returns the WU within the total CPU time + 12 minutes is more reliable, simply because the WU is indeed returned more expeditiously.

Wadaya say? N' thanks in advance for your consideration.
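To make the sizing idea concrete, here is a rough sketch of the "effective power" calculation knreed describes above. This is not the actual BOINC or WCG scheduler code; the function names, the availability fraction, and the one-day target are invented for illustration.

```python
# Illustrative sketch of the "effective power" idea described above. This is
# NOT the actual BOINC scheduler code; names and numbers are invented.

SECONDS_PER_DAY = 86400
TARGET_TURNAROUND_DAYS = 1.0   # "work that won't take it more than a day or so"

def effective_power(raw_flops_per_sec, allowed_fraction):
    """Raw speed scaled by the fraction of time BOINC is allowed to run work."""
    return raw_flops_per_sec * allowed_fraction

def fits_host(rsc_fpops_est, raw_flops_per_sec, allowed_fraction):
    """True if the workunit's estimated fpops should finish within the target."""
    est_days = rsc_fpops_est / effective_power(raw_flops_per_sec, allowed_fraction) / SECONDS_PER_DAY
    return est_days <= TARGET_TURNAROUND_DAYS

# Example: a 2.5 GFLOPS host that BOINC may use about half the time, asked
# about the faah4297 workunit quoted earlier (~2.47e13 estimated fpops).
print(fits_host(24713681743955.0, 2.5e9, 0.5))   # True: roughly 0.23 days
```

Note that this sketch only looks at speed and availability; whether queue depth should also feed into a host's perceived reliability is the question raised above, which the next reply addresses.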
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
Barney, the logic used to determine a "reliable" host is independent of the logic that will be used to determine the best size of work unit to send.
For rush jobs sent to "reliable" hosts, the turn-around time is important. For normal tasks, this is not a consideration.
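As an illustration of that separation, with invented thresholds and field names rather than the project's actual criteria, the "reliable host" decision might look at turnaround history and error rate, while the sizing decision looks only at effective power:

```python
# Invented illustration of the point above: the "reliable host" decision and
# the "what size workunit to send" decision use separate inputs. Thresholds
# and field names are made up for the example.

from dataclasses import dataclass

@dataclass
class HostStats:
    avg_turnaround_days: float   # average wall-clock time from dispatch to return
    error_rate: float            # fraction of recent results that errored out
    effective_gflops: float      # raw speed * fraction of time allowed to crunch

def is_reliable(h: HostStats) -> bool:
    # Consulted only when deciding who gets rush (resent/repair) jobs.
    return h.avg_turnaround_days <= 2.0 and h.error_rate <= 0.05

def max_workunit_fpops(h: HostStats, target_days: float = 1.0) -> float:
    # Consulted when deciding how big a workunit to send; turnaround history
    # and queue depth play no role here.
    return h.effective_gflops * 1e9 * 86400 * target_days

host = HostStats(avg_turnaround_days=9.5, error_rate=0.01, effective_gflops=1.2)
print(is_reliable(host))          # False: slow turnaround, e.g. a deep cache
print(max_workunit_fpops(host))   # still offered full-size work for its speed
```

In this sketch a host with a deep cache can lose the "reliable" flag (slow turnaround) and so miss out on rush repair jobs, yet still be offered full-size work for its speed, which is the distinction being drawn above.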