| World Community Grid Forums
Thread Status: Active Total posts in this thread: 22
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
There's only something like 82 workunits available in the first go-around, so not everybody's going to get an AC@H. A larger number may show up later, but most of y'all may need something else to work on while you are waiting. Is there a way of eliminating those reporting errors from the pool of devices capable of downloading AC@H WUs? I have completed my first one, and I noticed that some are reporting errors already. Since there are more crunchers than WUs for this project, it would make sense to use only those actually reporting without errors. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
There's only something like 82 workunits available in the first go-around, so not everybody's going to get an AC@H. A larger number may show up later, but most of y'all may need something else to work on while you are waiting. Is there a way of eliminating those reporting errors from the pool of devices capable of downloading AC@H WUs? I have completed my first one, and I noticed that some are reporting errors already. Since there are more crunchers than WUs for this project, it would make sense to use only those actually reporting without errors.

Workunit Status
Project Name: AfricanClimate@Home
Created: 09/02/2007 01:21:07
Name: ach1_1_4
Minimum Quorum: 10
Initial Replication: 10

The large number of copies sent out for this workunit is due to the unique nature of this project. We encourage you to read the FAQs about this project for more information.

Result Name | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit
ach1_1_4_11 | In Progress | 09/02/2007 12:39:03 | 09/03/2007 06:39:03 | 0.00 | 0.0 / 0.0
ach1_1_4_10 | In Progress | 09/02/2007 01:57:29 | 09/03/2007 09:11:06 | 0.00 | 0.0 / 0.0
ach1_1_4_9 | In Progress | 09/02/2007 01:44:16 | 09/07/2007 01:44:16 | 0.00 | 0.0 / 0.0
ach1_1_4_6 | Error | 09/02/2007 01:42:41 | 09/02/2007 12:35:49 | 7.57 | 107.9 / 0.0
ach1_1_4_2 | In Progress | 09/02/2007 01:38:27 | 09/07/2007 01:38:27 | 0.00 | 0.0 / 0.0
ach1_1_4_1 | Pending Validation | 09/02/2007 01:38:11 | 09/02/2007 09:56:55 | 7.98 | 114.6 / 0.0
ach1_1_4_0 | Pending Validation | 09/02/2007 01:36:36 | 09/02/2007 13:18:56 | 7.23 | 85.7 / 0.0
ach1_1_4_7 | Error | 09/02/2007 01:34:38 | 09/02/2007 01:47:31 | 0.00 | 0.0 / 0.0
ach1_1_4_4 | Pending Validation | 09/02/2007 01:24:50 | 09/02/2007 14:58:04 | 5.21 | 100.4 / 0.0
ach1_1_4_8 | In Progress | 09/02/2007 01:24:39 | 09/07/2007 01:24:39 | 0.00 | 0.0 / 0.0
ach1_1_4_3 | In Progress | 09/02/2007 01:23:23 | 09/07/2007 01:23:23 | 0.00 | 0.0 / 0.0
ach1_1_4_5 | In Progress | 09/02/2007 01:21:58 | 09/07/2007 01:21:58 | 0.00 | 0.0 / 0.0 |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Many of the reported errors are download problems, not client problems. Another thread discusses this: it is a configuration issue that has been diagnosed and should be fixed when the next batch of models is generated. In any case, if a model returns an error, it is immediately put back in the pool of available models. [Edit 1 times, last edit by Former Member at Sep 2, 2007 4:21:57 PM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Errors indeed; we are in the same boat. I did ach1_1_4_1. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Why are there so many copies of a work unit sent out for AfricanClimate@Home? Each computer that receives a work unit for AfricanClimate@Home will compute a two week period for the climate model based on the same starting conditions as other computers that receive a copy of the same work unit. The result data for AfricanClimate@Home is very large (greater than 100 MB). Very few computers are able to return a result of this size. Therefore the result file is divided between each computer computing the work unit and each returns a unique section of the result file. Additional information is returned as well to ensure that the section of the result file returned is correct.

Given that only part of the output file is returned, with what I presume is a checksum of the other parts, are they expecting the output files to be bit-reproducible? That's a very tall order if there's a mix of Intel and AMD, not to mention different builds on Linux, Win32 and Mac (the Intel Fortran compiler can use different optimisations on different platforms even if identical optimisation settings are selected). |
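To make the scheme described in the FAQ a bit more concrete, here is a minimal sketch of how a large result could be divided among the hosts of one work unit, with each host returning its own section plus digests of every section. The equal-size chunking, the use of SHA-256, and all function names are assumptions for illustration only; nothing in this thread says how WCG actually splits or validates the file.

```python
import hashlib
from typing import List, Optional


def split_sections(data: bytes, n_hosts: int) -> List[bytes]:
    """Split data into n_hosts contiguous sections of near-equal size."""
    size = -(-len(data) // n_hosts)  # ceiling division
    return [data[i * size:(i + 1) * size] for i in range(n_hosts)]


def host_report(data: bytes, host_index: int, n_hosts: int) -> dict:
    """What one host would upload: its section plus digests of all sections."""
    sections = split_sections(data, n_hosts)
    return {
        "index": host_index,
        "section": sections[host_index],
        "digests": [hashlib.sha256(s).hexdigest() for s in sections],
    }


def reassemble(reports: List[dict]) -> Optional[bytes]:
    """Rebuild the full result, or return None if the hosts do not agree."""
    # Every host must report the same digest for every section.
    if len({tuple(r["digests"]) for r in reports}) != 1:
        return None
    ordered = sorted(reports, key=lambda r: r["index"])
    # Each uploaded section must match the agreed digest for its position.
    for r in ordered:
        if hashlib.sha256(r["section"]).hexdigest() != r["digests"][r["index"]]:
            return None
    return b"".join(r["section"] for r in ordered)
```

Under these assumptions a single differing byte on one host is enough to break agreement, which is why bit-reproducibility across platforms matters for a scheme like this.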
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Learn all about homogeneous redundancy.... WCG are well aware of these issues.
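For anyone unfamiliar with the term: homogeneous redundancy in BOINC means that all copies of a work unit go to hosts in the same numerical-equivalence class, so their results can be compared directly. A rough sketch of the idea follows. The class key used here (OS plus CPU vendor) and the host records are hypothetical; the real classes are defined by the BOINC server and the project, not by this code.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def hr_class(host: Dict[str, str]) -> Tuple[str, str]:
    """Hypothetical equivalence class: same OS and same CPU vendor."""
    return (host["os"], host["cpu_vendor"])


def assign_workunit(hosts: List[Dict[str, str]], copies: int) -> List[Dict[str, str]]:
    """Pick `copies` hosts that share one class; return [] if no class is large enough."""
    by_class = defaultdict(list)
    for host in hosts:
        by_class[hr_class(host)].append(host)
    for members in by_class.values():
        if len(members) >= copies:
            return members[:copies]
    return []
```

With a replication of 10, this only works if at least 10 hosts of one class are available, which is one reason a small first batch might not reach everybody.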
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Out of 17 results returned, it hasn't found a quorum of 10 which match yet? (Admittedly this is the Beta rather than the Live version, so I guess it might not be unexpected).

Result Name | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit
BETA_ach1_1_39_61 | Inconclusive | 09/02/2007 16:11:46 | 09/02/2007 22:00:05 | 5.78 | 101.8 / 0.0
BETA_ach1_1_39_59 | Inconclusive | 09/02/2007 09:46:45 | 09/02/2007 15:49:26 | 5.92 | 97.9 / 0.0
BETA_ach1_1_39_51 | No Reply | 09/02/2007 09:32:48 | 09/02/2007 14:56:48 | 0.00 | 0.0 / 0.0
BETA_ach1_1_39_49 | Inconclusive | 09/02/2007 03:08:57 | 09/02/2007 09:25:37 | 6.09 | 92.5 / 0.0
BETA_ach1_1_39_39 | Inconclusive | 09/01/2007 12:57:25 | 09/02/2007 11:51:41 | 7.44 | 132.1 / 0.0
BETA_ach1_1_39_29 | Error | 09/01/2007 09:38:20 | 09/01/2007 10:17:51 | 0.00 | 0.0 / 0.0
BETA_ach1_1_39_21 | Error | 09/01/2007 08:10:43 | 09/02/2007 07:47:50 | 11.52 | 81.1 / 0.0
BETA_ach1_1_39_17 | Inconclusive | 09/01/2007 01:20:34 | 09/02/2007 07:22:16 | 5.54 | 67.6 / 0.0
BETA_ach1_1_39_16 | Inconclusive | 08/31/2007 21:48:55 | 09/01/2007 03:59:01 | 6.01 | 64.2 / 0.0
BETA_ach1_1_39_15 | No Reply | 08/31/2007 19:42:19 | 09/01/2007 01:06:19 | 0.00 | 0.0 / 0.0
BETA_ach1_1_39_14 | Inconclusive | 08/31/2007 16:22:26 | 09/01/2007 07:54:35 | 9.06 | 100.3 / 0.0
BETA_ach1_1_39_13 | Error | 08/31/2007 15:43:38 | 08/31/2007 16:18:29 | 0.00 | 0.0 / 0.0
BETA_ach1_1_39_12 | Inconclusive | 08/31/2007 13:42:49 | 09/01/2007 11:33:14 | 21.01 | 102.0 / 0.0
BETA_ach1_1_39_6 | Inconclusive | 08/31/2007 12:04:41 | 09/01/2007 00:20:43 | 9.45 | 91.8 / 0.0
BETA_ach1_1_39_4 | Inconclusive | 08/31/2007 10:24:27 | 08/31/2007 22:27:34 | 7.84 | 64.7 / 0.0
BETA_ach1_1_39_3 | Inconclusive | 08/31/2007 09:47:00 | 08/31/2007 21:47:06 | 6.42 | 93.9 / 0.0
BETA_ach1_1_39_1 | Error | 08/31/2007 09:42:34 | 08/31/2007 15:36:50 | 0.00 | 0.0 / 0.0
BETA_ach1_1_39_11 | Inconclusive | 08/31/2007 08:17:02 | 09/02/2007 02:03:29 | 6.85 | 79.3 / 0.0
BETA_ach1_1_39_0 | Inconclusive | 08/31/2007 06:43:45 | 08/31/2007 16:05:41 | 8.11 | 85.9 / 0.0
BETA_ach1_1_39_10 | No Reply | 08/31/2007 01:55:41 | 08/31/2007 07:19:41 | 0.00 | 0.0 / 0.0
BETA_ach1_1_39_7 | Inconclusive | 08/31/2007 00:48:36 | 08/31/2007 08:28:52 | 7.40 | 97.1 / 0.0
BETA_ach1_1_39_9 | Error | 08/30/2007 23:39:01 | 08/31/2007 00:03:20 | 0.23 | 1.3 / 0.0
BETA_ach1_1_39_2 | Inconclusive | 08/30/2007 21:06:02 | 08/31/2007 05:28:26 | 7.03 | 138.7 / 0.0
BETA_ach1_1_39_5 | Inconclusive | 08/30/2007 20:10:10 | 08/31/2007 04:23:24 | 6.90 | 83.7 / 0.0
BETA_ach1_1_39_8 | Inconclusive | 08/30/2007 19:48:55 | 08/31/2007 12:32:55 | 9.23 | 123.8 / 0.0

--- Edit: On a different work unit it's now showing 'too late' status on the model I returned, despite having returned it well within the deadline. Originally it was showing 'Inconclusive' as in the above example.

Result Name | Status | Sent Time | Time Due / Return Time | CPU Time (hours) | Claimed / Granted BOINC Credit
BETA_ach1_1_38_79 | In Progress | 09/03/2007 05:59:24 | 09/03/2007 11:23:24 | 0.00 | 0.0 / 0.0
BETA_ach1_1_38_72 | In Progress | 09/03/2007 04:18:13 | 09/03/2007 09:42:13 | 0.00 | 0.0 / 0.0
BETA_ach1_1_38_84 | In Progress | 09/03/2007 03:49:32 | 09/04/2007 13:34:35 | 0.00 | 0.0 / 0.0
BETA_ach1_1_38_70 | No Reply | 09/03/2007 01:49:52 | 09/03/2007 07:13:52 | 0.00 | 0.0 / 0.0
BETA_ach1_1_38_66 | No Reply | 09/02/2007 23:45:56 | 09/03/2007 05:09:56 | 0.00 | 0.0 / 0.0
BETA_ach1_1_38_67 | No Reply | 09/02/2007 22:36:45 | 09/03/2007 04:00:45 | 0.00 | 0.0 / 0.0
BETA_ach1_1_38_68 | Too Late | 09/02/2007 21:55:58 | 09/03/2007 07:50:13 | 9.47 | 70.0 / 0.0
BETA_ach1_1_38_65 | No Reply | 09/02/2007 19:44:09 | 09/03/2007 01:08:09 | 0.00 | 0.0 / 0.0
BETA_ach1_1_38_69 | Error | 09/02/2007 19:00:30 | 09/02/2007 19:03:29 | 0.00 | 0.0 / 0.0
BETA_ach1_1_38_64 | Too Late | 09/02/2007 18:52:46 | 09/03/2007 00:58:45 | 6.03 | 64.6 / 0.0
BETA_ach1_1_38_63 | Too Late | 09/02/2007 18:45:01 | 09/03/2007 07:09:11 | 9.20 | 84.6 / 0.0
BETA_ach1_1_38_61 | Too Late | 09/02/2007 18:39:56 | 09/03/2007 03:17:03 | 8.13 | 77.5 / 0.0
BETA_ach1_1_38_60 | Too Late | 09/02/2007 18:31:17 | 09/03/2007 00:19:48 | 5.78 | 103.7 / 0.0
BETA_ach1_1_38_52 | Too Late | 09/02/2007 17:31:44 | 09/03/2007 04:54:30 | 6.71 | 66.8 / 0.0
BETA_ach1_1_38_51 | Too Late | 09/02/2007 11:51:25 | 09/02/2007 21:04:52 | 8.74 | 78.8 / 0.0
BETA_ach1_1_38_48 | No Reply | 09/02/2007 11:46:32 | 09/02/2007 17:10:32 | 0.00 | 0.0 / 0.0
BETA_ach1_1_38_38 | Too Late | 09/02/2007 04:45:51 | 09/02/2007 11:26:05 | 6.54 | 83.5 / 0.0
BETA_ach1_1_38_41 | Too Late | 09/02/2007 04:32:17 | 09/02/2007 11:27:02 | 6.75 | 81.8 / 0.0
BETA_ach1_1_38_33 | Too Late | 09/02/2007 04:27:00 | 09/02/2007 18:10:06 | 11.71 | 74.3 / 0.0
BETA_ach1_1_38_28 | Too Late | 09/01/2007 12:02:18 | 09/02/2007 02:25:46 | 5.92 | 97.8 / 0.0
BETA_ach1_1_38_21 | Too Late | 09/01/2007 11:51:30 | 09/02/2007 03:14:08 | 5.99 | 90.9 / 0.0
BETA_ach1_1_38_7 | Too Late | 08/31/2007 11:43:11 | 09/01/2007 11:26:07 | 13.74 | 122.6 / 0.0
BETA_ach1_1_38_6 | Too Late | 08/31/2007 11:34:46 | 08/31/2007 23:35:13 | 10.07 | 71.6 / 0.0
BETA_ach1_1_38_3 | Too Late | 08/31/2007 11:09:08 | 09/02/2007 07:22:16 | 5.54 | 67.7 / 0.0
BETA_ach1_1_38_5 | Too Late | 08/31/2007 09:47:00 | 08/31/2007 21:47:06 | 6.42 | 94.0 / 0.0

16 replications returned so far, and more in progress. I'd be happier if the replication was part of an ensemble rather than part of a quorum. Ensembles are used widely in the climate modelling world to provide probability estimates of different outcomes.

I'm already familiar with homogeneous redundancy, thanks, but I can't think of any other reasons for it failing to reach a quorum with so many results returned. Beta issue? [Edit 2 times, last edit by Former Member at Sep 3, 2007 8:15:18 AM] |
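The difference between a quorum and an ensemble can be sketched in a few lines. This is only an illustration of the two ideas: the function names are made up, the inputs in the usage note are invented placeholders rather than project data, and the real WCG validator may compare results quite differently.

```python
from collections import Counter
from typing import List, Sequence


def quorum_reached(result_digests: Sequence[str], min_quorum: int) -> bool:
    """A quorum needs at least min_quorum results that are identical to each other."""
    if not result_digests:
        return False
    most_common_count = Counter(result_digests).most_common(1)[0][1]
    return most_common_count >= min_quorum


def ensemble_probability(values: List[float], threshold: float) -> float:
    """An ensemble treats every run as a sample: fraction of runs above threshold."""
    return sum(v > threshold for v in values) / len(values)
```

For example, 17 returned results that all differ slightly from one another give `quorum_reached(digests, 10) == False` no matter how many more arrive, whereas an ensemble would happily use all 17 runs as samples. That would be one possible reason for a pile of 'Inconclusive' results, though the thread does not confirm it.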
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Out of 17 results returned, it hasn't found a quorum of 10 which match yet? (Admittedly this is the Beta rather than the Live version, so I guess it might not be unexpected). --- Edit: On a different work unit it's now showing 'too late' status on the model I returned, despite having returned it well within the deadline. Originally it was showing 'Inconclusive' as in the above example. 16 replications returned so far, and more in progress. I'd be happier if the replication was part of an ensemble rather than part of a quorum. Ensembles are used widely in the climate modelling world to provide probability estimates of different outcomes.

The workunits my host crunched encountered the same situation. I think the quorum configuration must be wrong. Yet there are workunits like this:

Project Name: Beta Testing 2
Created: 08/30/2007 19:07:23
Name: BETA_ach1_1_33
Minimum Quorum: 10
Initial Replication: 15

BETA_ach1_1_33_59 | Valid | 09/02/2007 22:37:15 | 09/03/2007 04:50:42 | 6.02 | 63.3 / 87.1
BETA_ach1_1_33_49 | Error | 09/02/2007 20:44:52 | 09/02/2007 20:52:53 | 0.00 | 0.0 / 0.0
BETA_ach1_1_33_40 | Invalid | 09/02/2007 15:44:16 | 09/02/2007 21:45:40 | 5.92 | 97.9 / 43.6
BETA_ach1_1_33_39 | Invalid | 09/02/2007 10:16:32 | 09/02/2007 20:39:08 | 9.99 | 70.9 / 43.6
BETA_ach1_1_33_29 | Error | 09/02/2007 06:49:43 | 09/02/2007 08:06:45 | 1.01 | 14.6 / 0.0
BETA_ach1_1_33_20 | Invalid | 09/02/2007 03:41:04 | 09/02/2007 09:59:21 | 6.10 | 67.8 / 43.6
BETA_ach1_1_33_19 | Invalid | 09/01/2007 04:44:36 | 09/02/2007 05:55:47 | 7.86 | 63.7 / 43.6
BETA_ach1_1_33_5 | Valid | 08/31/2007 03:40:23 | 08/31/2007 09:36:05 | 4.83 | 101.7 / 87.1
BETA_ach1_1_33_4 | Valid | 08/31/2007 03:20:11 | 08/31/2007 15:20:14 | 6.43 | 94.8 / 87.1
BETA_ach1_1_33_3 | Valid | 08/31/2007 03:04:38 | 09/01/2007 04:34:28 | 24.29 | 148.8 / 87.1
BETA_ach1_1_33_2 | Valid | 08/31/2007 02:59:24 | 08/31/2007 16:49:19 | 8.04 | 68.6 / 87.1
BETA_ach1_1_33_7 | Valid | 08/31/2007 02:50:44 | 08/31/2007 15:52:41 | 12.71 | 74.6 / 87.1
BETA_ach1_1_33_1 | Valid | 08/31/2007 02:45:31 | 08/31/2007 11:52:05 | 6.04 | 92.0 / 87.1
BETA_ach1_1_33_8 | Valid | 08/31/2007 02:36:30 | 08/31/2007 11:30:46 | 6.45 | 81.0 / 87.1
BETA_ach1_1_33_0 | Valid | 08/30/2007 23:54:02 | 08/31/2007 07:49:51 | 7.42 | 97.4 / 87.1
BETA_ach1_1_33_10 | Invalid | 08/30/2007 22:36:22 | 08/31/2007 10:44:09 | 6.03 | 65.4 / 43.6
BETA_ach1_1_33_9 | Error | 08/30/2007 22:20:12 | 08/30/2007 22:35:55 | 0.23 | 1.3 / 0.0
BETA_ach1_1_33_6 | Valid | 08/30/2007 19:33:33 | 08/31/2007 10:52:16 | 9.29 | 127.4 / 87.1

Some results are validated and others are not, but 10 valid tasks were returned, enough to form a quorum. Looking at the claimed and granted credits, credit is given not only to valid results but also to invalid ones, although the amount is cut in half. The 2nd from the top is mine; I aborted it because I found the deadline was too early for the workunit to be completed, and the 2nd from the bottom is probably the same. I'm not sure why the tasks aren't marked inconclusive or too late. What is the policy for validating and granting credit? thanks, suguruhirahara |
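The counts in the post above are easy to check by tallying the BETA_ach1_1_33 rows. The status and credit values below are copied from that table; the variable names are just for this sketch, and the script only summarises what was observed, it says nothing about how WCG actually computes granted credit.

```python
from collections import Counter

# (status, claimed credit, granted credit) for BETA_ach1_1_33, in table order
RESULTS = [
    ("Valid", 63.3, 87.1), ("Error", 0.0, 0.0), ("Invalid", 97.9, 43.6),
    ("Invalid", 70.9, 43.6), ("Error", 14.6, 0.0), ("Invalid", 67.8, 43.6),
    ("Invalid", 63.7, 43.6), ("Valid", 101.7, 87.1), ("Valid", 94.8, 87.1),
    ("Valid", 148.8, 87.1), ("Valid", 68.6, 87.1), ("Valid", 74.6, 87.1),
    ("Valid", 92.0, 87.1), ("Valid", 81.0, 87.1), ("Valid", 97.4, 87.1),
    ("Invalid", 65.4, 43.6), ("Error", 1.3, 0.0), ("Valid", 127.4, 87.1),
]

# How many results ended up in each status
counts = Counter(status for status, _, _ in RESULTS)

# Distinct granted-credit values seen for each status
granted = {
    status: sorted({g for s, _, g in RESULTS if s == status})
    for status in counts
}

# Ratio of the credit granted to invalid results versus valid ones
ratio = granted["Invalid"][0] / granted["Valid"][0]

print(dict(counts))     # 10 Valid, 5 Invalid, 3 Error
print(granted)
print(round(ratio, 2))  # 0.5
```

So the table does contain exactly 10 valid results (the minimum quorum), and every invalid result was granted half of the valid amount, to within rounding.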
||
|
|
rebirther
Cruncher Germany Joined: Nov 19, 2005 Post Count: 29 Status: Offline |
Finished my first WU. Here are my comments about the issues I have seen:

stderr.txt:
Failed to get VersionInfo size: 1812
World Community Grid AutoDock (projects/www.worldcommunitygrid.org/wcg_acah_wrf_5.09_windows_intelx86) version
INFO: No state to restore. Start from the beginning.
ERROR: Restoring checkpoint failed. Unable to restore state!
Start_year/Start_Month/Start_Day::Start_Hour:Start_Minute:Start_Second
Restart2002/12/18::0:0:0 0

At the start of the WU I can display the graphics. After some switches to other projects, the window took a while to load and then a black screen was displayed. After uploading files, BOINC (5.8.15) hangs for a little while. In the slot folder there are still 230 MB that were not deleted. |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Hi,

Certain warning/error messages are always there, like the "VersionInfo size: 1812", regardless of the WCG project. Some will always be there, never changed to show as a warning notice, simply left by the programmers as a reminder. Can't say if they will be removed (made non-printing) in a future revision.

As for the graphics, switching away and returning to a graphics screen can cause it to show only a black frame. Just give it some time to let your graphics card catch up. Some members have reported it taking several minutes, e.g. for the HPF2 project. I experimented last week and changed the video card memory allocation from 64 to 128 to 256 MB. At that high setting the card is able to 'remember' more screens, and afterwards overall system performance was markedly better.

Are you saying that the result has been successfully uploaded and no record shows in the Tasks tab of BOINC Manager, like "Ready To Report"? BOINC should clean up the slot directories (slot0, slot1, etc.) after a job is finished, so they can be used again for the next job. 5.10.x, which should be approved by WCG in the near future, has better housekeeping where slot cleaning is concerned.
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
|