| World Community Grid Forums
Thread Status: Active Total posts in this thread: 118
vepaul
Senior Cruncher Belgium Joined: Nov 17, 2004 Post Count: 261 Status: Offline Project Badges:
|
No, it was on the same machine, as far as I can see.
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
That would be an absolute first in a quorum 2 or greater distribution, and that after over 2 billion results at WCG. The Result Status page, without drilling into the detailed distribution, shows the different device names, as you had both _0 and _1 copies. Since you can have multiple devices with the same name, WCG does not care; you would have to hover the mouse over the device names to see the actual unique device ID.
|
||
|
|
branjo
Master Cruncher Slovakia Joined: Jun 29, 2012 Post Count: 1892 Status: Offline Project Badges:
|
Got 2 on my i5-2500S Mac OS X 10.9.4 7.0.65, one already errored out:
Result Name: BETA_E225106_60_S.324.C38H26N10.YFHQHCVYIOUKJP-UHFFFAOYSA-N.14_s1_14_3--
ETA1: also the second one:
Result Name: BETA_E225106_259_S.326.C41H29N5S1.TZUBUZMEZPCNHP-UHFFFAOYSA-N.1_s1_14_9--
3rd one is running, but I am not holding my breath.
----------------------------------------
Crunching@Home since January 13 2000. Shrubbing@Home since January 5 2006
[Edit 1 times, last edit by branjo at Aug 17, 2014 6:38:05 PM] |
||
|
|
vepaul
Senior Cruncher Belgium Joined: Nov 17, 2004 Post Count: 261 Status: Offline Project Badges:
|
"That would be an absolute first in a quorum 2 or greater distribution, and that after over 2 billion results at WCG. The Result Status page, without drilling into the detailed distribution, shows the different device names, as you had both _0 and _1 copies. Since you can have multiple devices with the same name, WCG does not care; you would have to hover the mouse over the device names to see the actual unique device ID."
I have only 2 devices working, and only one uses GPU ... Why the other one does not run CEP2 is a mystery: both run on Windows 7. |
||
|
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline Project Badges:
|
We have added another 1000 workunits to the mix. These should fix the errors due to max memory being used.
Thanks, -Uplinger |
||
|
|
branjo
Master Cruncher Slovakia Joined: Jun 29, 2012 Post Count: 1892 Status: Offline Project Badges:
|
Got 16 of them on my Mac only - Win rig and Linux clouds went dry.
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Three have just exited with lines like this in the Event Log:
BETA_E225108_59_S.328.C41H25N7O1.JFONBYKDRGSVKP-UHFFFAOYSA-N.1_s1_14_0 exited with zero status but no 'finished' file. |
||
|
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline Project Badges:
|
Tony, what you are seeing should be results that are exiting due to not converging. These should be marked as valid going forward as the data is still useful for the researchers. You should see them go into pending validation state.
Thanks, -Uplinger |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Thanks, Keith. Things are stranger than I first thought. This is on an i5-750, 4 cores, Win7, 16GB DDR3. All 4 cores commenced Beta CEP2 units within seconds of each other (risky, I know, and possibly the cause of some of these outcomes):
18/08/2014 18:45:35 | World Community Grid | Starting task BETA_E225108_530_S.328.C44H29N5.ONXAAYOVMMZRSK-UHFFFAOYSA-N.14_s1_14_1
18/08/2014 18:45:36 | World Community Grid | Starting task BETA_E225108_588_S.328.C42H26N6O1.JXTUBXMYSMVBOD-UHFFFAOYSA-N.15_s1_14_0
18/08/2014 18:45:36 | World Community Grid | Starting task BETA_E225108_655_S.328.C42H26N6O1.BXDZVSAWISJAKK-UHFFFAOYSA-N.4_s1_14_0
18/08/2014 18:45:39 | World Community Grid | Starting task BETA_E225108_587_S.328.C42H26N6O1.JXTUBXMYSMVBOD-UHFFFAOYSA-N.14_s1_14_0
Then 2 finished somewhat early, with RC = 0x1 still in Job#0, and 2 others commenced:
18/08/2014 20:05:12 | World Community Grid | Computation for task BETA_E225108_588_S.328.C42H26N6O1.JXTUBXMYSMVBOD-UHFFFAOYSA-N.15_s1_14_0 finished
18/08/2014 20:05:29 | World Community Grid | Computation for task BETA_E225108_655_S.328.C42H26N6O1.BXDZVSAWISJAKK-UHFFFAOYSA-N.4_s1_14_0 finished
... uploads ...
18/08/2014 20:05:56 | World Community Grid | Starting task BETA_E225108_59_S.328.C41H25N7O1.JFONBYKDRGSVKP-UHFFFAOYSA-N.1_s1_14_0
18/08/2014 20:05:56 | World Community Grid | Starting task BETA_E225108_597_S.328.C42H26N6O1.SLVWIZBQTZWXHB-UHFFFAOYSA-N.4_s1_14_0
... uploads ...
Shortly afterwards, the 2 new units suffered this, simultaneously with another initial task finishing, again with RC=0x1 in Job#0:
18/08/2014 20:07:49 | World Community Grid | Task BETA_E225108_59_S.328.C41H25N7O1.JFONBYKDRGSVKP-UHFFFAOYSA-N.1_s1_14_0 exited with zero status but no 'finished' file
18/08/2014 20:07:49 | World Community Grid | If this happens repeatedly you may need to reset the project.
18/08/2014 20:07:49 | World Community Grid | Task BETA_E225108_597_S.328.C42H26N6O1.SLVWIZBQTZWXHB-UHFFFAOYSA-N.4_s1_14_0 exited with zero status but no 'finished' file
18/08/2014 20:07:49 | World Community Grid | If this happens repeatedly you may need to reset the project.
18/08/2014 20:07:49 | World Community Grid | Computation for task BETA_E225108_587_S.328.C42H26N6O1.JXTUBXMYSMVBOD-UHFFFAOYSA-N.14_s1_14_0 finished
A minute later, the last of the initial units suffered this, simultaneously with another unit starting:
18/08/2014 20:08:46 | World Community Grid | Task BETA_E225108_530_S.328.C44H29N5.ONXAAYOVMMZRSK-UHFFFAOYSA-N.14_s1_14_1 exited with zero status but no 'finished' file
18/08/2014 20:08:46 | World Community Grid | If this happens repeatedly you may need to reset the project.
18/08/2014 20:08:46 | World Community Grid | Starting task BETA_E225108_842_S.328.C40F3H23N2O3.RKUKECOGVQQHGS-UHFFFAOYSA-N.15_s1_14_0
Results Status shows this:
BETA_E225108_587_S.328.C42H26N6O1.JXTUBXMYSMVBOD-UHFFFAOYSA-N.14_s1_14_0-- Tree3 Pending Validation 18/08/14 17:45:29 18/08/14 19:14:02 1.21 / 1.36 43.8 / 0.0
BETA_E225108_588_S.328.C42H26N6O1.JXTUBXMYSMVBOD-UHFFFAOYSA-N.15_s1_14_0-- Tree3 Pending Validation 18/08/14 17:45:29 18/08/14 19:14:02 1.17 / 1.33 42.8 / 0.0
BETA_E225108_655_S.328.C42H26N6O1.BXDZVSAWISJAKK-UHFFFAOYSA-N.4_s1_14_0-- Tree3 Pending Validation 18/08/14 17:45:29 18/08/14 19:14:02 1.17 / 1.33 42.8 / 0.0
What I find strange is that the system is currently processing _530_, _59_, _597_, _842_; the first 3 of those are the units that "exited with zero status but no 'finished' file", implying that they restarted. Is that normal??
So some of this may just be overloading of my system, but I've elaborated on it in case there's anything of value in reporting this behaviour. |
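For anyone wanting to tally the same kind of Event Log excerpt per task, here is a minimal Python sketch. It assumes the line layout seen in the lines quoted in this post ("date time | project | message") and only the three message shapes that appear there; the function names and the sample text are made up for illustration and are not part of BOINC or WCG.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the quoted lines: "DD/MM/YYYY HH:MM:SS | project | message"
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) \| ([^|]+) \| (.*)$")

# Only the message shapes that appear in the quoted log are recognised.
EVENT_PATTERNS = [
    ("started", re.compile(r"^Starting task (\S+)$")),
    ("finished", re.compile(r"^Computation for task (\S+) finished$")),
    ("no_finish_file",
     re.compile(r"^Task (\S+) exited with zero status but no 'finished' file\.?$")),
]

# Sample lines copied from the post above.
SAMPLE_LOG = """\
18/08/2014 20:05:56 | World Community Grid | Starting task BETA_E225108_59_S.328.C41H25N7O1.JFONBYKDRGSVKP-UHFFFAOYSA-N.1_s1_14_0
18/08/2014 20:07:49 | World Community Grid | Task BETA_E225108_59_S.328.C41H25N7O1.JFONBYKDRGSVKP-UHFFFAOYSA-N.1_s1_14_0 exited with zero status but no 'finished' file
18/08/2014 20:07:49 | World Community Grid | If this happens repeatedly you may need to reset the project.
18/08/2014 20:07:49 | World Community Grid | Computation for task BETA_E225108_587_S.328.C42H26N6O1.JXTUBXMYSMVBOD-UHFFFAOYSA-N.14_s1_14_0 finished
"""


def summarize(log_text):
    """Return {task_name: [(timestamp, event_label), ...]} in log order."""
    events = defaultdict(list)
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # not an Event Log line in the assumed layout
        stamp, _project, message = match.groups()
        for label, pattern in EVENT_PATTERNS:
            hit = pattern.match(message.strip())
            if hit:
                events[hit.group(1)].append((stamp, label))
                break
    return dict(events)


def tasks_without_finish_file(events):
    """Names of tasks that logged the 'no finished file' message at least once."""
    return sorted(
        name for name, evs in events.items()
        if any(label == "no_finish_file" for _, label in evs)
    )


if __name__ == "__main__":
    for name in tasks_without_finish_file(summarize(SAMPLE_LOG)):
        print(name)
```

Lines that do not match one of the three patterns, such as the "If this happens repeatedly" hint, are simply skipped.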
||
|
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline Project Badges:
|
Tony,
These work units are larger than a lot of the others. They all have over 300 electrons in them, which makes them harder to converge. That is why you are seeing them exit sooner than you would expect. The "exited with zero status but no 'finished' file" message may come from something within the science code that exits the application with status 0, which is success, but does not write the boinc_finish file. This is not critical, just a spurious warning, and it probably has to do with the exit path taken when a job does not converge. Please note that I have the validator turned off right now as we await instructions from the researchers on a valid convergence threshold for validation.
Thanks, -Uplinger |
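As a rough illustration of the situation described above (a process that exits with status 0 but leaves no finish marker behind), here is a toy Python model. It is not BOINC client code: the marker file name `finished.marker` and both function names are hypothetical, chosen only to show why a "success" exit code alone does not tell a supervisor that the work completed.

```python
import os
import subprocess
import sys
import tempfile

MARKER = "finished.marker"  # hypothetical file name, for illustration only


def classify_exit(exit_code, workdir):
    """Classify a child process outcome from its exit code and the marker file."""
    marker_present = os.path.exists(os.path.join(workdir, MARKER))
    if exit_code == 0 and marker_present:
        return "finished"
    if exit_code == 0:
        return "exited with zero status but no finish marker"
    return "failed with status %d" % exit_code


def run_and_classify(script, workdir):
    """Run a small Python snippet as a child process in workdir and classify it."""
    proc = subprocess.run([sys.executable, "-c", script], cwd=workdir)
    return classify_exit(proc.returncode, workdir)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # Child exits cleanly but never writes the marker.
        print(run_and_classify("import sys; sys.exit(0)", workdir))
        # Child writes the marker and then exits cleanly.
        print(run_and_classify(f"open({MARKER!r}, 'w').close()", workdir))
```

The first child corresponds to the early exit path discussed here; the second shows the ordinary case where the marker is written before exiting.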
||
|
|
|