World Community Grid Forums
Category: Beta Testing Forum: Beta Test Support Forum Thread: New Beta Test starting Oct 31, 2013 [Issues Thread] |
Thread Status: Active Total posts in this thread: 211
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
First Beta Valid at 1.75 hours. Second Beta pending validation at 5.53 hours. Third Beta error 131 after 3.87 hours. The 2 earlier replications of this work unit had error 131 at 0.96 and 2.35 hours. I am sure that the 0.96 hour error was much earlier in the work unit than my error.
Now I am working on my 4th work unit. Obviously this will not be the last Beta. |
||
|
yoro42
Ace Cruncher United States Joined: Feb 19, 2011 Post Count: 8976 Status: Offline Project Badges: |
BETA_BETA_9999988_0196_1-- Coltrane Error 10/31/13 07:20:21 11/1/13 03:22:13 3.50 / 4.14 115.1 / 0.0
----------------------------------------
Result Log
Result Name: BETA_BETA_9999988_0196_1--
<core_client_version>7.0.64</core_client_version>
<![CDATA[
<stderr_txt>
Commandline = projects/www.worldcommunitygrid.org/wcgrid_beta17_7.19_windows_x86_64 -SettingsFile BETA_9999988_0196.txt -DatabaseFile dataset-GDS2771-v1.txt
Initializing wcg_learn_limit = 100000 Running
[16:14:48]: Computing pass 0
[16:15:09]: Computing pass 1
[16:15:31]: Computing pass 2
[16:15:51]: Computing pass 3
[16:16:12]: Computing pass 4
[16:16:33]: Computing pass 5
[16:16:55]: Computing pass 6
[16:17:15]: Computing pass 7
[16:17:36]: Computing pass 8
[16:17:58]: Computing pass 9
[16:18:22]: Computing pass 10
[16:18:58]: Computing pass 11
[16:19:19]: Computing pass 12
[16:19:41]: Computing pass 13
. . .
[20:17:31]: Computing pass 610
[20:17:52]: Computing pass 611
[20:18:28]: Computing pass 612
[20:19:20]: Computing pass 613
[20:19:44]: Computing pass 614
[20:20:05]: Computing pass 615
Run complete, CPU time: 12609.046027
20:20:35 (9980): called boinc_finish
</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>BETA_BETA_9999988_0196_1_0</file_name>
<error_code>-131</error_code>
</file_xfer_error>
</message>
]]> |
||
|
KWSN - A Shrubbery
Master Cruncher Joined: Jan 8, 2006 Post Count: 1585 Status: Offline |
Had 9ish download, and I didn't notice and start them until 20ish hours after release. One finished fine; the rest are plugging along with a smooth distribution of run times. So far, so good.
Only one repair so far, a _4 unit, so it will be interesting to see if it completes as scheduled.
----------------------------------------
Distributed computing volunteer since September 27, 2000 |
||
|
RichSavarie
Cruncher Canada Joined: Aug 9, 2005 Post Count: 49 Status: Offline Project Badges: |
I've had one Beta WU for nearly a day, and I just noticed in the log that it has been restarting itself every 10 minutes or so. Each time it restarts, the "estimated completion time" resets to about 10 hours. Absolutely no progress has been made. Should I abort it or just let it run?
----------------------------------------[Edit 1 times, last edit by RichSavarie at Nov 1, 2013 5:35:06 AM] |
||
|
yoro42
Ace Cruncher United States Joined: Feb 19, 2011 Post Count: 8976 Status: Offline Project Badges: |
BETA_BETA_9999986_0187_2-- Gato Error 10/31/13 14:46:21 11/1/13 05:30:33 5.65 / 6.27 87.2 / 0.0
----------------------------------------
Result Log
Result Name: BETA_BETA_9999986_0187_2--
<core_client_version>7.0.64</core_client_version>
<![CDATA[
<stderr_txt>
mputing pass 4128
[20:21:45]: Computing pass 4129
[20:21:49]: Computing pass 4130
[20:21:52]: Computing pass 4131
[20:21:56]: Computing pass 4132
[20:21:59]: Computing pass 4133
[20:22:02]: Computing pass 4134
[20:22:06]: Computing pass 4135
[20:22:09]: Computing pass 4136
. . .
22:26:30]: Computing pass 6063
[22:26:33]: Computing pass 6064
[22:26:37]: Computing pass 6065
[22:26:40]: Computing pass 6066
[22:26:44]: Computing pass 6067
[22:26:47]: Computing pass 6068
[22:26:51]: Computing pass 6069
[22:26:54]: Computing pass 6070
[22:26:58]: Computing pass 6071
[22:27:01]: Computing pass 6072
[22:27:05]: Computing pass 6073
[22:27:08]: Computing pass 6074
[22:27:12]: Computing pass 6075
[22:27:15]: Computing pass 6076
[22:27:19]: Computing pass 6077
[22:27:22]: Computing pass 6078
[22:27:26]: Computing pass 6079
[22:27:30]: Computing pass 6080
Run complete, CPU time: 20338.677175
22:28:25 (5288): called boinc_finish
</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>BETA_BETA_9999986_0187_2_0</file_name>
<error_code>-131</error_code>
</file_xfer_error>
</message>
]]> |
||
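For anyone comparing these result logs by hand, the sketch below shows one way to pull the last pass number, a rough seconds-per-pass figure, and the upload error code out of a pasted log. It is only an illustration: the function name `summarize_log` and the two regular expressions are my own, written against the line shapes visible in the logs quoted in this thread, and it assumes the log crosses midnight at most once.

```python
import re

# Patterns match the line shapes visible in the logs quoted in this thread.
PASS_RE = re.compile(r"\[(\d{2}):(\d{2}):(\d{2})\]: Computing pass (\d+)")
ERROR_RE = re.compile(r"<error_code>(-?\d+)</error_code>")


def summarize_log(text):
    """Return (last_pass, mean_seconds_per_pass, error_code) for a log excerpt."""
    passes = [
        (int(h) * 3600 + int(m) * 60 + int(s), int(n))
        for h, m, s, n in PASS_RE.findall(text)
    ]
    match = ERROR_RE.search(text)
    error_code = int(match.group(1)) if match else None
    if not passes:
        return None, None, error_code
    (t0, n0), (t1, n1) = passes[0], passes[-1]
    if n1 == n0:
        return n1, None, error_code
    elapsed = (t1 - t0) % 86400  # tolerates a single midnight rollover
    return n1, elapsed / (n1 - n0), error_code


# Excerpt assembled from the first and last pass lines of the log for
# result BETA_BETA_9999988_0196_1 quoted earlier in this thread.
sample = (
    "[16:14:48]: Computing pass 0 [16:15:09]: Computing pass 1 "
    ". . . [20:20:05]: Computing pass 615 "
    "<error_code>-131</error_code>"
)
last_pass, mean_seconds, error_code = summarize_log(sample)
print(last_pass, round(mean_seconds, 1), error_code)
```

On that excerpt the mean works out to roughly 24 seconds per pass, which is in line with the 4.14 hours of elapsed time reported for the result. Truncated lines such as a timestamp missing its opening bracket are simply skipped by the pattern.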
|
Mamajuanauk
Master Cruncher United Kingdom Joined: Dec 15, 2012 Post Count: 1900 Status: Offline Project Badges: |
First Beta Valid at 1.75 hours. Second Beta pending validation at 5.53 hours. Third Beta error 131 after 3.87 hours. The 2 earlier replications of this work unit had error 131 at 0.96 and 2.35 hours. I am sure that the 0.96 hour error was much earlier in the work unit than my error.
Clearly the last line from Lawrence is likely to be true...
Now I am working on my 4th work unit. Obviously this will not be the last Beta.
Of the 16 WUs I've received on 2 machines, the following breakdown of results has been returned:
In Progress - 2
Error - 6
PV Jail - 3
Win7/64/FX8350/32Gb
Ubuntu 12.03/64/FX8350/16Gb
The errors appear to be equally balanced across both machines. Other postings define the errors.
Mamajuanauk is the Name! Crunching is the Game!
|
||
|
OldChap
Veteran Cruncher UK Joined: Jun 5, 2009 Post Count: 978 Status: Offline Project Badges: |
3* 131 errors:
----------------------------------------
BETA_BETA_9999985_0580_2--
BETA_BETA_9999986_0789_0--
BETA_BETA_9999985_0196_1--
Expect the same on BETA_BETA_9999984_0769_4- if 3 others had it.
BETA_BETA_9999986_0692 and BETA_9999986_0234 in progress @ 11 plus hours
12* valid |
||
|
centriphugul
Cruncher Joined: Aug 16, 2013 Post Count: 1 Status: Offline Project Badges: |
BETA_BETA_9999986_0601_0, using beta17 version 719, has restarted itself 5 times, and now every time the elapsed time reaches 00:01:14 it reverts to 00:00:01.
----------------------------------------
Edit: it has restarted 6 times as of writing this post. [Edit 1 times, last edit by centriphugul at Nov 1, 2013 8:18:14 AM] |
||
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1316 Status: Offline Project Badges: |
Hi Keith,
I overlooked your request, but meanwhile continued my report about that task here -> http://www.worldcommunitygrid.org/forums/wcg/viewpostinthread?post=436360
Cheers, CP

Thought you wrote earlier that you'd manipulated the bound values, then restarted the task to see how far it would go... 10x?

Good morning Rob,
Because the wingman's error was "Maximum disk usage exceeded", I increased the rsc_disk_bound by a factor of 10. Before the suspend the checkpointo file was ~1.6GB and after the restart ~1.3GB. Memory usage before the suspend on the left and on the right the drop after the suspend: |
||
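The post above does not say how the rsc_disk_bound was changed. As a rough illustration only, the sketch below assumes the value sits in a `<workunit>` entry of the BOINC client state XML and multiplies it for one named work unit. The helper name `scale_disk_bound`, the work unit name and the byte value in the fragment are all made up for the example, and hand-editing client state is not an official procedure: the client would need to be stopped first, and a backup kept.

```python
import xml.etree.ElementTree as ET


def scale_disk_bound(state_xml, workunit_name, factor):
    """Multiply <rsc_disk_bound> of the named workunit; return (new_xml, changed)."""
    root = ET.fromstring(state_xml)
    changed = 0
    for wu in root.iter("workunit"):
        bound = wu.find("rsc_disk_bound")
        if wu.findtext("name") == workunit_name and bound is not None:
            bound.text = f"{float(bound.text) * factor:.6f}"
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


# Hypothetical fragment: the workunit name and the byte value are made up.
fragment = """<client_state>
  <workunit>
    <name>EXAMPLE_WU_0001</name>
    <rsc_disk_bound>100000000.000000</rsc_disk_bound>
  </workunit>
</client_state>"""

new_xml, changed = scale_disk_bound(fragment, "EXAMPLE_WU_0001", 10)
new_bound = float(ET.fromstring(new_xml).find("workunit/rsc_disk_bound").text)
print(changed, new_bound)
```

Raising the bound only stops the client from aborting the task for disk usage; it does nothing about the separate server-side limit on the size of the uploaded result file.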
|
Jim1348
Veteran Cruncher USA Joined: Jul 13, 2009 Post Count: 1066 Status: Offline Project Badges: |
Unfortunately there is nothing you can do to fix this. This is a setting from the server. We have limited the result files to 10MB, but as some have reported, the result file has grown to 100MB in some cases. The researchers have a list of results that have this issue and will be looking into getting this file size down.
Thanks,
-Uplinger

I don't know if this limit is for the benefit of the users, or WCG. But I have a fast (2 Mbps) upload speed, and could easily do the 100 MB if that will help the science. You could make it user-selectable, like the number of tasks downloaded for CEP2. |
||
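To put numbers on the upload question, here is a back-of-the-envelope sketch of the transfer time for the two file sizes mentioned above at a 2 Mbps uplink. It assumes 1 MB = 8 megabits and ignores protocol overhead, retries and anything else sharing the line, so real uploads would take somewhat longer. The function name `upload_seconds` is just for illustration.

```python
def upload_seconds(size_mb, link_mbps):
    """Idealised transfer time: megabytes * 8 bits per byte / megabits per second."""
    return size_mb * 8 / link_mbps


for size_mb in (10, 100):
    seconds = upload_seconds(size_mb, 2)
    print(f"{size_mb} MB at 2 Mbps: {seconds:.0f} s (~{seconds / 60:.1f} min)")
```

Under those assumptions the 10 MB limit corresponds to about 40 seconds per result and a 100 MB file to about 400 seconds, a little under 7 minutes.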
|
|