World Community Grid Forums
Category: Beta Testing Forum: Beta Test Support Forum Thread: New Beta Test starting Oct 31, 2013 [Issues Thread] |
Thread Status: Active Total posts in this thread: 211
ca05065
Senior Cruncher Joined: Dec 4, 2007 Post Count: 325 Status: Offline Project Badges: |
I had two beta work units and they were very different. The one that worked had only 13 computer passes. The one which failed with the error code 131 had over 5000 when I looked at stderr in the slots directory but only reported 1981 computer passes in results status file.
|
||
|
Dark Angel
Veteran Cruncher Australia Joined: Nov 11, 2005 Post Count: 721 Status: Offline Project Badges: |
Completed with error.
[QUOTE]
Run complete, CPU time: 22246.801971
00:04:30 (19197): called boinc_finish
</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>BETA_BETA_9999986_0565_1_0</file_name>
<error_code>-131</error_code>
</file_xfer_error>
</message>
[/QUOTE]
----------------------------------------
Currently being moderated under false pretences |
||
|
ski1939
Senior Cruncher Freeport, Illinois, USA Joined: Nov 17, 2004 Post Count: 209 Status: Offline Project Badges: |
run complete then xfer file error code 131
(e.g. results file size 45.22mb exceeds 10mb limit)
26-10-2013 13:06 Starting BOINC client version 6.10.58 for windows_x86_64
26-10-2013 13:07 Processor: 2 GenuineIntel Pentium(R) Dual-Core CPU T4300 @ 2.10GHz [Family 6 Model 23 Stepping 10]
26-10-2013 13:07 Processor: 1.00 MB cache
26-10-2013 13:07 Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 nx lm tm2 pbe
26-10-2013 13:07 OS: Microsoft Windows 7: Home Premium x64 Edition, Service Pack 1, (06.01.7601.00)
26-10-2013 13:07 Memory: 2.96 GB physical, 5.92 GB virtual
26-10-2013 13:07 Disk: 465.66 GB total, 133.56 GB free
- - - - - - - - - - - - - - - - - -
31-10-2013 14:57 Computation for task BETA_BETA_9999988_0674_1 finished
31-10-2013 14:57 Output file BETA_BETA_9999988_0674_1_0 for task BETA_BETA_9999988_0674_1 exceeds size limit.
31-10-2013 14:57 File size: 47425541.000000 bytes. Limit: 10485760.000000 bytes
- - - - - - - - - - - - - - - - - -
[14:54:57]: Computing pass 7822
Run complete, CPU time: 18133.072637
14:57:03 (7260): called boinc_finish
<file_xfer_error>
<file_name>BETA_BETA_9999988_0674_1_0</file_name>
<error_code>-131</error_code>
</file_xfer_error>
- - - - - - - - - - - - - - - - - - |
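The size check reported in the log above can be reproduced with a short sketch. It assumes the log line format exactly as quoted in the post (`File size: ... bytes. Limit: ... bytes`); the helper name `parse_size_limit` is hypothetical and not part of BOINC.

```python
import re

# Pattern assumed from the log lines quoted in the post; not an official BOINC format.
SIZE_RE = re.compile(r"File size: ([\d.]+) bytes\. Limit: ([\d.]+) bytes")

def parse_size_limit(line):
    """Return (size_bytes, limit_bytes) from a log line, or None if it does not match."""
    m = SIZE_RE.search(line)
    if m is None:
        return None
    return float(m.group(1)), float(m.group(2))

line = "31-10-2013 14:57 File size: 47425541.000000 bytes. Limit: 10485760.000000 bytes"
size, limit = parse_size_limit(line)

MIB = 1024 * 1024  # bytes per mebibyte
print(f"size  = {size / MIB:.2f} MiB")
print(f"limit = {limit / MIB:.2f} MiB")
print(f"output file is {size / limit:.2f}x the allowed size")
```

With the values from the log, the output file is roughly four and a half times the 10 MB upload bound, which is consistent with the client refusing the upload and reporting error -131.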
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
[QUOTE]I had two beta work units and they were very different. The one that worked had only 13 computer passes. The one which failed with the error code 131 had over 5000 when I looked at stderr in the slots directory but only reported 1981 computer passes in results status file.[/QUOTE]
Think uplinger or armstrdj said that if the file is too big, it will send some [useful] portion of the result back... something along those lines.

Edit: The slots content is interesting anyway: the base science app is only 1K in size, multiple soft-link files point to bigger ones in the project folder, and a log file details CPU and elapsed time, when it checkpointed, and more. E.g. the boinc_task_state.xml file of a task that just would not budge past 0.5% until nearly 4 hours into the job, then jumped to 50%, implying there will be only 2 passes in this job:
<active_task>
<project_master_url>http://www.worldcommunitygrid.org/</project_master_url>
<result_name>BETA_BETA_9999986_0370_0</result_name>
<checkpoint_cpu_time>11129.480000</checkpoint_cpu_time>
<checkpoint_elapsed_time>11946.453125</checkpoint_elapsed_time>
<fraction_done>0.500000</fraction_done>
</active_task>
Maybe the log could print the number of passes up front, as other projects do, unless, as one reply indicated, the task goes wherever the search direction takes it; but 50% at pass 1 would imply otherwise.
[Edit 1 times, last edit by Former Member at Oct 31, 2013 9:47:11 PM] |
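For anyone who wants to read these values without opening the file by hand, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It assumes the file contains only the `<active_task>` element quoted in the post; the function name `read_task_state` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Contents copied from the boinc_task_state.xml quoted in the post.
STATE_XML = """<active_task>
    <project_master_url>http://www.worldcommunitygrid.org/</project_master_url>
    <result_name>BETA_BETA_9999986_0370_0</result_name>
    <checkpoint_cpu_time>11129.480000</checkpoint_cpu_time>
    <checkpoint_elapsed_time>11946.453125</checkpoint_elapsed_time>
    <fraction_done>0.500000</fraction_done>
</active_task>"""

def read_task_state(xml_text):
    """Extract the result name, checkpoint times, and progress from the XML text."""
    root = ET.fromstring(xml_text)
    return {
        "result_name": root.findtext("result_name"),
        "checkpoint_cpu_time": float(root.findtext("checkpoint_cpu_time")),
        "checkpoint_elapsed_time": float(root.findtext("checkpoint_elapsed_time")),
        "fraction_done": float(root.findtext("fraction_done")),
    }

state = read_task_state(STATE_XML)
# Share of wall-clock time spent on the CPU at the last checkpoint.
cpu_share = state["checkpoint_cpu_time"] / state["checkpoint_elapsed_time"]
print(state["result_name"], f"{state['fraction_done']:.1%} done,",
      f"CPU/elapsed = {cpu_share:.3f}")
```

To run it against a live task, read the file from the slot directory and pass its text to `read_task_state` instead of `STATE_XML`.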
||
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1316 Status: Offline Project Badges: |
[QUOTE][QUOTE]I wasn't lucky to get 1 of the initial 10,000, but meanwhile fished 2 resends.
First task: Wingman error with "Maximum disk usage exceeded". To see how big the upload file may grow, I extended the bound 10 times. Checkpoints about every 10 minutes. Keith Uplinger mentioned memory usage of 400MB, but my first, still-running resend already has a peak of 1,692MB.
Second resend: Wingman returned with "exit code -1073740940 (0xc0000374)", which normally means memory access violated. After 10 minutes of the 0.5% issue, the task ran to the 1st checkpoint after 19 minutes of runtime, but 7 minutes later there was no CPU usage anymore. After a suspend with LAIM off and a resume, the task restarted at 0.5% progress and it seems to be running a bit further now.[/QUOTE]
Crystal Pellet, I double checked the settings, and it shows a RAM limit of 400MB. Can you provide me with the work unit name that gave you the issue?
Thanks,
-Uplinger[/QUOTE]
Hi Keith,
I overlooked your request, but meanwhile continued my report about that task here -> http://www.worldcommunitygrid.org/forums/wcg/viewpostinthread?post=436360
Cheers,
CP |
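The two forms of the exit code quoted above are the same 32-bit value, once read as a signed integer and once as unsigned hex. A small sketch of the conversion (the function name `to_ntstatus_hex` is hypothetical):

```python
def to_ntstatus_hex(exit_code):
    """Reinterpret a signed 32-bit exit code as an unsigned hex string."""
    # Masking with 0xFFFFFFFF maps negative values to their two's-complement form.
    return f"0x{exit_code & 0xFFFFFFFF:08x}"

print(to_ntstatus_hex(-1073740940))
```

As far as I know, Microsoft's NTSTATUS list gives 0xC0000374 as STATUS_HEAP_CORRUPTION, while the classic access violation is 0xC0000005, so this failure may point at heap corruption rather than a plain access violation.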
||
|
OldChap
Veteran Cruncher UK Joined: Jun 5, 2009 Post Count: 978 Status: Offline Project Badges: |
Just forced mine to the top on E5-26xx @2.4 and linux mint 64.
Initial observations: the 1st wu to start is at 0.5% after 1h20m CPU time; the remainder are showing "normal" progression, part % at a time.
Of the 9 I got on this machine, 8 are showing a target time of 3h 35m but the 9th is over 10 hours. So this worst-case wu's progress so far puts it on target for around 17 hours of runtime vs. around 4 hrs for the others.
http://www.lakecityquietpills.com/photo/multi.../92655495971280765031.png
1 hour later:
http://www.lakecityquietpills.com/photo/multi.../88927144893783083966.png
Question: Where to look for checkpointing info while running?
EDIT: At around 2h 30m of runtime the one wu seemingly stuck at 0.5%, but using 99% of a CPU thread, jumped to 50%.
[Edit 2 times, last edit by OldChap at Oct 31, 2013 9:59:01 PM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
[QUOTE][QUOTE][QUOTE]I wasn't lucky to get 1 of the initial 10,000, but meanwhile fished 2 resends.
First task: Wingman error with "Maximum disk usage exceeded". To see how big the upload file may grow, I extended the bound 10 times. Checkpoints about every 10 minutes. Keith Uplinger mentioned memory usage of 400MB, but my first, still-running resend already has a peak of 1,692MB.
Second resend: Wingman returned with "exit code -1073740940 (0xc0000374)", which normally means memory access violated. After 10 minutes of the 0.5% issue, the task ran to the 1st checkpoint after 19 minutes of runtime, but 7 minutes later there was no CPU usage anymore. After a suspend with LAIM off and a resume, the task restarted at 0.5% progress and it seems to be running a bit further now.[/QUOTE]
Crystal Pellet, I double checked the settings, and it shows a RAM limit of 400MB. Can you provide me with the work unit name that gave you the issue?
Thanks,
-Uplinger[/QUOTE]
Hi Keith,
I overlooked your request, but meanwhile continued my report about that task here -> http://www.worldcommunitygrid.org/forums/wcg/viewpostinthread?post=436360
Cheers,
CP[/QUOTE]
Thought you wrote earlier that you'd manipulated the bound values, then restarted the task to see how far it would go... 10x? |
||
|
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline Project Badges: |
Uplinger,
In previous beta runs I would get somewhere close to 1 task per CPU thread. Not so this time. What's changed?
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.
|
||
|
branjo
Master Cruncher Slovakia Joined: Jun 29, 2012 Post Count: 1892 Status: Offline Project Badges: |
nano, I got exactly 1 WU per CPU thread from original batch and some resends later.
----------------------------------------Crunching@Home since January 13 2000. Shrubbing@Home since January 5 2006 [Edit 2 times, last edit by branjo at Oct 31, 2013 10:03:25 PM] |
||
|
-Helle-
Cruncher Denmark Joined: Feb 27, 2010 Post Count: 28 Status: Offline Project Badges: |
I also saw error -131 for most of my wu's. But also saw quite a big difference between run time and CPU time (as already reported by others)
---------------------------------------- |
||
|
|