Total posts in this thread: 211
This topic has been viewed 29159 times and has 210 replies
ca05065
Senior Cruncher
Joined: Dec 4, 2007
Post Count: 325
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

I had two beta work units and they were very different. The one that worked had only 13 computing passes. The one that failed with error code 131 showed over 5000 passes when I looked at stderr in the slots directory, but the results status file reported only 1981 computing passes.
[Oct 31, 2013 9:33:03 PM]
Dark Angel
Veteran Cruncher
Australia
Joined: Nov 11, 2005
Post Count: 721

Completed with error.

[QUOTE]Run complete, CPU time: 22246.801971
00:04:30 (19197): called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>BETA_BETA_9999986_0565_1_0</file_name>
<error_code>-131</error_code>
</file_xfer_error>

</message>[/QUOTE]
----------------------------------------

Currently being moderated under false pretences
[Oct 31, 2013 9:35:24 PM]
ski1939
Senior Cruncher
Freeport, Illinois, USA
Joined: Nov 17, 2004
Post Count: 209

Run completed, then a file transfer error with code -131

(i.e. the result file size of 45.22 MB exceeds the 10 MB limit)

26-10-2013 13:06 Starting BOINC client version 6.10.58 for windows_x86_64
26-10-2013 13:07 Processor: 2 GenuineIntel Pentium(R) Dual-Core CPU T4300 @ 2.10GHz [Family 6 Model 23 Stepping 10]
26-10-2013 13:07 Processor: 1.00 MB cache
26-10-2013 13:07 Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 nx lm tm2 pbe
26-10-2013 13:07 OS: Microsoft Windows 7: Home Premium x64 Edition, Service Pack 1, (06.01.7601.00)
26-10-2013 13:07 Memory: 2.96 GB physical, 5.92 GB virtual
26-10-2013 13:07 Disk: 465.66 GB total, 133.56 GB free

- - - - - - - - - - - - - - - - - -

31-10-2013 14:57 Computation for task BETA_BETA_9999988_0674_1 finished
31-10-2013 14:57 Output file BETA_BETA_9999988_0674_1_0 for task BETA_BETA_9999988_0674_1 exceeds size limit.
31-10-2013 14:57 File size: 47425541.000000 bytes. Limit: 10485760.000000 bytes

- - - - - - - - - - - - - - - - - -

[14:54:57]: Computing pass 7822
Run complete, CPU time: 18133.072637
14:57:03 (7260): called boinc_finish

<file_xfer_error>
<file_name>BETA_BETA_9999988_0674_1_0</file_name>
<error_code>-131</error_code>
</file_xfer_error>

- - - - - - - - - - - - - - - - - -
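The size check reported in the log above can be reproduced with a few lines of Python. This is only a sketch: the two log lines are copied from this post, while the function name `parse_size_limit` and the regular expression are illustrative assumptions, not part of BOINC.

```python
import re

# Two log lines copied verbatim from the post above.
LOG = """\
31-10-2013 14:57 Output file BETA_BETA_9999988_0674_1_0 for task BETA_BETA_9999988_0674_1 exceeds size limit.
31-10-2013 14:57 File size: 47425541.000000 bytes. Limit: 10485760.000000 bytes
"""

MIB = 1024 * 1024  # bytes per MiB; the 10485760-byte limit is exactly 10 MiB


def parse_size_limit(log_text):
    """Return (file_size_bytes, limit_bytes) from a 'File size ... Limit ...' line.

    Hypothetical helper for illustration; returns None if no such line is found.
    """
    match = re.search(r"File size: ([\d.]+) bytes\. Limit: ([\d.]+) bytes", log_text)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


size, limit = parse_size_limit(LOG)
print(f"file size: {size / MIB:.2f} MiB")
print(f"limit:     {limit / MIB:.2f} MiB")
print(f"the file is {size / limit:.2f} times the allowed size")
```

The output file is roughly four and a half times the allowed size, which is consistent with the -131 transfer error shown in the same post.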
----------------------------------------
The Few, The Proud, The Marines - Semper Fi
[Oct 31, 2013 9:36:55 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0

[QUOTE]I had two beta work units and they were very different. The one that worked had only 13 computing passes. The one that failed with error code 131 showed over 5000 passes when I looked at stderr in the slots directory, but the results status file reported only 1981 computing passes.[/QUOTE]
I think uplinger or armstrdj said that if the file is too big, some [useful] portion of the result will still be sent back... something along those lines.

Edit: The slots directory content is interesting anyway: the base science app is only 1K in size, there are multiple soft-link files that point to bigger ones in the project folder, and a log file details CPU and elapsed time, when checkpoints were written, and more. For example, here is the boinc_task_state.xml file of a task that would not budge past 0.5% until nearly 4 hours into the job and then jumped to 50%, implying there will be only 2 passes in this job:

<active_task>
<project_master_url>http://www.worldcommunitygrid.org/</project_master_url>
<result_name>BETA_BETA_9999986_0370_0</result_name>
<checkpoint_cpu_time>11129.480000</checkpoint_cpu_time>
<checkpoint_elapsed_time>11946.453125</checkpoint_elapsed_time>
<fraction_done>0.500000</fraction_done>
</active_task>
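The state file above can be read with Python's standard XML parser. This is a minimal sketch: the XML is copied from this post, and the function name `summarize_task_state` and the derived fields (hours, CPU efficiency) are illustrative assumptions rather than anything BOINC itself reports.

```python
import xml.etree.ElementTree as ET

# Contents of boinc_task_state.xml as quoted in the post above.
STATE_XML = """\
<active_task>
<project_master_url>http://www.worldcommunitygrid.org/</project_master_url>
<result_name>BETA_BETA_9999986_0370_0</result_name>
<checkpoint_cpu_time>11129.480000</checkpoint_cpu_time>
<checkpoint_elapsed_time>11946.453125</checkpoint_elapsed_time>
<fraction_done>0.500000</fraction_done>
</active_task>
"""


def summarize_task_state(xml_text):
    """Extract progress and timing at the last checkpoint (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    cpu = float(root.findtext("checkpoint_cpu_time"))          # seconds of CPU time
    elapsed = float(root.findtext("checkpoint_elapsed_time"))  # seconds of wall-clock time
    return {
        "result_name": root.findtext("result_name"),
        "fraction_done": float(root.findtext("fraction_done")),
        "cpu_hours": cpu / 3600,
        "elapsed_hours": elapsed / 3600,
        "cpu_efficiency": cpu / elapsed,  # CPU time as a share of elapsed time
    }


summary = summarize_task_state(STATE_XML)
for key, value in summary.items():
    print(f"{key}: {value}")
```

The ratio of CPU time to elapsed time at the checkpoint is one simple way to quantify the gap between run time and CPU time that several posters mention.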

Maybe the log could print the number of passes up front, like the other projects do, unless, as one reply indicated, the task simply goes wherever the search direction takes it. But 50% at pass 1 would imply otherwise.
----------------------------------------
[Edit 1 times, last edit by Former Member at Oct 31, 2013 9:47:11 PM]
[Oct 31, 2013 9:40:25 PM]
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1316

[QUOTE][QUOTE]I wasn't lucky enough to get one of the initial 10,000, but meanwhile fished 2 resends.

First task: Wingman error with "Maximum disk usage exceeded". To see how big the upload file may grow, I raised the bound tenfold.
Checkpoints about every 10 minutes.
Keith Uplinger mentioned memory usage of 400MB, but my first resend, still running, already has a peak of 1,692MB.

Second resend: Wingman returned with "exit code -1073740940 (0xc0000374)", which normally indicates heap corruption, i.e. a memory access problem.
After 10 minutes the 0.5% issue appeared; the task ran to the 1st checkpoint after 19 minutes of runtime, but 7 minutes later there was no CPU usage anymore.
After a suspend with LAIM (leave applications in memory) off and a resume, the task restarted at 0.5% progress and seems to be running a bit further now.[/QUOTE]

Crystal Pellet,

I double-checked the settings, and they show a RAM limit of 400MB. Can you provide me with the name of the work unit that gave you the issue?

Thanks,
-Uplinger[/QUOTE]

Hi Keith,

I overlooked your request, but meanwhile continued my report about that task here -> http://www.worldcommunitygrid.org/forums/wcg/viewpostinthread?post=436360

Cheers,
CP
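The two forms of the exit code quoted above, -1073740940 and 0xc0000374, are the same 32-bit value read as signed and unsigned. A short sketch of the conversion follows; the function name `to_ntstatus_hex` and the small lookup table are illustrative assumptions, and the table covers only two well-known Windows NTSTATUS codes.

```python
def to_ntstatus_hex(exit_code):
    """Reinterpret a signed 32-bit exit code as unsigned and format it as hex."""
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


# Two well-known NTSTATUS values; deliberately not an exhaustive list.
KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000374: "STATUS_HEAP_CORRUPTION",
}

code = -1073740940  # exit code reported for the second resend
unsigned = code & 0xFFFFFFFF
print(to_ntstatus_hex(code), KNOWN_STATUS.get(unsigned, "unknown"))
```

Masking with 0xFFFFFFFF keeps the low 32 bits, which is why the negative decimal value and the hex value in the error message refer to the same status.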
----------------------------------------

[Oct 31, 2013 9:43:48 PM]
OldChap
Veteran Cruncher
UK
Joined: Jun 5, 2009
Post Count: 978

Just forced mine to the top on an E5-26xx @ 2.4 GHz running Linux Mint 64-bit.

Initial observations:

The 1st WU to start is at 0.5% after 1h 20m of CPU time; the remainder are showing "normal" progression, a fraction of a percent at a time.

Of the 9 I got on this machine, 8 are showing a target time of 3h 35m, but the 9th is over 10 hours. Based on its progress so far, this worst-case WU is on target for around 17 hours of runtime, versus around 4 hours for the others.

http://www.lakecityquietpills.com/photo/multi.../92655495971280765031.png

1 hour later:

http://www.lakecityquietpills.com/photo/multi.../88927144893783083966.png

Question: Where can I look for checkpointing info while a task is running?

EDIT: At around 2h 30m of runtime, the one WU that was seemingly stuck at 0.5%, while using 99% of a CPU thread, jumped to 50%.
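The jump described above shows why a straight-line estimate from the progress percentage can mislead for these tasks. The sketch below applies a naive linear extrapolation (elapsed time divided by fraction done) to the two observations in this post. The function name is hypothetical, and CPU time and runtime are treated as interchangeable here, which is a simplifying assumption.

```python
def naive_total_minutes(elapsed_minutes, fraction_done):
    """Linear extrapolation: assume progress grows at a constant rate."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_minutes / fraction_done


# (elapsed minutes, fraction done) taken from the observations in this post:
# 0.5% after 1h 20m, then 50% at around 2h 30m.
observations = [(80, 0.005), (150, 0.5)]

for elapsed, fraction in observations:
    total_hours = naive_total_minutes(elapsed, fraction) / 60
    print(f"{fraction:.1%} after {elapsed} min -> naive total of {total_hours:.1f} h")
```

The two estimates differ by a large factor for the same task, so the displayed percentage is better read as a count of completed passes than as a smooth measure of time remaining.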
----------------------------------------
[Edit 2 times, last edit by OldChap at Oct 31, 2013 9:59:01 PM]
[Oct 31, 2013 9:45:35 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0

[QUOTE][QUOTE][QUOTE]I wasn't lucky enough to get one of the initial 10,000, but meanwhile fished 2 resends.

First task: Wingman error with "Maximum disk usage exceeded". To see how big the upload file may grow, I raised the bound tenfold.
Checkpoints about every 10 minutes.
Keith Uplinger mentioned memory usage of 400MB, but my first resend, still running, already has a peak of 1,692MB.

Second resend: Wingman returned with "exit code -1073740940 (0xc0000374)", which normally indicates heap corruption, i.e. a memory access problem.
After 10 minutes the 0.5% issue appeared; the task ran to the 1st checkpoint after 19 minutes of runtime, but 7 minutes later there was no CPU usage anymore.
After a suspend with LAIM (leave applications in memory) off and a resume, the task restarted at 0.5% progress and seems to be running a bit further now.[/QUOTE]

Crystal Pellet,

I double-checked the settings, and they show a RAM limit of 400MB. Can you provide me with the name of the work unit that gave you the issue?

Thanks,
-Uplinger[/QUOTE]

Hi Keith,

I overlooked your request, but meanwhile continued my report about that task here -> http://www.worldcommunitygrid.org/forums/wcg/viewpostinthread?post=436360

Cheers,
CP[/QUOTE]

I thought you wrote earlier that you'd manipulated the bound values and then restarted the task to see how far it would go... 10x?
[Oct 31, 2013 9:51:36 PM]
nanoprobe
Master Cruncher
Classified
Joined: Aug 29, 2008
Post Count: 2998

Uplinger,
In previous beta runs I would get somewhere close to 1 task per CPU thread. Not so this time. What's changed?
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.


[Oct 31, 2013 9:52:33 PM]
branjo
Master Cruncher
Slovakia
Joined: Jun 29, 2012
Post Count: 1892

nano, I got exactly 1 WU per CPU thread from the original batch, and some resends later.
----------------------------------------

Crunching@Home since January 13 2000. Shrubbing@Home since January 5 2006

----------------------------------------
[Edit 2 times, last edit by branjo at Oct 31, 2013 10:03:25 PM]
[Oct 31, 2013 10:01:50 PM]
-Helle-
Cruncher
Denmark
Joined: Feb 27, 2010
Post Count: 28

I also saw error -131 for most of my WUs, and quite a big difference between run time and CPU time (as already reported by others).
----------------------------------------

[Oct 31, 2013 10:13:30 PM]