Thread Status: Active
Total posts in this thread: 211
This topic has been viewed 29092 times and has 210 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

TYT, CP.

Let's see if I can summarize all the issues, off the top of my head:

1) Output file too large (Error -131)
2) Maximum Disk Use Exceeded (disk_bound overstepped)
3) Memory model exceeded (memory_bound overstepped)
4) Loss of large portions of CPU time at the time of reporting, which appears to happen at the end of the run.
5) Progress % erratic (e.g. it can jump from 0.5% to 50% only at the end of the 1st pass when there are only 2 passes)
6) Related to 5), checkpoints are at times multiple hours apart... not good for part-time crunchers.
7) Jobs seem stuck in memory at times [when seemingly no more progress is made]... they won't unload, even when "Leave application in memory when suspended" is off. A full client restart is required to get them to unload.
8) Some tasks freeze on CPU time while running [is it only the display, or does Task Manager also show no CPU time being used?], while elapsed time keeps accumulating and progress % goes backward. Users of BOINC Manager won't see this easily; to users of BOINCTasks it's obvious, since both Elapsed and CPU time are shown.

Wish list: Printing of OS and CPU details in Result Log.

Did I miss any? (Copy the list and insert 9), and so on.)

The DIY department.

P.S. Whilst the 10,000 originals left the feeder in 1.5 hours and were sent quite early in the day, 'only' 3,556 had validated by midnight. [I don't know, but doubt, that the 'error' results that were credited were included. By my own count, 9 were shown as valid on My Grid at midnight, but 18 were listed with credit, including 9 with error. 2 in PV.]
[Nov 1, 2013 9:56:38 AM]
Thargor
Veteran Cruncher
UK
Joined: Feb 3, 2012
Post Count: 1291
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

Task #1, on debian 6.0.7 (Squeeze):

- Fri 01 Nov 2013 08:33:10 GMT World Community Grid Computation for task BETA_BETA_9999984_0877_1 finished
- Fri 01 Nov 2013 08:33:10 GMT World Community Grid Output file BETA_BETA_9999984_0877_1_0 for task BETA_BETA_9999984_0877_1 exceeds size limit.
- Fri 01 Nov 2013 08:33:10 GMT World Community Grid File size: 107813020.000000 bytes. Limit: 10485760.000000 bytes
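For scale, the two figures in the last log line can be compared directly. This is a minimal Python sketch that assumes the message format shown above; the helper name `parse_size_limit` is hypothetical and is not part of BOINC.

```python
import re

# Pattern for the client message format quoted above (an assumption:
# other client versions may word this message differently).
SIZE_RE = re.compile(r"File size: ([\d.]+) bytes\. Limit: ([\d.]+) bytes")

def parse_size_limit(line):
    """Return (size_bytes, limit_bytes) from a log line, or None if absent."""
    match = SIZE_RE.search(line)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))

line = ("Fri 01 Nov 2013 08:33:10 GMT World Community Grid "
        "File size: 107813020.000000 bytes. Limit: 10485760.000000 bytes")
size, limit = parse_size_limit(line)
print(f"limit = {limit / 2**20:.0f} MiB, output is {size / limit:.1f}x the limit")
```

In other words, the limit is 10 MiB and this output file is roughly ten times over it.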

Task #2, on a 64-bit Windows 7 PC, appears to be getting a variant of the 0.5% issue: it gets anywhere from 5 to 15 minutes into the task, then appears to restart completely from scratch (it had been doing this for 4-5 hours overnight). I haven't tried restarting the client yet; I only noticed on the way out to work this morning. I'll pop the WU ID in here when I get home later, if it still hasn't finished.
----------------------------------------

[Nov 1, 2013 10:31:41 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

Well, that one that was running overnight finally finished, and just before I could take a look at it. The server logs show:

BETA_ BETA_ 9999987_ 0168_ 0-- <M/c-ID> Pending Validation 31/10/13 06:55:37 01/11/13 10:10:15 12.33 / 24.69 235.2 / 0.0

It clearly finished without recording any CPU time for the second half of the run, but at least it finished. I have no idea whether or not it checkpointed again, but I suspect not. One checkpoint near the beginning and then 24 hours running without one is clearly not a good idea ...
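The "second half" estimate follows from the ratio of the two time figures in that row, 12.33 and 24.69, read here as CPU hours and elapsed hours (an assumption based on how the post interprets them). A quick check in Python, with a hypothetical helper name:

```python
def cpu_efficiency(cpu_hours, elapsed_hours):
    """Fraction of elapsed wall-clock time that was recorded as CPU time."""
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    return cpu_hours / elapsed_hours

# Figures taken from the server log row quoted above.
eff = cpu_efficiency(12.33, 24.69)
print(f"{eff:.1%} of the elapsed time was recorded as CPU time")
```

A value close to 50% is consistent with CPU time no longer being recorded about halfway through the run.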

So far, all the others I've had were resends that all failed (again) with oversize output files.

Here's wishing the techs a successful time tracking down these and the other issues that people have seen.
[Nov 1, 2013 10:45:46 AM]
pramo
Veteran Cruncher
USA
Joined: Dec 14, 2005
Post Count: 703
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

I aborted this one.
It had restarted every three minutes since it began about 21 hours ago. The wingman hasn't reported either.

stderr.txt has this, repeating...

Commandline = projects/www.worldcommunitygrid.org/wcgrid_beta17_7.19_windows_intelx86 -SettingsFile BETA_9999984_0541.txt -DatabaseFile dataset-GDS2771-v1.txt
Initializing
wcg_learn_limit = 1000000
Running
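
One way to confirm a restart loop like this is to count how often the startup banner repeats in stderr.txt. This is a minimal Python sketch, assuming each restart writes a fresh `Commandline =` line as in the excerpt above; `count_restarts` is a hypothetical helper, not a BOINC tool.

```python
def count_restarts(stderr_text):
    """Count application starts by counting 'Commandline =' banner lines."""
    return sum(
        1 for line in stderr_text.splitlines()
        if line.startswith("Commandline =")
    )

# Two repetitions of the excerpt quoted above, used as sample input.
banner = (
    "Commandline = projects/www.worldcommunitygrid.org/"
    "wcgrid_beta17_7.19_windows_intelx86 "
    "-SettingsFile BETA_9999984_0541.txt "
    "-DatabaseFile dataset-GDS2771-v1.txt\n"
    "Initializing\n"
    "wcg_learn_limit = 1000000\n"
    "Running\n"
)
sample = banner * 2
starts = count_restarts(sample)
print(f"{starts} starts, so {starts - 1} restart(s)")
```

On a real task, the text would come from the task's stderr.txt in its slot directory rather than from a sample string.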


Result log:
Result Name: BETA_ BETA_ 9999984_ 0541_ 0--

<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
aborted by user
</message>
]]>

And now someone else has it :-/

Edit to add: it claimed this!
BETA_ BETA_ 9999984_ 0541_ 0-- 719 User Aborted 10/31/13 06:16:14 11/1/13 11:28:40 0.00 65.9 / 0.0
----------------------------------------

----------------------------------------
[Edit 2 times, last edit by pramo at Nov 1, 2013 11:57:03 AM]
[Nov 1, 2013 11:29:50 AM]
Thargor
Veteran Cruncher
UK
Joined: Feb 3, 2012
Post Count: 1291
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

Commandline = projects/www.worldcommunitygrid.org/wcgrid_beta17_7.19_windows_intelx86 -SettingsFile BETA_9999984_0541.txt -DatabaseFile dataset-GDS2771-v1.txt
Initializing
wcg_learn_limit = 1000000
Running

going to abort this one; stderr.txt has the above, repeating...

has restarted every three minutes since it began about 21 hours ago. wingman hasn't reported either.
and now someone else has it :-/

That sounds very much like the issue I'm getting on my Windows 7 64-bit machine at home, but I didn't get a chance to inspect the detailed logs.
----------------------------------------

[Nov 1, 2013 11:32:27 AM]
pramo
Veteran Cruncher
USA
Joined: Dec 14, 2005
Post Count: 703
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]


That sounds very much like the issue I'm getting on my Windows 7 64-bit machine at home, but I didn't get a chance to inspect the detailed logs.


Wasn't much detail to see on that one :) Running on XP.
Now I wish I had thought to restart/reboot before aborting to see what would happen. Rats :(
----------------------------------------

[Nov 1, 2013 11:52:23 AM]
gb009761
Master Cruncher
Scotland
Joined: Apr 6, 2005
Post Count: 2977
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

SekeRob, from your excellent summary of issues, I think one may have been missed off (as to whether it's a concern or not, I don't have enough data; it may just be my machine/set-up, or it may be more widespread). Anyhow, as you requested, I'll copy your list of 8 and add a 9th onto the end:

1) Output file too large (Error -131)
2) Maximum Disk Use Exceeded (disk_bound overstepped)
3) Memory model exceeded (memory_bound overstepped)
4) Loss of large portions of CPU time at the time of reporting, which appears to happen at the end of the run.
5) Progress % erratic (e.g. it can jump from 0.5% to 50% only at the end of the 1st pass when there are only 2 passes)
6) Related to 5), checkpoints are at times multiple hours apart... not good for part-time crunchers.
7) Jobs seem stuck in memory at times [when seemingly no more progress is made]... they won't unload, even when "Leave application in memory when suspended" is off. A full client restart is required to get them to unload.
8) Some tasks freeze on CPU time while running [is it only the display, or does Task Manager also show no CPU time being used?], while elapsed time keeps accumulating and progress % goes backward. Users of BOINC Manager won't see this easily; to users of BOINCTasks it's obvious, since both Elapsed and CPU time are shown.
9) Running 4 concurrently (i.e., using all available cores) appears to be very inefficient.
----------------------------------------

[Nov 1, 2013 1:19:08 PM]
coolstream
Senior Cruncher
SCOTLAND
Joined: Nov 8, 2005
Post Count: 475
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

Grabbed one BETA, which completed to 100% and then errored out. Result Log:
Result Name: BETA_ BETA_ 9999985_ 0880_ 4--

<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
puting pass 4128
[04:48:24]: Computing pass 4129
[04:48:28]: Computing pass 4130
...
[06:33:51]: Computing pass 6079
[06:33:54]: Computing pass 6080
Run complete, CPU time: 17194.570621
06:34:24 (29232): called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
<file_name>BETA_BETA_9999985_0880_4_0</file_name>
<error_code>-131</error_code>
</file_xfer_error>

Microsoft Windows 7 Professional Service Pack 1 (build 7601), 64-bit
Processor: Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz
running 8 threads
8GB RAM
74GB free disk space
----------------------------------------

Crunching in memory of my Mum PEGGY, cousin ROPPA and Aunt AUDREY.
[Nov 1, 2013 1:57:35 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

11/1/2013 6:17:23 AM | World Community Grid | Output file BETA_BETA_9999988_0704_4_0 for task BETA_BETA_9999988_0704_4 exceeds size limit.
11/1/2013 6:17:23 AM | World Community Grid | File size: 16705286.000000 bytes. Limit: 10485760.000000 bytes


It happened again. File size too big. Well, I'm two for two. Batting 1000.
[Nov 1, 2013 2:23:26 PM]
KWSN - A Shrubbery
Master Cruncher
Joined: Jan 8, 2006
Post Count: 1585
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

SekeRob, from your excellent summary of issues, I think one may have been missed off (as to whether it's a concern or not, I don't have enough data; it may just be my machine/set-up, or it may be more widespread). Anyhow, as you requested, I'll copy your list of 8 and add a 9th onto the end:

1) Output file too large (Error -131)
2) Maximum Disk Use Exceeded (disk_bound overstepped)
3) Memory model exceeded (memory_bound overstepped)
4) Loss of large portions of CPU time at the time of reporting, which appears to happen at the end of the run.
5) Progress % erratic (e.g. it can jump from 0.5% to 50% only at the end of the 1st pass when there are only 2 passes)
6) Related to 5), checkpoints are at times multiple hours apart... not good for part-time crunchers.
7) Jobs seem stuck in memory at times [when seemingly no more progress is made]... they won't unload, even when "Leave application in memory when suspended" is off. A full client restart is required to get them to unload.
8) Some tasks freeze on CPU time while running [is it only the display, or does Task Manager also show no CPU time being used?], while elapsed time keeps accumulating and progress % goes backward. Users of BOINC Manager won't see this easily; to users of BOINCTasks it's obvious, since both Elapsed and CPU time are shown.
9) Running 4 concurrently (i.e., using all available cores) appears to be very inefficient.

One of my machines had 8 tasks. All finished over 98% efficiency running concurrently. Couldn't tell you the specs off the top of my head but it's a rack server based off Intel(Xeon) running Ubuntu.

Of course, my 12 other machines got one task between them. Go figure.
----------------------------------------

Distributed computing volunteer since September 27, 2000
[Nov 1, 2013 2:47:56 PM]