Thread Status: Active
Total posts in this thread: 211
This topic has been viewed 29089 times and has 210 replies
armstrdj
Former World Community Grid Tech
Joined: Oct 21, 2004
Post Count: 695
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

First, thanks to all of the Beta testers. This new application has a larger number of different types of workunits than normal. This results in several possible paths through the code and some variance in workunit completion time and in the time between checkpoints.

Workunits appearing stuck at 0.5% is a problem with the percent complete calculation. If the application is using CPU time it is probably not stuck, so please let it continue to run for a while to verify. The 0.5% is some initialization which is taking longer for some of the types of workunits. We will correct this and determine a better way to show progress during the initialization.

The application should checkpoint at most once every ten minutes. This is because some of the types of workunits produce larger checkpoint files and this will help limit disk I/O. Every workunit should checkpoint, so we are looking into the ones that are not checkpointing.
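
For anyone curious how "at most once every ten minutes" throttling can work, here is a minimal sketch in Python. It is not the project's actual code: the class name `CheckpointThrottle`, the `save` callback, and the injectable `clock` are all hypothetical and exist only for illustration. The only value taken from the post above is the ten-minute interval.

```python
import time

TEN_MINUTES = 600.0  # seconds; the interval mentioned in the post above


class CheckpointThrottle:
    """Write a checkpoint only if enough time has passed since the last one."""

    def __init__(self, save, min_interval=TEN_MINUTES, clock=time.monotonic):
        self._save = save                  # hypothetical callback that writes the checkpoint
        self._min_interval = min_interval
        self._clock = clock                # injectable so the logic can be tested without waiting
        self._last = clock()               # start timing from when the work begins

    def maybe_checkpoint(self, state):
        """Call at every safe point; returns True if a checkpoint was written."""
        now = self._clock()
        if now - self._last < self._min_interval:
            return False                   # too soon: skip the write to limit disk I/O
        self._save(state)
        self._last = now
        return True
```

The computation would call `maybe_checkpoint(state)` whenever it reaches a point where its state is consistent. Most calls return immediately, so large checkpoint files are written no more often than the chosen interval.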

The large output file errors are an issue with this being new research. We anticipated a smaller percentage of these, so I apologize that there are more than anticipated. The application includes input parameters to limit the output sent back to the most critical, and these parameters will be enhanced as we go. The next beta should have a very minimal amount of these. All users are getting credit for these errors.

Thanks again for all the help in getting new research running.

Thanks,
armstrdj
[Oct 31, 2013 3:22:22 PM]
Crystal Pellet
Veteran Cruncher
Joined: May 21, 2008
Post Count: 1316
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

...Keith Uplinger mentioned memory usage of 400MB, but my first still running resend already has a peak of 1,692MB.

That was just the beginning. Meanwhile memory usage 2,770MB.
----------------------------------------

[Oct 31, 2013 3:23:41 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

Fast (r)evolving thread... another post in between what's being replied to so inserting quote
I am getting an error message, "not requesting tasks: project is not highest priority." confused I got one Beta which seems to be running normally. Checkpointing looks OK but I am not going to try to restart it as I only have one.
Your own doing, by crunching elsewhere too. WCG is not the highest [fetch] priority according to your client scheduler. Suspend your other projects and you'd likely see that change. It's not an 'error' BTW, just a user warning, to those that look in the client event log. Errors requiring user intervention actually cause a 'notice' to be generated.
----------------------------------------
[Edit 1 times, last edit by Former Member at Oct 31, 2013 3:30:39 PM]
[Oct 31, 2013 3:26:31 PM]
mwgiii
Advanced Cruncher
United States
Joined: Aug 17, 2006
Post Count: 131
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

My Win 8.1 Intel Q9450 quad received 5:
BETA_BETA_9999988_0000 - Error - Maximum disk usage exceeded - same with all 5 replications

BETA_BETA_9999985_0949 - Error - Wingman errored, 2 replications in progress
</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>BETA_BETA_9999985_0949_1_0</file_name>
<error_code>-131</error_code>
</file_xfer_error>

BETA_BETA_9999986_0895 - Pending Validation - Wingman in progress

BETA_BETA_9999985_0952 - Pending Validation - Wingman in progress

BETA_BETA_9999985_0947 - Valid

My old Ubuntu 12.0.3 AMD Turion 64x2 received 2:
BETA_BETA_9999985_0684 - Pending Validation - Wingman in progress

BETA_BETA_9999985_0089 - Pending Validation - Wingman errored out, replication is in progress
----------------------------------------

----------------------------------------
[Edit 1 times, last edit by mwgiii at Oct 31, 2013 3:35:00 PM]
[Oct 31, 2013 3:31:41 PM]
Dataman
Ace Cruncher
Joined: Nov 16, 2004
Post Count: 4865
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

Fast (r)evolving thread... another post in between what's being replied to so inserting quote
I am getting an error message, "not requesting tasks: project is not highest priority." confused I got one Beta which seems to be running normally. Checkpointing looks OK but I am not going to try to restart it as I only have one.
Your own doing, by crunching elsewhere too. WCG is not the highest [fetch] priority according to your client scheduler. Suspend your other projects and you'd likely see that change. It's not an 'error' BTW, just a user warning, to those that look in the client event log. Errors requiring user intervention actually cause a 'notice' to be generated.

Thanks Rob, but there are no competing CPU projects running. Only the two GPUs in each machine are running and they are using almost no CPU. No worries though ... I have just never seen that particular message before.
peace
----------------------------------------


[Oct 31, 2013 3:40:56 PM]
uplinger
Former World Community Grid Tech
Joined: May 23, 2005
Post Count: 3952
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

I wasn't lucky to get 1 of the initial 10,000, but meanwhile fished 2 resends.

First task: Wingman error with "Maximum disk usage exceeded". To see how big the upload file may grow, I extended the bound 10 times.
Checkpoints after about every 10 minutes.
Keith Uplinger mentioned memory usage of 400MB, but my first still running resend already has a peak of 1,692MB.

Second resend: Wingman returned with "exit code -1073740940 (0xc0000374)" that normally means memory access violated.
After 10 minutes of the 0.5% issue, the task ran to the 1st checkpoint after 19 minutes of runtime, but 7 minutes later there was no CPU usage anymore.
Suspend with LAIM off and resume, the task restarts with 0.5% progress and it seems it's running a bit further now.


Crystal Pellet,

I double-checked the settings, and they show a RAM limit of 400MB. Can you provide me with the name of the work unit that gave you the issue?

Thanks,
-Uplinger
[Oct 31, 2013 3:49:41 PM]
Jim1348
Veteran Cruncher
USA
Joined: Jul 13, 2009
Post Count: 1066
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

This may be beating a dead horse (cavallo for Rob) by now, but I am running on a single core of a Core2Duo E8400 (Win7 64-bit), and also get the file size exceeded limit error.

</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>BETA_BETA_9999986_0205_2_0</file_name>
<error_code>-131</error_code>
</file_xfer_error>

641 World Community Grid 10/31/2013 11:32:39 AM Computation for task BETA_BETA_9999986_0205_2 finished
642 World Community Grid 10/31/2013 11:32:39 AM Output file BETA_BETA_9999986_0205_2_0 for task BETA_BETA_9999986_0205_2 exceeds size limit.
643 World Community Grid 10/31/2013 11:32:39 AM File size: 11077900.000000 bytes. Limit: 10485760.000000 bytes

It ran fine until the end (3 hours 8 minutes), and then errored out at 100% completed. I did not interrupt it to check for checkpointing.
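
To put a number on how far over the limit a result is, the "File size ... Limit ..." line from the event log can be parsed directly. The snippet below is a small Python sketch: the function name `parse_overrun` and the regular expression are my own assumptions, written against the log format quoted above, and are not part of BOINC or the WCG application.

```python
import re

# Matches the client event-log line quoted above, e.g.
# "File size: 11077900.000000 bytes. Limit: 10485760.000000 bytes"
SIZE_LINE = re.compile(
    r"File size: (\d+)(?:\.\d+)? bytes\. Limit: (\d+)(?:\.\d+)? bytes"
)


def parse_overrun(line):
    """Return (size, limit, excess) in bytes, or None if the line does not match."""
    m = SIZE_LINE.search(line)
    if m is None:
        return None
    size, limit = int(m.group(1)), int(m.group(2))
    return size, limit, size - limit


line = ("643 World Community Grid 10/31/2013 11:32:39 AM "
        "File size: 11077900.000000 bytes. Limit: 10485760.000000 bytes")
size, limit, excess = parse_overrun(line)
print(limit == 10 * 1024 * 1024)  # True: the limit is exactly 10 MiB
print(excess)                     # 592140 bytes over the limit
```

So this particular result missed the 10 MiB limit by a little under 600 KB.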
[Oct 31, 2013 3:53:21 PM]
Mamajuanauk
Master Cruncher
United Kingdom
Joined: Dec 15, 2012
Post Count: 1900
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

Errors
I received about 10 on one machine, about 4 of which have errored:

Thu 31 Oct 2013 11:46:49 GMT | World Community Grid | Computation for task BETA_BETA_9999986_0078_0 finished
Thu 31 Oct 2013 11:46:49 GMT | World Community Grid | Output file BETA_BETA_9999986_0078_0_0 for task BETA_BETA_9999986_0078_0 exceeds size limit.
Thu 31 Oct 2013 11:46:49 GMT | World Community Grid | File size: 111823077.000000 bytes. Limit: 10485760.000000 bytes

Thu 31 Oct 2013 15:01:57 GMT | World Community Grid | Computation for task BETA_BETA_9999984_0855_2 finished
Thu 31 Oct 2013 15:01:57 GMT | World Community Grid | Output file BETA_BETA_9999984_0855_2_0 for task BETA_BETA_9999984_0855_2 exceeds size limit.
Thu 31 Oct 2013 15:01:57 GMT | World Community Grid | File size: 91650959.000000 bytes. Limit: 10485760.000000 bytes

Thu 31 Oct 2013 11:25:37 GMT | World Community Grid | Computation for task BETA_BETA_9999987_0902_0 finished
Thu 31 Oct 2013 11:25:37 GMT | World Community Grid | Output file BETA_BETA_9999987_0902_0_0 for task BETA_BETA_9999987_0902_0 exceeds size limit.
Thu 31 Oct 2013 11:25:37 GMT | World Community Grid | File size: 47184733.000000 bytes. Limit: 10485760.000000 bytes

Thu 31 Oct 2013 10:41:48 GMT | World Community Grid | Computation for task BETA_BETA_9999985_0205_1 finished
Thu 31 Oct 2013 10:41:48 GMT | World Community Grid | Output file BETA_BETA_9999985_0205_1_0 for task BETA_BETA_9999985_0205_1 exceeds size limit.
Thu 31 Oct 2013 10:41:48 GMT | World Community Grid | File size: 11479651.000000 bytes. Limit: 10485760.000000 bytes


Ubuntu 13.01/64bit/loads of HDD space
----------------------------------------
Mamajuanauk is the Name! Crunching is the Game!



[Oct 31, 2013 3:55:53 PM]
nanoprobe
Master Cruncher
Classified
Joined: Aug 29, 2008
Post Count: 2998
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

Fast (r)evolving thread... another post in between what's being replied to so inserting quote
I am getting an error message, "not requesting tasks: project is not highest priority." confused I got one Beta which seems to be running normally. Checkpointing looks OK but I am not going to try to restart it as I only have one.
Your own doing, by crunching elsewhere too. WCG is not the highest [fetch] priority according to your client scheduler. Suspend your other projects and you'd likely see that change. It's not an 'error' BTW, just a user warning, to those that look in the client event log. Errors requiring user intervention actually cause a 'notice' to be generated.

The "not requesting tasks: project is not highest priority" message shows up in BOINC all the time even if WCG is by far set to the highest priority. (BOINC 300%, other project 1%) It usually sorts itself out but sometimes it takes a while. FWIW I have found the problem is less frequent if you set the lower priority project to 0%.
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.


[Oct 31, 2013 4:03:02 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: New Beta Test starting Oct 31, 2013 [Issues Thread]

Fast (r)evolving thread... another post in between what's being replied to so inserting quote
I am getting an error message, "not requesting tasks: project is not highest priority." confused I got one Beta which seems to be running normally. Checkpointing looks OK but I am not going to try to restart it as I only have one.
Your own doing, by crunching elsewhere too. WCG is not the highest [fetch] priority according to your client scheduler. Suspend your other projects and you'd likely see that change. It's not an 'error' BTW, just a user warning, to those that look in the client event log. Errors requiring user intervention actually cause a 'notice' to be generated.

Thanks Rob, but there are no competing CPU projects running. Only the two GPUs in each machine are running and they are using almost no CPU. No worries though ... I have just never seen that particular message before.
peace

Yes, we've heard of CPUs starving because of GPUs... they too compete for CPU resources. You might want to try the newest clients [7.2.26 is out now and I am using it on my octo], but I can tell you up front from my own testing [and alpha reporting] that there are still issues with idling CPUs under certain [reproducible] conditions. Fixed? There has been no dev reply, on which I refrain from opining; I can guess what pressure there might be, which is itself an opinion. Other testers have confirmed the same problem.
----------------------------------------
[Edit 1 times, last edit by Former Member at Oct 31, 2013 4:11:55 PM]
[Oct 31, 2013 4:07:57 PM]