World Community Grid Forums
Category: Beta Testing Forum: Beta Test Support Forum Thread: New Beta Test starting Oct 31, 2013 [Issues Thread] |
Thread Status: Active Total posts in this thread: 211
armstrdj
Former World Community Grid Tech Joined: Oct 21, 2004 Post Count: 695 Status: Offline Project Badges: |
First thanks to all of the Beta testers. This new application has a larger number of different types of workunits than normal. This results in several possible paths through the code and some variance in workunit completion time and time between checkpoints.
The workunits that appear stuck at 0.5% are a problem with the percent-complete calculation. If the application is using CPU time it is probably not stuck, so please let it continue to run for a while to verify. The 0.5% is initialization, which takes longer for some types of workunits. We will correct this and determine a better way to show progress during initialization.
The application should checkpoint at most every ten minutes. This is because some types of workunits produce larger checkpoint files, and this will help limit disk I/O. Every workunit should checkpoint, so we are looking into the ones that are not checkpointing.
The large output file errors are a consequence of this being new research. We anticipated a smaller percentage of these, so I apologize that there are more than expected. The application includes input parameters to limit the output sent back to the most critical, and these parameters will be enhanced as we go. The next beta should have very few of these. All users are getting credit for these errors.
Thanks again for all the help in getting new research running.
Thanks, armstrdj |
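For readers wondering what "checkpoint at most every ten minutes" could look like in practice, here is a minimal sketch. It is not the project's code: the class name `CheckpointThrottle`, the `simulate` helper, and the 600-second default are assumptions for illustration, and the sketch reads "at most every ten minutes" as "no more often than once per ten minutes".

```python
import time
from typing import Callable, List


class CheckpointThrottle:
    """Allow a checkpoint only if min_interval seconds have passed since the last one.

    Hypothetical helper, not part of any real application.
    """

    def __init__(self, min_interval: float = 600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.min_interval = min_interval
        self.clock = clock
        self.last_checkpoint = clock()  # start-up counts as the reference point

    def should_checkpoint(self) -> bool:
        return self.clock() - self.last_checkpoint >= self.min_interval

    def mark_done(self) -> None:
        self.last_checkpoint = self.clock()


def simulate(total_seconds: float, step_seconds: float,
             min_interval: float = 600.0) -> List[float]:
    """Return the simulated times (seconds) at which checkpoints would be written."""
    now = 0.0
    throttle = CheckpointThrottle(min_interval, clock=lambda: now)
    written = []
    while now < total_seconds:
        now += step_seconds  # one unit of work finishes
        if throttle.should_checkpoint():
            written.append(now)  # a real app would write its state file here
            throttle.mark_done()
    return written


# Work units of 4 minutes each over a 30-minute run:
print(simulate(1800, 240))  # [720.0, 1440.0]
```

Because the check only happens when a unit of work finishes, the gap between checkpoints can be longer than ten minutes (12 minutes in this run), which fits the variance in time between checkpoints described above.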
||
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1316 Status: Offline Project Badges: |
...Keith Uplinger mentioned memory usage of 400MB, but my first still running resend already has a peak of 1,692MB. That was just the beginning. Meanwhile memory usage 2,770MB. |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Fast (r)evolving thread... another post in between what's being replied to so inserting quote
I am getting an error message, "not requesting tasks: project is not highest priority." confused I got one Beta which seems to be running normally. Checkpointing looks OK but I am not going to try to restart it as I only have one.
Your own doing, by crunching elsewhere too. WCG is not the highest [fetch] priority according to your client scheduler. Suspend your other projects and you'd likely see that change. It's not an 'error' BTW, just a user warning, to those that look in the client event log. Errors requiring user intervention actually cause a 'notice' to be generated.
[Edit 1 times, last edit by Former Member at Oct 31, 2013 3:30:39 PM] |
||
|
mwgiii
Advanced Cruncher United States Joined: Aug 17, 2006 Post Count: 131 Status: Offline Project Badges: |
My Win 8.1 Intel Q9450 quad received 5:
BETA_BETA_9999988_0000 - Error - Maximum disk usage exceeded - same with all 5 replications
BETA_BETA_9999985_0949 - Error - Wingman errored, 2 replications in progress
</stderr_txt> <message> upload failure: <file_xfer_error> <file_name>BETA_BETA_9999985_0949_1_0</file_name> <error_code>-131</error_code> </file_xfer_error>
BETA_BETA_9999986_0895 - Pending Validation - Wingman in progress
BETA_BETA_9999985_0952 - Pending Validation - Wingman in progress
BETA_BETA_9999985_0947 - Valid
My old Ubuntu 12.0.3 AMD Turion 64x2 received 2:
BETA_BETA_9999985_0684 - Pending Validation - Wingman in progress
BETA_BETA_9999985_0089 - Pending Validation - Wingman errored out, replication is in progress
[Edit 1 times, last edit by mwgiii at Oct 31, 2013 3:35:00 PM] |
||
|
Dataman
Ace Cruncher Joined: Nov 16, 2004 Post Count: 4865 Status: Offline Project Badges: |
Fast (r)evolving thread... another post in between what's being replied to so inserting quote
I am getting an error message, "not requesting tasks: project is not highest priority." confused I got one Beta which seems to be running normally. Checkpointing looks OK but I am not going to try to restart it as I only have one.
Your own doing, by crunching elsewhere too. WCG is not the highest [fetch] priority according to your client scheduler. Suspend your other projects and you'd likely see that change. It's not an 'error' BTW, just a user warning, to those that look in the client event log. Errors requiring user intervention actually cause a 'notice' to be generated.
Thanks Rob but there are no competing CPU projects running. Only the two GPUs in each machine are running and they are using almost no CPU. No worries though ... just have never seen that particular message before. |
||
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline Project Badges: |
I wasn't lucky to get 1 of the initial 10,000, but meanwhile fished 2 resends.
First task: Wingman error with "Maximum disk usage exceeded". To see how big the upload file may grow, I extended the bound 10 times. Checkpoints after about every 10 minutes. Keith Uplinger mentioned memory usage of 400MB, but my first still running resend already has a peak of 1,692MB.
Second resend: Wingman returned with "exit code -1073740940 (0xc0000374)", which normally means a memory access violation. After 10 minutes of the 0.5% issue, the task ran to the 1st checkpoint after 19 minutes runtime, but 7 minutes later there was no CPU usage anymore. After a suspend with LAIM off and a resume, the task restarts at 0.5% progress and it seems to be running a bit further now.
Crystal Pellet, I double-checked the settings, and they show a RAM limit of 400MB. Can you provide me with the name of the work unit that gave you the issue?
Thanks, -Uplinger |
||
|
Jim1348
Veteran Cruncher USA Joined: Jul 13, 2009 Post Count: 1066 Status: Offline Project Badges: |
This may be beating a dead horse (cavallo for Rob) by now, but I am running on a single core of a Core2Duo E8400 (Win7 64-bit), and also get the file size exceeded limit error.
</stderr_txt> <message> upload failure: <file_xfer_error> <file_name>BETA_BETA_9999986_0205_2_0</file_name> <error_code>-131</error_code> </file_xfer_error>
641 World Community Grid 10/31/2013 11:32:39 AM Computation for task BETA_BETA_9999986_0205_2 finished
642 World Community Grid 10/31/2013 11:32:39 AM Output file BETA_BETA_9999986_0205_2_0 for task BETA_BETA_9999986_0205_2 exceeds size limit.
643 World Community Grid 10/31/2013 11:32:39 AM File size: 11077900.000000 bytes. Limit: 10485760.000000 bytes
It ran fine until the end (3 hours 8 minutes), and then errored out at 100% completed. I did not interrupt it to check for checkpointing. |
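To put the numbers in that event-log entry in perspective, here is a small sketch that pulls the file size and the limit out of a "File size ... Limit ..." line and reports the overshoot. The regular expression and the function name `overage` are my own assumptions, written against the log text quoted above and not against any official log format.

```python
import re

# Matches the "File size: N bytes. Limit: M bytes" part of the quoted log line.
SIZE_LINE = re.compile(
    r"File size: (\d+)(?:\.\d+)? bytes\. Limit: (\d+)(?:\.\d+)? bytes"
)


def overage(line):
    """Return (bytes over the limit, size / limit), or None if the line does not match."""
    match = SIZE_LINE.search(line)
    if match is None:
        return None
    size, limit = int(match.group(1)), int(match.group(2))
    return size - limit, size / limit


line = ("643 World Community Grid 10/31/2013 11:32:39 AM "
        "File size: 11077900.000000 bytes. Limit: 10485760.000000 bytes")
excess, ratio = overage(line)
print(f"{excess} bytes over the limit ({(ratio - 1) * 100:.1f}% too large)")
# 592140 bytes over the limit (5.6% too large)
```

The limit of 10485760 bytes is exactly 10 MiB (10 × 1024 × 1024), so this particular result missed it by a little under 6%.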
||
|
Mamajuanauk
Master Cruncher United Kingdom Joined: Dec 15, 2012 Post Count: 1900 Status: Offline Project Badges: |
Errors
I received about 10 on one machine, about 4 of which have errored:
Thu 31 Oct 2013 11:46:49 GMT | World Community Grid | Computation for task BETA_BETA_9999986_0078_0 finished
Thu 31 Oct 2013 11:46:49 GMT | World Community Grid | Output file BETA_BETA_9999986_0078_0_0 for task BETA_BETA_9999986_0078_0 exceeds size limit.
Thu 31 Oct 2013 11:46:49 GMT | World Community Grid | File size: 111823077.000000 bytes. Limit: 10485760.000000 bytes
Thu 31 Oct 2013 15:01:57 GMT | World Community Grid | Computation for task BETA_BETA_9999984_0855_2 finished
Thu 31 Oct 2013 15:01:57 GMT | World Community Grid | Output file BETA_BETA_9999984_0855_2_0 for task BETA_BETA_9999984_0855_2 exceeds size limit.
Thu 31 Oct 2013 15:01:57 GMT | World Community Grid | File size: 91650959.000000 bytes. Limit: 10485760.000000 bytes
Thu 31 Oct 2013 11:25:37 GMT | World Community Grid | Computation for task BETA_BETA_9999987_0902_0 finished
Thu 31 Oct 2013 11:25:37 GMT | World Community Grid | Output file BETA_BETA_9999987_0902_0_0 for task BETA_BETA_9999987_0902_0 exceeds size limit.
Thu 31 Oct 2013 11:25:37 GMT | World Community Grid | File size: 47184733.000000 bytes. Limit: 10485760.000000 bytes
Thu 31 Oct 2013 10:41:48 GMT | World Community Grid | Computation for task BETA_BETA_9999985_0205_1 finished
Thu 31 Oct 2013 10:41:48 GMT | World Community Grid | Output file BETA_BETA_9999985_0205_1_0 for task BETA_BETA_9999985_0205_1 exceeds size limit.
Thu 31 Oct 2013 10:41:48 GMT | World Community Grid | File size: 11479651.000000 bytes. Limit: 10485760.000000 bytes
Ubuntu 13.01/64bit/loads of HDD space
Mamajuanauk is the Name! Crunching is the Game!
|
||
|
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline Project Badges: |
Fast (r)evolving thread... another post in between what's being replied to so inserting quote
I am getting an error message, "not requesting tasks: project is not highest priority." confused I got one Beta which seems to be running normally. Checkpointing looks OK but I am not going to try to restart it as I only have one.
Your own doing, by crunching elsewhere too. WCG is not the highest [fetch] priority according to your client scheduler. Suspend your other projects and you'd likely see that change. It's not an 'error' BTW, just a user warning, to those that look in the client event log. Errors requiring user intervention actually cause a 'notice' to be generated.
The "not requesting tasks: project is not highest priority" message shows up in BOINC all the time, even if WCG is set to the highest priority by far (BOINC 300%, other project 1%). It usually sorts itself out, but sometimes it takes a while. FWIW, I have found the problem is less frequent if you set the lower priority project to 0%.
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.
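As a rough illustration of the 300-versus-1 setting mentioned in the post above, the sketch below only turns resource shares into fractions of the total. It is plain arithmetic and not the BOINC client's work-fetch logic, which also weighs other factors; that may be why the message can still appear with a lopsided share. The project names and the function `share_fractions` are assumptions for the example, and I am reading "BOINC 300%" as the share given to WCG.

```python
def share_fractions(shares):
    """Convert per-project resource shares into fractions of the total share."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one project needs a positive resource share")
    return {name: share / total for name, share in shares.items()}


# Hypothetical settings modelled on the post: 300 for WCG, 1 for the other project.
for name, frac in share_fractions(
        {"World Community Grid": 300, "other project": 1}).items():
    print(f"{name}: {frac:.2%}")
# World Community Grid: 99.67%
# other project: 0.33%
```

With the second project set to 0 instead of 1, the same arithmetic gives WCG the full 100%. As I understand it, the client treats a zero-share project as a fallback that only gets work when nothing else is available, which would be consistent with the observation above.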
|
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Fast (r)evolving thread... another post in between what's being replied to so inserting quote
I am getting an error message, "not requesting tasks: project is not highest priority." confused I got one Beta which seems to be running normally. Checkpointing looks OK but I am not going to try to restart it as I only have one.
Your own doing, by crunching elsewhere too. WCG is not the highest [fetch] priority according to your client scheduler. Suspend your other projects and you'd likely see that change. It's not an 'error' BTW, just a user warning, to those that look in the client event log. Errors requiring user intervention actually cause a 'notice' to be generated.
Thanks Rob but there are no competing CPU projects running. Only the two GPUs in each machine are running and they are using almost no CPU. No worries though ... just have never seen that particular message before.
Yes, we've heard of CPUs starving because of GPUs... they too compete for CPU resources. You might want to try the newest client [7.2.26 is out now and I am using it on my octo], but I can tell you up front from my own testing [and alpha reporting] that there are still issues with idling CPUs under certain [reproducible] conditions. Fixed? No dev reply, on which I refrain from opining; I can guess what pressure there might be, which is itself an opinion. Other testers have confirmed the same problem.
[Edit 1 times, last edit by Former Member at Oct 31, 2013 4:11:55 PM] |
||
|
|