World Community Grid Forums
Thread Status: Active Total posts in this thread: 46
marist_college
Advanced Cruncher USA Joined: Mar 30, 2005 Post Count: 107 Status: Offline |
I'm seeing several of the following:
<message>
upload failure: <file_xfer_error>
<file_name>ZIKA_000000866_x1nb7_HCVJ4_RNAPol_JustProt_chnB_0340_1_r828549218_0</file_name>
<error_code>-131 (file size too big)</error_code>
</file_xfer_error>
</message>

Above this are hundreds of lines of text showing the 187 tasks started and completed. This ran for 9.28 hours and claimed credit was 585.3 (granted 0.0).

Is there anything that can be done on either end to fix this? Obviously, it's a waste of resources for these to complete fine, but error out on upload.

Thanks. |
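For anyone who wants to count these failures across many result logs, here is a minimal sketch in Python. It assumes the log text has the same `<file_name>` / `<error_code>` layout as the message quoted above; the function name `parse_upload_failures` is made up for illustration and is not part of BOINC or any WCG tool.

```python
import re

# Sample copied from the error message quoted in this post.
SAMPLE = """<message>
upload failure: <file_xfer_error>
  <file_name>ZIKA_000000866_x1nb7_HCVJ4_RNAPol_JustProt_chnB_0340_1_r828549218_0</file_name>
  <error_code>-131 (file size too big)</error_code>
</file_xfer_error>
</message>"""

# The text in parentheses is optional: some client versions report only the number.
PATTERN = re.compile(
    r"<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)(?:\s*\((?P<text>[^)]*)\))?</error_code>"
)


def parse_upload_failures(log_text):
    """Return (file_name, error_code, error_text) for each failed upload found."""
    return [
        (m["name"], int(m["code"]), m["text"])
        for m in PATTERN.finditer(log_text)
    ]


if __name__ == "__main__":
    for name, code, text in parse_upload_failures(SAMPLE):
        print(name, code, text)
```

Running it over a folder of saved result logs would give a quick tally of which work units hit error -131.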
||
|
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline |
NM. Not zika task. *brain cramp*
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.
----------------------------------------
[Edit 1 times, last edit by nanoprobe at May 21, 2016 6:21:23 PM] |
||
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
This one is known as a race condition, from what the techs have posted in the past. The file was written, then somehow not released, and a second write was attempted.
Edit: referred to the now-removed nanoprobe log, not the OP.
[Edit 1 times, last edit by SekeRob* at May 21, 2016 7:06:05 PM] |
||
|
SekeRob
Master Cruncher Joined: Jan 7, 2013 Post Count: 2741 Status: Offline |
Braincramp.... Figures... looked too familiar.
|
||
|
marist_college
Advanced Cruncher USA Joined: Mar 30, 2005 Post Count: 107 Status: Offline |
I have 23 reporting this way. All this type of WU: ZIKA_000000###_x1nb7_HCVJ4_RNAPol_JustProt_chnB_0###
Here's runtime and claims for all the errored WUs:

Runtime (hours)   Claimed credit
11.76             449.9
8.46              309
9.28              585.3
8.85              456.9
10.58             313
10.71             425.8
9.08              273.2
6.99              346.2
9.54              374.4
9.71              488.4
9.47              292.9
12.85             538.6
7.86              238.6
8.17              444.2
12.65             21.7?
10.36             428.6
11.48             449.7
7.58              272
15.56             21.7?
8.57              367.1
15.06             21.7?
9.2               328.6
8.18              397.7

Can any techs weigh in on this? Seems like a lot of racing. I'm not seeing this with other projects either. Do any of these count for anything? (I searched for credit and error in the FAQ, but didn't see anything obvious) |
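A rough way to see how far the 21.7 claims sit from the rest is to compare claimed credit per hour across the list. The sketch below uses the runtime/claim pairs from this post; the cutoff of one quarter of the median rate is an arbitrary assumption chosen for illustration, not anything WCG uses.

```python
import statistics

# (runtime_hours, claimed_credit) pairs copied from the list in this post.
ERRORED_WUS = [
    (11.76, 449.9), (8.46, 309.0), (9.28, 585.3), (8.85, 456.9),
    (10.58, 313.0), (10.71, 425.8), (9.08, 273.2), (6.99, 346.2),
    (9.54, 374.4), (9.71, 488.4), (9.47, 292.9), (12.85, 538.6),
    (7.86, 238.6), (8.17, 444.2), (12.65, 21.7), (10.36, 428.6),
    (11.48, 449.7), (7.58, 272.0), (15.56, 21.7), (8.57, 367.1),
    (15.06, 21.7), (9.2, 328.6), (8.18, 397.7),
]


def flag_low_claims(rows, fraction=0.25):
    """Return rows whose credit per hour is below `fraction` of the median rate.

    `fraction` is a hypothetical threshold picked only for this example.
    """
    rates = [credit / hours for hours, credit in rows]
    cutoff = fraction * statistics.median(rates)
    return [row for row, rate in zip(rows, rates) if rate < cutoff]


if __name__ == "__main__":
    for hours, credit in flag_low_claims(ERRORED_WUS):
        print(f"{hours:>6.2f} h  claimed {credit}")
```

With this data the only rows flagged are the three 21.7 claims, all of which ran longer than 12 hours.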
||
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline |
I'm going to double the result file size limit to see if this helps. It seems like lots of small ligands are being packaged together in a workunit.
Thanks, -Uplinger |
||
|
marist_college
Advanced Cruncher USA Joined: Mar 30, 2005 Post Count: 107 Status: Offline |
Thanks, Uplinger. That's what it appears to be in terms of bundling. I'll keep an eye on the errors and let you know if I see more or if it improves.
||
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline |
Hopefully it should improve over the next day or so. I updated all results, but if your client has already downloaded the result before the change, it will still encounter the file too big issue.
With all vina projects, the estimator script learns from previous work units that have completed. It runs every 12 hours and updates the estimated runtimes for a ligand to help normalize the work unit runtime lengths. We populated the estimates with runtimes from other vina projects to start, but once we get results from zika only, it starts to get better.

For example, X 1 means the entry was pre-populated, and C <seconds> is the estimated time for that ligand to calculate:

T 17 A 37 C 298.76 X 1
T 17 A 37 C 971.238 X 4

From the before and after, you can see the estimate was off by about 3x. New work units will use the updated value and give better estimates.

Thanks,
-Uplinger

[Edited to add more details -Uplinger]
[Edit 1 times, last edit by uplinger at May 21, 2016 8:47:54 PM] |
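To put a number on the before/after estimate, the two sample lines can be read as letter/value pairs and the C values compared. This is only a sketch: the post explains C (estimated seconds) and X 1 (pre-populated), while the meaning of T, A, and other X values is not stated, so the code treats them as opaque fields. `parse_estimate` is a hypothetical helper, not part of the estimator script.

```python
def parse_estimate(line):
    """Split a line like 'T 17 A 37 C 298.76 X 1' into {'T': 17.0, ...}."""
    tokens = line.split()
    # Tokens alternate between a one-letter key and its numeric value.
    return {key: float(value) for key, value in zip(tokens[0::2], tokens[1::2])}


before = parse_estimate("T 17 A 37 C 298.76 X 1")   # pre-populated estimate
after = parse_estimate("T 17 A 37 C 971.238 X 4")   # after zika results came in

ratio = after["C"] / before["C"]

if __name__ == "__main__":
    print(f"estimate changed by a factor of {ratio:.2f}")
```

The ratio works out to roughly 3.25, consistent with the "off by 3x" figure in the post.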
||
|
nanoprobe
Master Cruncher Classified Joined: Aug 29, 2008 Post Count: 2998 Status: Offline |
I have several of these now running on my server. 1 is at 24 hours with an estimated 4 hours left. Several others are at 14 hours with 1-5 hours left. Don't know if these were from the same batch that Marist got but I'll update as they finish.
EDIT: FWIW I just scrolled through my desktops and several of them have these tasks running, and the longest one looks like it will take about 5 hours. Is there an explanation as to why the tasks running on the server are taking up to 5x longer when the desktops have less than 2x the CPU speed? Server memory use is only at 27%.

EDIT 2: Just had 1 report as error with the same code as Marist. Ran for 15.34 hours but only 278.4 points claimed.

Result Name: ZIKA_000000951_x1nb7_HCVJ4_RNAPol_JustProt_chnB_0308_0--
</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>ZIKA_000000951_x1nb7_HCVJ4_RNAPol_JustProt_chnB_0308_0_r756640083_0</file_name>
<error_code>-131 (file size too big)</error_code>
</file_xfer_error>
</message>
]]>

EDIT 3: The other 4 of that batch that I had are finished. 1 ran for 28 hours and ended with the file size too big error. The other 3 ran 12, 11 and 8 hours and they validated. Hope this helps.
----------------------------------------
In 1969 I took an oath to defend and protect the U S Constitution against all enemies, both foreign and Domestic. There was no expiration date.
----------------------------------------
[Edit 3 times, last edit by nanoprobe at May 22, 2016 4:59:51 PM] |
||
|
Jason1478963
Senior Cruncher United States Joined: Sep 18, 2005 Post Count: 295 Status: Offline |
Also getting a few results with very long run times (10.7-11.5 hrs) that eventually end with an upload error. Normal runtimes for this machine range from 50 min to 4.25 hrs.
System: Mint 13_X64 - Opteron 2431 - Boinc 7.0.27

Result Log
Result Name: ZIKA_000000928_x1nb7_HCVJ4_RNAPol_JustProt_chnB_0308_0--
<core_client_version>7.0.27</core_client_version>
<![CDATA[
<stderr_txt>
INFO: No state to restore. Start from the beginning.
[19:35:06] Number of tasks = 175
[19:35:06] Running task 0,CPU time at start of task 0 was 0.000000
[19:35:06] ./ZINC30518529_1.pdbqt size = 22 6 ../../projects/www.worldcommunitygrid.org/26aa5e8d1405aee7673165325575ecc7.pdbqt size = 5417 0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Deleted txt to shorten ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[10:37:32] Finished task #172 cpu time used 277.485341
[10:37:32] Running task 173,CPU time at start of task 173 was 40680.310359
[10:37:32] ./ZINC32654895.pdbqt size = 28 6 ../../projects/www.worldcommunitygrid.org/26aa5e8d1405aee7673165325575ecc7.pdbqt size = 5417 0
[10:41:45] Finished task #173 cpu time used 252.887805
[10:41:45] Running task 174,CPU time at start of task 174 was 40933.198164
[10:41:45] ./ZINC32654896.pdbqt size = 28 6 ../../projects/www.worldcommunitygrid.org/26aa5e8d1405aee7673165325575ecc7.pdbqt size = 5417 0
[10:45:53] Finished task #174 cpu time used 247.379460
10:45:53 (18033): called boinc_finish(0)
</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>ZIKA_000000928_x1nb7_HCVJ4_RNAPol_JustProt_chnB_0308_0_r272880322_0</file_name>
<error_code>-131</error_code>
</file_xfer_error>
</message>
]]> |
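The total CPU time for a result like this can be read straight out of the stderr log: the last "Running task" line gives the cumulative CPU time at the start of the final task, and the matching "Finished task" line gives that task's own CPU time. A minimal sketch, assuming the line formats shown in the log above (the regular expressions and the function name `total_cpu_seconds` are illustrative, not part of BOINC):

```python
import re

# Tail of the stderr log quoted above (ligand file lines omitted).
LOG_EXCERPT = """\
[10:37:32] Finished task #172 cpu time used 277.485341
[10:37:32] Running task 173,CPU time at start of task 173 was 40680.310359
[10:41:45] Finished task #173 cpu time used 252.887805
[10:41:45] Running task 174,CPU time at start of task 174 was 40933.198164
[10:45:53] Finished task #174 cpu time used 247.379460
"""

START = re.compile(r"Running task (\d+),CPU time at start of task \d+ was ([\d.]+)")
FINISH = re.compile(r"Finished task #(\d+) cpu time used ([\d.]+)")


def total_cpu_seconds(log_text):
    """Cumulative CPU seconds at the end of the last finished task."""
    starts = {int(n): float(t) for n, t in START.findall(log_text)}
    finishes = {int(n): float(t) for n, t in FINISH.findall(log_text)}
    last = max(finishes)  # highest task number that reported a finish
    return starts[last] + finishes[last]


if __name__ == "__main__":
    total = total_cpu_seconds(LOG_EXCERPT)
    print(f"{total:.3f} s = {total / 3600:.2f} CPU hours")
```

For this log the total comes to about 11.4 CPU hours, which lines up with the 10.7-11.5 hr run times mentioned in the post.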
||
|