World Community Grid Forums
Thread Status: Active | Total posts in this thread: 48
erich56
Senior Cruncher | Austria | Joined: Feb 24, 2007 | Post Count: 300 | Status: Offline
You can also add: OPNG_00866582

I wonder whether the project team could remove all the unsent tasks from these faulty batches.
[Edit 1 times, last edit by erich56 at Aug 18, 2021 7:17:02 PM]
Greger
Cruncher | Joined: Aug 1, 2013 | Post Count: 29 | Status: Offline
Got 135 failed tasks so far today with this issue. They would not start, so it would not hurt.
Hope upcoming batches will work better.
erich56
Senior Cruncher | Austria | Joined: Feb 24, 2007 | Post Count: 300 | Status: Offline
"They would not start, so it would not hurt."
Yes, in a way you are right. On the other hand, as long as tasks come in this rarely, it's a pity if some of the few tasks that do get downloaded then fail to work.
erich56
Senior Cruncher | Austria | Joined: Feb 24, 2007 | Post Count: 300 | Status: Offline
I've looked through my account (multiple machines), and found examples of this particular type of error from batches:
OPNG_0086575
OPNG_0086577
OPNG_0086579
OPNG_0086584
OPNG_0086585
OPNG_0086589
OPNG_0086592
OPNG_0086594
OPNG_0086595
OPNG_0086596
OPNG_0086603
All with the "The number of atom types found ... does not match" error, and all since 01:40 UTC today.

add: OPNG_0086587
erich56
Senior Cruncher | Austria | Joined: Feb 24, 2007 | Post Count: 300 | Status: Offline
"Got 135 failed tasks so far today with this issue. They would not start, so it would not hurt."
Not always, though. I just watched a task which, when it reached 100% in the BOINC progress bar, showed "computation error" :-(
Dark Angel
Veteran Cruncher | Australia | Joined: Nov 11, 2005 | Post Count: 728 | Status: Offline
I've only had seven errors, but they're all the same across both rigs:

autogrid4: ERROR: The number of atom types found in the receptor PDBQT (8) does not match the number specified by the "receptor_types" command (7) in the GPF!
----------------------------------------
Currently being moderated under false pretences
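That autogrid4 message says the receptor .pdbqt contains more distinct AutoDock atom types than the GPF's receptor_types line declares. For anyone who wants to check a failed task locally, here is a minimal sketch, assuming the receptor .pdbqt and .gpf files from the task are on disk; the file names are placeholders, and the script assumes the AutoDock atom type is the last whitespace-separated field of each ATOM/HETATM record.

    # Minimal sketch, not project code: compare atom types in a receptor .pdbqt
    # with the receptor_types line of the matching .gpf.
    def pdbqt_atom_types(path):
        types = set()
        with open(path) as f:
            for line in f:
                if line.startswith(("ATOM", "HETATM")):
                    types.add(line.split()[-1])  # AutoDock type assumed to be the last field
        return types

    def gpf_receptor_types(path):
        with open(path) as f:
            for line in f:
                if line.startswith("receptor_types"):
                    fields = line.split("#")[0].split()  # drop any trailing comment
                    return set(fields[1:])               # skip the keyword itself
        return set()

    pdbqt_types = pdbqt_atom_types("receptor.pdbqt")  # placeholder file names
    gpf_types = gpf_receptor_types("receptor.gpf")
    print("types in PDBQT:", sorted(pdbqt_types))
    print("types in GPF:  ", sorted(gpf_types))
    print("missing from receptor_types:", sorted(pdbqt_types - gpf_types))

A non-empty difference on the last line would reproduce the 8-versus-7 mismatch autogrid4 is complaining about.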
Grumpy Swede
Master Cruncher | Svíþjóð | Joined: Apr 10, 2020 | Post Count: 2495 | Status: Recently Active
So, why don't the admins/techs stop these WUs from being sent out? Not a word from any of them...
Acibant
Advanced Cruncher | USA | Joined: Apr 15, 2020 | Post Count: 126 | Status: Offline
"So, why don't the admins/techs stop these WUs from being sent out? Not a word from any of them..."
Clearly they were trying something different, given the difference in work unit sizes, but with those errors on so many I wish they had tried them first through the beta-testing route. Now there's a lot of wasted computation, as work units are sent to two different machines for validation for all those who lost their reliable status due to getting an error through no fault of their own.
erich56
Senior Cruncher | Austria | Joined: Feb 24, 2007 | Post Count: 300 | Status: Offline
I've looked through my account (multiple machines), and found examples of this particular type of error from batches:
OPNG_0086575
OPNG_0086577
OPNG_0086579
OPNG_0086584
OPNG_0086585
OPNG_0086589
OPNG_0086592
OPNG_0086594
OPNG_0086595
OPNG_0086596
OPNG_0086603
All with the "The number of atom types found ... does not match" error, and all since 01:40 UTC today.

add: OPNG_0086581
[Edit 1 times, last edit by erich56 at Aug 19, 2021 2:54:47 AM]
alanb1951
Veteran Cruncher | Joined: Jan 20, 2006 | Post Count: 1317 | Status: Offline
"So, why don't the admins/techs stop these WUs from being sent out? Not a word from any of them..."
"Clearly they were trying something different, given the difference in work unit sizes, but with those errors on so many I wish they had tried them first through the beta-testing route. Now there's a lot of wasted computation, as work units are sent to two different machines for validation for all those who lost their reliable status due to getting an error through no fault of their own."

For what it's worth, it appears that all the failing batches are for a specific receptor, named 7aga_001_mgltools--LYS102. There are several other receptors of similar size in surrounding batches which don't seem to be throwing errors, so hopefully it's not going to be an ongoing problem as long as someone works out what's wrong with that batch.

Should they beta-test every single new receptor to avoid things like this? After all, what's going to constitute a big enough change to mean issues might be expected... It wouldn't be the first time we've had problems here when something was amiss in either data or parameters -- remember the misplaced grids that caused quite a lot of Invalid tasks in the second half of April 2021?

As for stopping the WUs being sent out, I suspect that by the time sensible action could've been taken the entire batch was probably already out in the field - it wouldn't take long to queue retries on tasks that fail in seconds, and I suspect a lot of them were being sent back as errors within a couple of hours of receipt (all mine were!)

I tend to agree about the potential loss of reliable status, but it is what it is...

Cheers - Al

P.S. Looking at the main data file for the problem batch, it is the only time I've ever seen HETATM data in an OPN1/OPNG receptor .pdbqt file. Probably a coincidence, but interesting...
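Following up on that P.S.: one way to see whether the HETATM records are what introduce the extra atom type is to split the types found in the .pdbqt by record kind. This is only an illustrative sketch; the file name below is hypothetical, and, as in the earlier sketch, the AutoDock type is assumed to be the last whitespace-separated field of each record.

    # Illustrative sketch only: list AutoDock atom types contributed by ATOM records
    # versus HETATM records in a receptor .pdbqt (file name is hypothetical).
    from collections import defaultdict

    def types_by_record(path):
        seen = defaultdict(set)
        with open(path) as f:
            for line in f:
                record = line[:6].strip()               # PDB-style record name in columns 1-6
                if record in ("ATOM", "HETATM"):
                    seen[record].add(line.split()[-1])  # type assumed to be the final field
        return seen

    seen = types_by_record("7aga_001_receptor.pdbqt")
    hetatm_only = seen["HETATM"] - seen["ATOM"]
    print("types from ATOM records:  ", sorted(seen["ATOM"]))
    print("types only in HETATM rows:", sorted(hetatm_only))

If that second set is non-empty for the 7aga_001_mgltools--LYS102 receptor, it would fit the HETATM coincidence noted above.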