| World Community Grid Forums
Thread Status: Active Total posts in this thread: 9
KWSN - A Shrubbery
Master Cruncher Joined: Jan 8, 2006 Post Count: 1585 Status: Offline |
I brought this up in the server upgrade thread, but it quickly got buried and ignored.
Several results are going to error status when passing through the first validation check instead of being marked inconclusive. This work unit is a very good example of the problem:
CMD2_ 2243-1C1Y_ B.clustersOccur-1ZG2_ A.clustersOccur_ 0_ 3-- | - | In Progress | 3/19/12 17:09:40 | 3/25/12 17:09:40 | 0.00 | 0.0 / 0.0
CMD2_ 2243-1C1Y_ B.clustersOccur-1ZG2_ A.clustersOccur_ 0_ 2-- | 640 | Error | 3/18/12 21:32:51 | 3/19/12 15:04:25 | 10.71 | 53.0 / 0.0
CMD2_ 2243-1C1Y_ B.clustersOccur-1ZG2_ A.clustersOccur_ 0_ 0-- | 640 | Error | 3/15/12 12:47:54 | 3/18/12 21:28:15 | 3.60 | 62.0 / 0.0
CMD2_ 2243-1C1Y_ B.clustersOccur-1ZG2_ A.clustersOccur_ 0_ 1-- | 640 | Inconclusive | 3/15/12 12:47:31 | 3/15/12 21:16:12 | 4.04 | 88.5 / 0.0
The two results with error status show a normal completion log. If the techs could look into this, I would appreciate it.
----------------------------------------
Distributed computing volunteer since September 27, 2000
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
What version of BOINC are you running? A third-party version or the latest one for WCG? 6.10.58 is the latest for WCG. If you are using an older version, or one tailored for another project you work with, errors can happen with some projects.
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I have seen cases where both original results were PV (Pending Validation) and then, when the validator got to them, one went to Error and the other to Inconclusive. In my opinion it would be preferable to leave the provisionally good one at Pending Validation. This is seen irrespective of client version [the clients of the quorum participants can be checked]. The agents either flat out break the science app/task that does the work, or they just manage the task traffic... 6.10.58 eliminates any doubt there may be.
4th party, by the way, as the Berkeley-recommended versions are mostly fine too [sometimes they're too quick to declare a new point release ready for general production]. I'm surprised, though, to read that the YUM repositories of various Linux distros actually carry the version 7 alpha clients. That's flat foolish... use at your own risk. I'm testing and, but for one documented case in the first hour after the new server software went live, have not had a single bad result, and it cannot even be proven that the client caused that one. --//--
||
|
|
KWSN - A Shrubbery
Master Cruncher Joined: Jan 8, 2006 Post Count: 1585 Status: Offline |
"What version of BOINC are you running? A third-party version or the latest one for WCG? 6.10.58 is the latest for WCG. If you are using an older version, or one tailored for another project you work with, errors can happen with some projects."
Clearly people misunderstood this post. These work units are not erroring. They are not all mine; many times it is the wingman who is assigned error status. The problem is with the server code and needs to be looked at from that side. If they're invalid, they need to be marked as such, but I suspect they aren't even that. I remember one unit in particular where the only two results returned, both with normal finishes, were marked error. If a substantial percentage of the work (and from what I can tell this is running about 2%) is being wrongly marked as error, then there is a tremendous waste of CPU power across the grid.
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
How many "people" were there responding? Yes, the thought had occurred before in the other thread that these results were erroneously rated as failed. That would be a science-specific validator rule, not necessarily widespread. Let's see what the techs have to say first.
--//--
||
|
|
KWSN - A Shrubbery
Master Cruncher Joined: Jan 8, 2006 Post Count: 1585 Status: Offline |
And it may be something specific to HCMD2. I certainly haven't seen this behavior prior to the server code upgrade. In the other thread my reply followed a totally different concern and was probably just skipped over.
As long as this has reached the attention of the techs, I'll be happy. As a CA you have the power to bring it to their attention. When one is losing ~72 hours per day to a bug, it's difficult to consider it trivial. Also, I would never consider using a 7.x client. I'm dissatisfied enough with the 6.12.x series.
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Of course, with comments like "just skipped over" and "ignored", you may want to consider sending a support mail. That works for me when there is good reason to use that pathway... 72 hours/day is not Chicken Little.
In the nearly 1000 results I've completed since server 7 went up, most of them for SN2S, I've seen 2 go inconclusive at quorum 2, so I'm optimistic. --//--
edit: And seeing a tech reading this thread, consider it reported up the ladder.
[Edit 1 times, last edit by Former Member at Mar 20, 2012 5:17:49 PM]
||
|
|
KWSN - A Shrubbery
Master Cruncher Joined: Jan 8, 2006 Post Count: 1585 Status: Offline |
Including a couple more results for comparison. These are on both Ubuntu 11.10 and Windows 7.
https://secure.worldcommunitygrid.org/ms/devi...s.do?workunitId=418201304
CMD2_ 2243-1C1Y_ B.clustersOccur-1ZG2_ A.clustersOccur_ 0_ 3-- | 640 | Valid | 3/19/12 17:09:40 | 3/20/12 11:25:40 | 5.49 | 78.0 / 83.3
CMD2_ 2243-1C1Y_ B.clustersOccur-1ZG2_ A.clustersOccur_ 0_ 2-- | 640 | Error | 3/18/12 21:32:51 | 3/19/12 15:04:25 | 10.71 | 53.0 / 0.0
CMD2_ 2243-1C1Y_ B.clustersOccur-1ZG2_ A.clustersOccur_ 0_ 0-- | 640 | Error | 3/15/12 12:47:54 | 3/18/12 21:28:15 | 3.60 | 62.0 / 0.0
CMD2_ 2243-1C1Y_ B.clustersOccur-1ZG2_ A.clustersOccur_ 0_ 1-- | 640 | Valid | 3/15/12 12:47:31 | 3/15/12 21:16:12 | 4.04 | 88.5 / 83.3
https://secure.worldcommunitygrid.org/ms/devi...s.do?workunitId=419359568
CMD2_ 2175-1NZW_ A.clustersOccur-3DPT_ B.clustersOccur_ 45_ 65905_ 66049_ 2-- | 640 | Pending Validation | 3/16/12 23:12:08 | 3/17/12 18:08:00 | 4.11 | 97.3 / 0.0
CMD2_ 2175-1NZW_ A.clustersOccur-3DPT_ B.clustersOccur_ 45_ 65905_ 66049_ 1-- | 640 | Error | 3/16/12 18:07:30 | 3/16/12 22:31:40 | 0.79 | 28.4 / 0.0
CMD2_ 2175-1NZW_ A.clustersOccur-3DPT_ B.clustersOccur_ 45_ 65905_ 66049_ 0-- | - | In Progress | 3/16/12 18:07:26 | 3/28/12 18:07:26 | 0.00 | 0.0 / 0.0
https://secure.worldcommunitygrid.org/ms/devi...s.do?workunitId=418969170
CMD2_ 2249-1BBN_ A.clustersOccur-1RW6_ A.clustersOccur_ 3_ 3-- | 640 | Valid | 3/19/12 05:25:33 | 3/19/12 22:07:14 | 5.24 | 67.4 / 204.1
CMD2_ 2249-1BBN_ A.clustersOccur-1RW6_ A.clustersOccur_ 3_ 2-- | 640 | Valid | 3/19/12 05:25:29 | 3/20/12 00:45:51 | 7.63 | 340.8 / 204.1
CMD2_ 2249-1BBN_ A.clustersOccur-1RW6_ A.clustersOccur_ 3_ 1-- | 640 | Error | 3/15/12 22:52:06 | 3/17/12 15:47:22 | 4.82 | 308.6 / 0.0
CMD2_ 2249-1BBN_ A.clustersOccur-1RW6_ A.clustersOccur_ 3_ 0-- | 640 | Error | 3/15/12 22:51:45 | 3/19/12 03:48:54 | 6.18 | 142.1 / 0.0
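For anyone who wants to put a number on the lost time, the rows quoted in this post can be tallied by status. This is only a rough sketch: it assumes the first figure after the return time is CPU time in hours, and the names `RESULTS`, `cpu_hours_by_status` and `error_share` are made up for illustration. The three work units were hand-picked as examples, so the share it reports says nothing about the grid-wide rate.

```python
from collections import defaultdict

# (work unit ID from the quoted URL, status, CPU hours), copied from the rows above.
# Assumption: the first number after the return time is CPU time in hours.
RESULTS = [
    (418201304, "Valid", 5.49),
    (418201304, "Error", 10.71),
    (418201304, "Error", 3.60),
    (418201304, "Valid", 4.04),
    (419359568, "Pending Validation", 4.11),
    (419359568, "Error", 0.79),
    (419359568, "In Progress", 0.00),
    (418969170, "Valid", 5.24),
    (418969170, "Valid", 7.63),
    (418969170, "Error", 4.82),
    (418969170, "Error", 6.18),
]


def cpu_hours_by_status(results):
    """Sum CPU hours per result status."""
    totals = defaultdict(float)
    for _workunit, status, hours in results:
        totals[status] += hours
    return dict(totals)


def error_share(results):
    """Fraction of returned CPU hours that ended up in Error status."""
    totals = cpu_hours_by_status(results)
    # Results still in progress have not been returned, so leave them out.
    returned = sum(h for s, h in totals.items() if s != "In Progress")
    return totals.get("Error", 0.0) / returned if returned else 0.0


if __name__ == "__main__":
    for status, hours in sorted(cpu_hours_by_status(RESULTS).items()):
        print(f"{status:20s} {hours:6.2f} h")
    print(f"Error share of returned CPU time: {error_share(RESULTS):.1%}")
```

For these eleven rows that comes to 26.10 hours in Error against 22.40 hours in Valid.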
||
|
|
KWSN - A Shrubbery
Master Cruncher Joined: Jan 8, 2006 Post Count: 1585 Status: Offline |
I haven't seen any results wrongfully go to error status in the last 24 hours, and I only lost 39.5 hours in the 24 hours prior to that. Looks like it has either been fixed or moved on by itself.