| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 32
|
|
| Author |
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Over the last 24 hours we seem to be getting a lot of results from many machines coming back as errors across the whole quorum. Is there a problem with some of the latest FAAH WUs at the moment?
Here are just 3 examples - there are many, many more: [image] [image] [image]
This of course now means that some of these machines are being restricted in the amount of work they receive, because the server 'fail-safe' is kicking in. Could the Techs please take a look at this, as we are losing a huge amount of credit for work completed, and also reset our machines' WU allowance so we can get the work we need? Cheers. |
||
|
|
rebirther
Cruncher Germany Joined: Nov 19, 2005 Post Count: 29 Status: Offline
|
I have found more with Genome Comparison. The last WUs I got show many errors in the list. Validator problem?
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
If it helps - I'm getting the same FAAH errors here too. It seems to have been occurring since overnight 05.14.07, when the 3rd result in the quorum arrives and validation is attempted.
I've lost quite a few now myself as a result. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"If it helps - I'm getting the same FAAH errors here too - seems to be occurring from overnight 05.14.07 when the 3rd result in the quorum arrives and then it tries for validation. I've lost quite a few now myself as a result"
Ady: I just checked the team's (XS_Team_Admin) account and we have over 23 PAGES of them going back to May 9. I have some myself on the Clovertown machine. All it does is FAAH units, and it never errors out.
Movieman |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Jeez, that's a lot of work, Dave.
I hope we all get credited for these once it's sorted out. |
||
|
|
olympic
Senior Cruncher Joined: Jun 12, 2005 Post Count: 156 Status: Offline |
Same here, but only from the last 24 hours or so. The WUs seem to finish normally, so hopefully this is just a problem with validation and all will be well once they run it again. I'm seeing errors across all projects, including FAAH, GC and HPF2, so I'm confident the problem is with validation. |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
I vaguely remember a bug about 3-4 weeks ago where the work units were not in fact in 'error' and a validation rerun sorted it out, that is, if the Homogeneous Redundancy distribution logic did not break.
It's night at the office, regrettably, so we have to wait it out till the technicians get in.
Added: Just checked that all work for an HPF2 job has been turning to error, i.e. when the quorum 15 was reached. That project does not use the HR rule.
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! [Edit 4 times, last edit by Sekerob at May 14, 2007 10:00:51 AM] |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
If you have loads of work in your Task Buffer, I propose suspending network activity and crunching on. That way you will not establish a bad client record. I'm doing so, as I have 2 days' worth.
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Yeah, it just hit one of my HPF2 results - I have suspended the network...
The group credit for this WU had already been given, so it happened when it tried to validate my result against all the others already validated....
----------------------------------------
[Edit 1 times, last edit by Former Member at May 14, 2007 10:31:50 AM] |
||
|
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline
|
We are investigating now - I cannot log on to one of our servers.
In the meantime, I have disabled the schedulers so that no more work is returned until we have resolved the problem. |
||
|
|
|