| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 68
|
|
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
mike, it's not a "choice". Usually if a work unit is only failing on some machines, and a hardware failure has been ruled out, then it will be some conflict with anti-virus software or something like that.
In this recent case, some have pointed the finger at some recent Windows updates, and others have blamed Symantec. It's too early to tell yet, I think, but with so many computers you and your team may have better luck spotting common factors. Have you installed any updates recently? By the way, none of these results have been penalised for their claimed points. When that happens, the result is counted as valid for the purposes of validating the result, so a fourth work unit isn't sent. |
||
|
|
mike047
Senior Cruncher Joined: Aug 22, 2006 Post Count: 262 Status: Offline
|
Hi,
The box in question has run for over 3 weeks without any known issue. The message log indicates no anomalies. There are no heat or random shutdown issues. This one is a dedicated cruncher; it has had no updates and does not run any kind of firewall or anti-virus utility. This unit had previously run QMC/Leiden for several months without issue. It is an Opteron on a good quality board.
Is there information in the invalid log that will give me a clue to the issue? I don't know what all that stuff means.
So, it seems the WU is actually "defective" and is not invalid due to the points claim. Am I understanding the process properly? Just trying to get a handle on this to better understand and maybe fix the problem.
I have 3 boxes with invalid units, one on each and two on the example; out of 43 boxes, that is fairly good.
Would a power outage cause a WU completion to be invalid? The power has gone out several times lately.
mike
----------------------------------------
Crunch Hard, Crunch Often [Edit 1 times, last edit by mike047 at Nov 23, 2006 6:01:33 PM] |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
mike047,
When I look at your sample and the fact that a 4th copy was sent, it's without doubt that the invalid was really invalid. If you look at the sequence, you returned at 16:00 and the 4th copy was sent at 16:06 (the top of the 4).
BTW, that quorum is a superior example of varying CPU times, yet with claims being within a very small range.
cheers
WCG
----------------------------------------
Please help to make the Forums an enjoyable experience for All! [Edit 2 times, last edit by Sekerob at Nov 23, 2006 6:14:11 PM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
It could be the power. In theory, BOINC can withstand power failures and sudden shutdowns. In practice, sometimes things go wrong and files get corrupted.
When the WCG staff get back from Thanksgiving, I'll ask if the overall error rates have gone up, and for an update on the reported issues. Do you use the BOINC screensaver, or view the graphics? That is the only recent change to the FAAH executable, and it did cause some issues. Have you changed anything else in your operating environment recently? Any thoughts you have that will enable the WCG techs to reproduce the problem will be very helpful. Is BOINC running any other projects besides WCG? |
||
|
|
mike047
Senior Cruncher Joined: Aug 22, 2006 Post Count: 262 Status: Offline
|
I am running only WCG on this box at the present.
No screen saver; I just run it from boot and, if I want to check it, open BOINC Manager. I didn't know you could view the graphics. No changes. I manage to keep a fairly good record [in a bound book] of my crunchers, problems and upgrades. Us old guys can't remember like we once did.
mike
Crunch Hard, Crunch Often |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Well, we'll have to see what the techs can discover.
Meanwhile, you may want to investigate UPS options. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
One of my PCs is listed above; it's the Sempron 2800+ @ 2800 MHz.
Its computer ID is 71980. I have only had two invalid results so far. Those were also within a few days. I have not installed any Windows updates or any additional applications, and I am not running any firewall or anti-virus software on that machine.
From my understanding, if a PC computes an error, due to being unstable or for whatever reason, the WU is marked as a computation error in the message tab. So I assume that these are not errors during the computation, but that there is still something wrong with their validation.
Here are the result logs:

Result Log
<core_client_version>5.4.11</core_client_version>
<stderr_txt>
About to call graphics init
[DIAG] Crop rect (T,L - B,R): 48, 40 - 1199, 1193
Start Stage2 for Filter Bank #0 SetFilterMap Finished deleteFilterMap End SetFilterMap
Start Stage2 for Filter Bank #1 SetFilterMap Finished deleteFilterMap End SetFilterMap
Start Stage2 for Filter Bank #2 SetFilterMap Finished deleteFilterMap End SetFilterMap
Start Stage2 for Filter Bank #3 SetFilterMap Finished deleteFilterMap End SetFilterMap
Start Stage2 for Filter Bank #4 SetFilterMap Finished deleteFilterMap End SetFilterMap
Start Stage2 for Filter Bank #5 SetFilterMap Finished deleteFilterMap End SetFilterMap
Start Stage2 for Filter Bank #6 SetFilterMap Finished deleteFilterMap End SetFilterMap
Start Stage2 for Filter Bank #7 SetFilterMap Finished deleteFilterMap End SetFilterMap
TMA finishing with return code: 0
</stderr_txt>
-------------------------------------------------------
Result Log
<core_client_version>5.4.11</core_client_version>
<stderr_txt>
About to call graphics init
[DIAG] Crop rect (T,L - B,R): 46, 13 - 1185, 1152
Start Stage2 for Filter Bank #0 SetFilterMap Finished deleteFilterMap End SetFilterMap
Start Stage2 for Filter Bank #1 SetFilterMap Finished deleteFilterMap End SetFilterMap
Start Stage2 for Filter Bank #2 SetFilterMap Finished deleteFilterMap End SetFilterMap
Start Stage2 for Filter Bank #3 SetFilterMap Finished deleteFilterMap End SetFilterMap
Start Stage2 for Filter Bank #4 SetFilterMap Finished deleteFilterMap End SetFilterMap
Start Stage2 for Filter Bank #5 SetFilterMap Finished deleteFilterMap End SetFilterMap
Start Stage2 for Filter Bank #6 SetFilterMap Finished deleteFilterMap End SetFilterMap
Start Stage2 for Filter Bank #7 SetFilterMap Finished deleteFilterMap End SetFilterMap
TMA finishing with return code: 0
</stderr_txt> |
||
|
|
mike047
Senior Cruncher Joined: Aug 22, 2006 Post Count: 262 Status: Offline
|
"Well, we'll have to see what the techs can discover. Meanwhile, you may want to investigate UPS options."
I guess that I will have to suffer, as I can't afford a UPS for 43 boxes.
mike
Crunch Hard, Crunch Often |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Just so people don't get the idea that we're being unresponsive, the WCG Tech team is aware of this thread. It is the Thanksgiving Holiday today, though, and most of the team is on vacation until Monday. We'll all have a closer look at the results at that time.
Thanks for providing the device IDs for the machines. Have a happy holiday, for those to whom it applies. |
||
|
|
mike047
Senior Cruncher Joined: Aug 22, 2006 Post Count: 262 Status: Offline
|
I have just noticed that a lot of these indicated invalids are from a similar time frame: 11-17 thru 11-20, Friday thru Monday of last weekend.
Did any event here take place during that time frame?
mike
Crunch Hard, Crunch Often |
||
|
|
|