Thread Status: Active
Total posts in this thread: 68
This topic has been viewed 6901 times and has 67 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Problem: Invalid Working Units in large numbers. Please help.

mike, it's not a "choice". Usually if a work unit is only failing on some machines, and a hardware failure has been ruled out, then it will be some conflict with anti-virus software or something like that.

In this recent case, some have pointed the finger at some recent Windows updates, and others have blamed Symantec. It's too early to tell yet, I think, but with so many computers you and your team may have better luck spotting common factors. Have you installed any updates recently?

By the way, none of these results have been penalised for their claimed points. When a result is penalised that way, it still counts as valid for the purposes of validating the work unit, so a fourth copy isn't sent.
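
A minimal Python sketch of the rule described above, for anyone who finds it easier to read as code. It is not WCG's or BOINC's actual validator: the quorum of three is an assumption inferred from the mention of a "fourth" copy, and `Result` and `copies_to_resend` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Result:
    host: str
    output_matches: bool           # True if the computed output agrees with the others
    claim_penalised: bool = False  # True if only the claimed points were adjusted


def copies_to_resend(results, quorum=3):
    """Return how many extra copies would be needed to reach the quorum.

    Assumption: a penalised points claim does not make a result invalid,
    so only results whose output does not match reduce the valid count.
    """
    valid = sum(1 for r in results if r.output_matches)
    return max(0, quorum - valid)
```

Under these assumptions, three matching results need no extra copy even if one had its claim penalised, while one mismatching result out of three means one more copy goes out.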
[Nov 23, 2006 3:35:08 PM]
mike047
Senior Cruncher
Joined: Aug 22, 2006
Post Count: 262
Status: Offline
Re: Problem: Invalid Working Units in large numbers. Please help.

Hi,
The box in question has run for over 3 weeks without any known issue. The message log indicates no anomalies. There are no heat or random shutdown issues. This one is a dedicated cruncher; it has had no updates and does not run any kind of firewall or anti-virus utility. This unit had previously run QMC/Leiden for several months without issue. It is an Opteron on a good quality board.

Is there information in the invalid log that will give me a clue to the issue? I don't know what all that stuff means.

So, it seems the WU is actually "defective" and is not invalid due to the points claim. Am I understanding the process properly?

Just trying to get a handle on this to better understand and maybe fix the problem.

I have 3 boxes with invalid units, one on each and two on the example. Out of 43 boxes, that is fairly good :)

Would a power outage cause a completed WU to be invalid? The power has gone out several times lately.
----------------------------------------
mike
Crunch Hard, Crunch Often


----------------------------------------
[Edit 1 times, last edit by mike047 at Nov 23, 2006 6:01:33 PM]
[Nov 23, 2006 6:00:26 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Problem: Invalid Working Units in large numbers. Please help.

mike047,

When I look at your sample and the fact that a 4th copy was sent, there is no doubt that the invalid result was really invalid. If you look at the sequence, you returned at 16:00 and the 4th copy was sent at 16:06 (the top of the 4).

BTW, that quorum is a superior example of varying CPU times with claims nonetheless falling within a very small range.

cheers
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 2 times, last edit by Sekerob at Nov 23, 2006 6:14:11 PM]
[Nov 23, 2006 6:09:05 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Problem: Invalid Working Units in large numbers. Please help.

It could be the power. In theory, BOINC can withstand power failures and sudden shutdowns. In practice, sometimes things go wrong and files get corrupted.

When the WCG staff get back from Thanksgiving, I'll ask if the overall error rates have gone up, and for an update on the reported issues.

Do you use the BOINC screensaver, or view the graphics? That is the only recent change to the FAAH executable, and it did cause some issues.

Have you changed anything else in your operating environment recently? Any thoughts you have that will enable the WCG techs to reproduce the problem will be very helpful.

Is BOINC running any other projects besides WCG?
[Nov 23, 2006 6:17:17 PM]
mike047
Senior Cruncher
Joined: Aug 22, 2006
Post Count: 262
Status: Offline
Re: Problem: Invalid Working Units in large numbers. Please help.

I am running only WCG on this box at present.

No screen saver. I just run it from boot, and if I want to check it, I open BOINC Manager. I didn't know you could view the graphics (blushing).

No changes. I manage to keep a fairly good record (in a bound book) of my crunchers, problems and upgrades. Us old guys can't remember like we once did :D
----------------------------------------
mike
Crunch Hard, Crunch Often


[Nov 23, 2006 6:22:48 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Problem: Invalid Working Units in large numbers. Please help.

Well, we'll have to see what the techs can discover.

Meanwhile, you may want to investigate UPS options.
[Nov 23, 2006 6:25:36 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Problem: Invalid Working Units in large numbers. Please help.

One of my PCs is listed above; it's the Sempron 2800+ @ 2800 MHz.
Its computer ID is 71980.
I have only had two invalid results so far. Those were also within a few days.
I have not installed any Windows updates or any additional applications, and I am not running any firewall or anti-virus software on that machine.

My understanding was that if a PC computes an error, due to being unstable or for whatever reason, the WU is marked as a computation error in the message tab.
So I assume that they are not producing errors during the computation, but that there is still something wrong with their validation.

Here are the result logs:

Result Log

<core_client_version>5.4.11</core_client_version>
<stderr_txt>
About to call graphics init
[DIAG] Crop rect (T,L - B,R): 48, 40 - 1199, 1193
Start Stage2 for Filter Bank #0
SetFilterMap
Finished deleteFilterMap
End SetFilterMap
Start Stage2 for Filter Bank #1
SetFilterMap
Finished deleteFilterMap
End SetFilterMap
Start Stage2 for Filter Bank #2
SetFilterMap
Finished deleteFilterMap
End SetFilterMap
Start Stage2 for Filter Bank #3
SetFilterMap
Finished deleteFilterMap
End SetFilterMap
Start Stage2 for Filter Bank #4
SetFilterMap
Finished deleteFilterMap
End SetFilterMap
Start Stage2 for Filter Bank #5
SetFilterMap
Finished deleteFilterMap
End SetFilterMap
Start Stage2 for Filter Bank #6
SetFilterMap
Finished deleteFilterMap
End SetFilterMap
Start Stage2 for Filter Bank #7
SetFilterMap
Finished deleteFilterMap
End SetFilterMap
TMA finishing with return code: 0

</stderr_txt>

-------------------------------------------------------

Result Log

<core_client_version>5.4.11</core_client_version>
<stderr_txt>
About to call graphics init
[DIAG] Crop rect (T,L - B,R): 46, 13 - 1185, 1152
Start Stage2 for Filter Bank #0
SetFilterMap
Finished deleteFilterMap
End SetFilterMap
Start Stage2 for Filter Bank #1
SetFilterMap
Finished deleteFilterMap
End SetFilterMap
Start Stage2 for Filter Bank #2
SetFilterMap
Finished deleteFilterMap
End SetFilterMap
Start Stage2 for Filter Bank #3
SetFilterMap
Finished deleteFilterMap
End SetFilterMap
Start Stage2 for Filter Bank #4
SetFilterMap
Finished deleteFilterMap
End SetFilterMap
Start Stage2 for Filter Bank #5
SetFilterMap
Finished deleteFilterMap
End SetFilterMap
Start Stage2 for Filter Bank #6
SetFilterMap
Finished deleteFilterMap
End SetFilterMap
Start Stage2 for Filter Bank #7
SetFilterMap
Finished deleteFilterMap
End SetFilterMap
TMA finishing with return code: 0

</stderr_txt>
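
For anyone comparing logs like the two above, here is a minimal Python sketch that checks whether a stderr log shows every filter bank starting and finishing, and a zero return code. The function name `summarize_stderr` is hypothetical, and the figure of eight filter banks is an assumption taken only from these two logs, not from any documentation of the application.

```python
import re


def summarize_stderr(log_text, expected_banks=8):
    """Summarize a stderr log in the format shown in the result logs above."""
    # Bank numbers that were started, e.g. "Start Stage2 for Filter Bank #3"
    started = {
        int(n)
        for n in re.findall(r"Start Stage2 for Filter Bank #(\d+)", log_text)
    }
    # Each bank is expected to end with one "End SetFilterMap" line
    completed = log_text.count("End SetFilterMap")
    # Final line, e.g. "TMA finishing with return code: 0"
    match = re.search(r"finishing with return code: (-?\d+)", log_text)
    return_code = int(match.group(1)) if match else None
    return {
        "banks_started": sorted(started),
        "banks_completed": completed,
        "return_code": return_code,
        "looks_clean": (
            started == set(range(expected_banks))
            and completed == expected_banks
            and return_code == 0
        ),
    }
```

Both logs above would pass this check, which fits the point that whatever makes these results invalid is not showing up in stderr. A clean log here says nothing about whether the result will validate.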
[Nov 23, 2006 6:41:54 PM]
mike047
Senior Cruncher
Joined: Aug 22, 2006
Post Count: 262
Status: Offline
Re: Problem: Invalid Working Units in large numbers. Please help.

Well, we'll have to see what the techs can discover.

Meanwhile, you may want to investigate UPS options.


I guess that I will have to suffer :D
as I can't afford a UPS for 43 boxes :(
----------------------------------------
mike
Crunch Hard, Crunch Often


[Nov 23, 2006 6:50:12 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Problem: Invalid Working Units in large numbers. Please help.

Just so people don't get the idea that we're being unresponsive, the WCG Tech team is aware of this thread. It is the Thanksgiving Holiday today, though, and most of the team is on vacation until Monday. We'll all have a closer look at the results at that time.

Thanks for providing the device IDs for the machines.

Have a happy holiday for those to whom it applies.
[Nov 23, 2006 6:52:28 PM]
mike047
Senior Cruncher
Joined: Aug 22, 2006
Post Count: 262
Status: Offline
Re: Problem: Invalid Working Units in large numbers. Please help.

I have just noticed that a lot of these indicated invalids are from a similar time frame: 11-17 through 11-20, Friday through Monday of last weekend.

Did any event take place here during that time frame?
----------------------------------------
mike
Crunch Hard, Crunch Often


[Nov 23, 2006 6:57:55 PM]