Thread Status: Active
Total posts in this thread: 10
This topic has been viewed 1839 times and has 9 replies
KerSamson
Master Cruncher
Switzerland
Joined: Jan 29, 2007
Post Count: 1671
Status: Offline
Invalid results on Linux / AMD

After a couple of weeks or months without significant trouble, one of my hosts produced over 25 invalid results, mostly in batch SCC1_0000124_Lin-CSD-A (with only a few invalid WUs in batches 118, 120, 123 and 125).
As usual with Vina-based projects, the affected host has an AMD Athlon II x4 CPU.
I assume it is probably an affinity problem with my wingmen.
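To see whether invalid results really cluster in one batch, a small tally can help. The following is only a minimal sketch: it assumes that result names start with the batch identifier in the form SCC1_<batch number>_, and the sample names (everything after the batch prefix) are hypothetical placeholders, not real results.

```python
import re
from collections import Counter

# Assumption: each result name begins with "SCC1_<batch number>_".
BATCH_RE = re.compile(r"^SCC1_(\d+)_")


def invalids_per_batch(result_names):
    """Count invalid results per batch number, ignoring unrecognised names."""
    counts = Counter()
    for name in result_names:
        match = BATCH_RE.match(name)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Hypothetical names, copied by hand from a results page in practice.
sample_names = [
    "SCC1_0000124_Lin-CSD-A_example1",
    "SCC1_0000124_Lin-CSD-A_example2",
    "SCC1_0000124_Lin-CSD-A_example3",
    "SCC1_0000118_Lin-CSD-A_example4",
]

counts = invalids_per_batch(sample_names)
for batch, count in counts.most_common():
    print(f"batch {batch}: {count} invalid")
```

If one batch dominates the tally, that points more towards the work units (or the wingmen paired on them) than towards the host itself.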
Cheers,
Yves
----------------------------------------
[Mar 6, 2017 5:30:01 PM]
Eric_Kaiser
Veteran Cruncher
Germany (Hessen)
Joined: May 7, 2013
Post Count: 1047
Status: Offline
Re: Invalid results on Linux / AMD

I have had two AMD Kabini systems running SCC on Linux since the start of this project. No invalids on my side.
----------------------------------------

[Mar 7, 2017 5:14:17 AM]
KerSamson
Master Cruncher
Switzerland
Joined: Jan 29, 2007
Post Count: 1671
Status: Offline
Re: Invalid results on Linux / AMD

Hi Eric,
it can go well for weeks (sometimes months), and then suddenly there are a lot of invalid WUs at result validation without any changes on the host. Typically it occurs with Vina-based projects.
Cheers,
Yves
----------------------------------------
[Mar 7, 2017 11:55:49 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Invalid results on Linux / AMD

If a system runs BOINC, it's not a mission-critical system that can't be rebooted, and I say that with a period. At the very least, I'd cycle the client with sudo boinc-client -restart on the command line when this starts happening.
[Mar 7, 2017 12:08:25 PM]
Eric_Kaiser
Veteran Cruncher
Germany (Hessen)
Joined: May 7, 2013
Post Count: 1047
Status: Offline
Re: Invalid results on Linux / AMD

Yves, I have never seen this behaviour on my AMDs.
I have been running them 24/7 for three years now on FAH, OET, ZIKA and SCC.
I check for invalids on all my computers a couple of times a week.
Except for the SCC for Android app, I have a very, very low rate of invalids (close to zero).
----------------------------------------

[Mar 7, 2017 12:22:21 PM]
AgrFan
Senior Cruncher
USA
Joined: Apr 17, 2008
Post Count: 366
Status: Offline
Re: Invalid results on Linux / AMD

I have an AMD Athlon II x4 640 running VINA work perfectly fine. No invalids here.
Ubuntu Server 14.04 LTS 64-bit
Linux 4.4.0-53-generic
4GB RAM
----------------------------------------
[Edit 2 times, last edit by AgrFan at Mar 7, 2017 12:58:12 PM]
[Mar 7, 2017 12:54:44 PM]
flynryan
Senior Cruncher
United States
Joined: Aug 15, 2006
Post Count: 235
Status: Offline
Re: Invalid results on Linux / AMD

No invalids or errors on my 56 AMD/Linux cores.
[Mar 7, 2017 2:25:34 PM]
KerSamson
Master Cruncher
Switzerland
Joined: Jan 29, 2007
Post Count: 1671
Status: Offline
Re: Invalid results on Linux / AMD

I run two AMD Athlon II x4 (640) systems with Linux Mint 17.2 (still kernel 3.16). However, the two CPUs were not purchased at the same time, so I suspect a CPU mask difference.
This specific host is the only one that experiences such waves of invalid results from time to time, and it has done so since Vina was first launched in support of the sciences. The results become invalid during validation; there are no errors during computation. After a couple of hours or days it is quiet again. Even during a period with invalid results, there are many valid results as well. That is why I suspect an "affinity" problem between wingmen.
Since the reason for the "invalidity" is not stated, I can only make assumptions.
If the tech team could e-mail me the reason for the failed validation, it might help to understand the cause(s) of this recurring problem.
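One way to check the suspected CPU mask difference between the two hosts is to compare what each one reports in /proc/cpuinfo. The following is a minimal sketch, not a definitive diagnostic: the field names are the ones Linux prints on x86, while the function names and the sample dumps (including the stepping values) are invented purely for illustration.

```python
# Identification fields that Linux lists per processor in /proc/cpuinfo on x86.
FIELDS = ("model name", "cpu family", "model", "stepping", "microcode")


def parse_cpuinfo(text):
    """Return the key/value pairs of the first processor entry as a dict."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            if info:
                break  # a blank line ends the first processor block
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def diff_cpus(text_a, text_b):
    """Return {field: (value_a, value_b)} for every field that differs."""
    a, b = parse_cpuinfo(text_a), parse_cpuinfo(text_b)
    diffs = {}
    for field in FIELDS:
        if a.get(field) != b.get(field):
            diffs[field] = (a.get(field), b.get(field))
    flags_a = set(a.get("flags", "").split())
    flags_b = set(b.get("flags", "").split())
    if flags_a != flags_b:
        # Report the flags that only one of the two hosts has.
        diffs["flags"] = (sorted(flags_a - flags_b), sorted(flags_b - flags_a))
    return diffs


# Invented sample dumps; the stepping values are placeholders, not measurements.
HOST_A = """processor : 0
model name : AMD Athlon(tm) II X4 640 Processor
stepping : 2
flags : fpu sse sse2
"""
HOST_B = """processor : 0
model name : AMD Athlon(tm) II X4 640 Processor
stepping : 3
flags : fpu sse sse2
"""

print(diff_cpus(HOST_A, HOST_B))
```

In practice, the two strings would be the contents of /proc/cpuinfo saved on each host. An empty result would suggest that the two CPUs identify themselves identically, at least as far as the kernel reports.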
Cheers,
Yves
----------------------------------------
[Mar 7, 2017 8:31:35 PM]
Sgt.Joe
Ace Cruncher
USA
Joined: Jul 4, 2006
Post Count: 7581
Status: Offline
Re: Invalid results on Linux / AMD

Perhaps you have already thought of this, but could heat be a transient problem with that one system? The other thought that comes to mind is a motherboard problem, such as a weak capacitor. Has that system ever been subjected to a transient voltage spike? Other than these, I would have no clue.
Cheers
----------------------------------------
Sgt. Joe
*Minnesota Crunchers*
[Mar 7, 2017 11:40:02 PM]
KerSamson
Master Cruncher
Switzerland
Joined: Jan 29, 2007
Post Count: 1671
Status: Offline
Re: Invalid results on Linux / AMD

Hi Sgt. Joe,
I did consider this point as well. Both Athlon II x4 systems sit side by side in my office (with clean power). Since the host also generates valid results without any action on my part and "recovers" for weeks or months, I do not think that could be the reason. Likewise, I do not think that a possible RAM defect could cause this particular behaviour. My assumption is that there is a CPU mask difference. However, as mentioned, without knowing the reason for the failed validation, I cannot solve the problem.
Cheers,
Yves
----------------------------------------
[Mar 9, 2017 7:40:18 AM]