Thread Status: Active
Total posts in this thread: 23
This topic has been viewed 4527 times and has 22 replies
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Invalid results

The issue is possibly that a restart writes a few extra 'resume' bits. The validation process looks at, e.g., a hash number to ensure that all copies, each representing 1/10, make up the total as if a single result had been received. If a copy's hash deviates from the majority, that copy is resubmitted until it matches. If too many of the original set deviated, the whole set is resubmitted for a 2nd opinion (looking at the or and times of a larger distribution set).
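
To illustrate the idea (purely a sketch: I have no view of the actual result format or of which hash the validator uses), a couple of extra bytes are enough to change a digest, unless the comparison is limited to the science data. The `RESUME` marker, the field names and the byte layout below are invented for the example.

```python
import hashlib

def digest(data: bytes) -> str:
    # SHA-256 is an assumption; any hash shows the same effect.
    return hashlib.sha256(data).hexdigest()

def critical_part(data: bytes) -> bytes:
    # Hypothetical: drop everything from the invented resume marker onwards.
    return data.split(b";RESUME", 1)[0]

science = b"temp=291.4;pressure=1013.2"   # made-up science payload
clean_run = science
restarted_run = science + b";RESUME=1"    # same science, extra 'resume' bits

# Hashing the whole result: the restarted copy deviates from the clean one.
whole_file_match = digest(clean_run) == digest(restarted_run)

# Hashing only the critical part: both copies agree again.
critical_match = (
    digest(critical_part(clean_run)) == digest(critical_part(restarted_run))
)

print(whole_file_match, critical_match)
```

If the real validator hashed the whole file, a restart would flip the hash even though the science is identical, which would fit the pattern people are reporting.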

It's not that simple, though: even clean runs, per the result log, have turned 'invalid' on a few done on my P4HT. Since I run as a service, I never got to see the graphics, so the reported observation of the line graph is certainly interesting.

Maybe knreed or the other techs could look at the invalids and see if the non-critical bits are causing that anomaly?

cheers
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Mar 3, 2008 10:26:51 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Invalid results

With so many Invalid results occurring with ACH why don't the techs increase the number of initial replications from 10?

At least, as a temporary workaround for faster turnaround.
[Mar 4, 2008 4:26:04 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Invalid results

Hello bijoalex,
I am sure the techs will do whatever they think is needed. But I have recently heard that the grad students will be working on their doctoral dissertations for this project after we return cycle 27, the final two-week period of the year. So I expect that AfricanClimate@Home will go on hiatus for an extended period until the next set of grad students takes up the research project and refines the program in light of the initial results. So the time factor is probably not very important.

Lawrence
[Mar 4, 2008 7:23:51 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Invalid results

That's an interesting insight. I thought the time factor was important, since one cycle of ACH WUs can be sent out only after the previous ones are back from processing.

Moreover, if the grad students are waiting for the results to start their doctoral dissertations, wouldn't they want the results to be available as early as possible?
[Mar 4, 2008 2:33:22 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Invalid results

Maybe knreed or the other techs could look at the invalids and see if the non-critical bits are causing that anomaly?


Seems that something changed in the validation process.
My above-mentioned WU ach1_23_48_9-- has got a valid result now. It was on "inconclusive" after the first 10 results came back (+2 for errors) and another set of 10 WUs was sent out.
At the moment we are at 27 (!!!) replications, but with an astonishing 23 valid results (2 errors, 2 still in progress).

Greetings

Thorsten
[Mar 4, 2008 7:15:12 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Invalid results

For what it's worth, I had one host BSoD last night -- it was crunching something else at the time but an ACH wu was waiting for its turn. After the post-crash cleanup I resumed BOINC and could soon see the bump in the graph. Figured the result would be screwed up, but it did validate. I have an earlier result from the same box that is still inconclusive, and has a bump due to a software-installation reboot.

So while resuming from checkpoint might frequently cause a validation problem, it appears not to do so 100% of the time. I guess that fits with some currently speculated causes.

[Edit: XP Pro, CC 5.10.30]
I'm curious about the "bump" -- it certainly does seem to correlate with work unit restarts. Why might this happen?
----------------------------------------
[Edit 1 times, last edit by Former Member at Mar 4, 2008 7:27:58 PM]
[Mar 4, 2008 7:24:27 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Invalid results

The picture I have is that once the first 10 are back without a raw error, a test is applied and, e.g., if 8 agree on the control hash, 2 will be sent out for recomputation. If too many do not agree, i.e. the set is inconclusive, the whole set is eventually resubmitted in one go. Indeed, after those 10 return, it might turn out that the original majority was in fact valid all along.
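
A rough sketch of that flow as I picture it, in Python. This is not WCG's actual validator: the function name `review_set`, the `min_agree=8` threshold (taken from the "e.g. if 8 agree" above) and the hash strings are all made up for illustration.

```python
from collections import Counter

def review_set(hashes, min_agree=8):
    """Decide what to do with a set of returned copies.

    hashes: control hashes of the copies that came back without a raw error.
    Returns (action, indices_to_resend). All names here are hypothetical.
    """
    if not hashes:
        return ("wait", [])
    majority_hash, votes = Counter(hashes).most_common(1)[0]
    if votes == len(hashes):
        # Every copy agrees on the control hash.
        return ("valid", [])
    if votes >= min_agree:
        # Enough agreement: only the deviating copies go out again.
        deviants = [i for i, h in enumerate(hashes) if h != majority_hash]
        return ("resend_deviants", deviants)
    # Too many disagree: the set is inconclusive, resubmit everything.
    return ("resend_all", list(range(len(hashes))))

print(review_set(["a"] * 8 + ["b", "c"]))  # 8 of 10 agree
print(review_set(["a"] * 5 + ["b"] * 5))   # split set
```

With 8 of 10 agreeing, only copies 8 and 9 are sent out again; with a 5/5 split the whole set goes back out, and the second batch may well show that one of the original groups was right all along.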

Anyway, the admin advised that the next phase will be done differently. The world of bandwidth may look upgraded again by the end of this year or 2009, so the whole split might go differently, similar to CPDN where e.g. large units are sent out and trickles are returned.... Nothing has been worked out yet on how to approach the validation and distribution methodology for the next step.
[Mar 4, 2008 7:28:45 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Invalid results

Seems that something changed in the validation process.
My above-mentioned WU ach1_23_48_9-- has got a valid result now. It was on "inconclusive" after the first 10 results came back (+2 for errors) and another set of 10 WUs was sent out.
At the moment we are at 27 (!!!) replications, but with an astonishing 23 valid results (2 errors, 2 still in progress).

Maybe this is one of the effects of the new BOINC server code and/or misconfigured validators (see knreed's post here).
Especially this:
We apologize for this bug and we are installing corrected validators now.

Cheers
Thorsten
[Mar 4, 2008 8:55:44 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Invalid results

Not really. I have seen this a number of times in the past, where in the end a majority gets validated. As said, the future of AC@H is going to see a different way of distribution.
[Mar 4, 2008 9:06:17 PM]
retsof
Former Community Advisor
USA
Joined: Jul 31, 2005
Post Count: 6824
Status: Offline
Re: Invalid results

I hope this helps to find the reason for such behaviour. But for now I would recommend avoiding any restarts of ACH workunits. Otherwise there is a high risk of producing an invalid result (at least on my machine).
I have had only one invalid, but I might have turned off the machine. I have tried to be careful to check for an AC@H after that and just let it run. Another thread mentions spontaneous restarts.
----------------------------------------
SUPPORT ADVISOR
Work+GPU i7 8700 12threads
School i7 4770 8threads
Default+GPU Ryzen 7 3700X 16threads
Ryzen 7 3800X 16 threads
Ryzen 9 3900X 24threads
Home i7 3540M 4threads50%
[Mar 11, 2008 3:28:21 AM]