Thread Status: Active
Total posts in this thread: 37
dividedbymyself
Cruncher
Joined: Aug 10, 2008
Post Count: 43
Status: Offline
Invalid result after 100% completion

Hi,
I've already returned several results that are reported as invalid in my result status, although they seem to finish without error in BOINC. Some curious things happen before that, though.
First of all, my computer is pretty slow: 1300 MHz, 386 MB. An average HPF2 WU takes about 25 hours to finish. I've got the duration time set to 2 hours.
When a task is almost finished and already shows 100%, it stops and waits for another turn. I read that this is because of a checkpoint near the end. After it starts again, it takes just a few moments to complete, and the result is returned to the server. But when I check the result, it is marked as invalid.

Could the invalid result have anything to do with the way the Wu ends as I described above? If so, is there something I can do to prevent this from happening? Or is my crunch box just too slow? Any suggestions?

I really don't mind about the credits I lose because of this, but losing 25 hours per WU is quite a lot when many of my results add nothing to the project.

Bart
[Jan 22, 2009 8:25:11 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Invalid result after 100% completion

Hello dividedbymyself,
From your discussion of duration, it sounds as though you are running some non-WCG projects as well. The first thing that pops into mind is that you might be running Vista 64. It has been reported to give problems with HPF2: http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=22661

In any event, the first thing to do is to give us more information. If you could post your BOINC Messages, that would give us an idea about your system.

Lawrence
[Jan 22, 2009 8:51:04 PM]
dividedbymyself
Cruncher
Joined: Aug 10, 2008
Post Count: 43
Status: Offline
Invalid result after 100% completion

Hi Lawrence,

I am running several other (non-WCG) BOINC projects as well on the same computer.
The OS is WinXP SP3, on an AMD Athlon (K3 if I remember correctly) 32-bit CPU. Memory is 384 MB btw, not 386 ;)

Need more info?

Bart
----------------------------------------
[Edit 1 times, last edit by dividedbymyself at Jan 22, 2009 9:40:16 PM]
[Jan 22, 2009 9:39:22 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Invalid result after 100% completion

Hi dividedbymyself,
On your Results Status page, if you click on 'Invalid' for a work unit, you will get a Result Log for that work unit. Is there a difference between the logs of valid and invalid results? Also, are you getting any Valid results for WCG projects? If so, is there any pattern, such as all HPF2 jobs failing but FAAH jobs always succeeding? For that matter, when was the last time an HPF2 job was marked Valid? (Not a trick question. I am just trying to find out whether this is an intermittent problem, one that always happens, or one that happens if and only if you switch projects after reaching 100% but before returning the results.)

Lawrence
[Jan 22, 2009 10:58:24 PM]
dividedbymyself
Cruncher
Joined: Aug 10, 2008
Post Count: 43
Status: Offline
Re: Invalid result after 100% completion

I only have two WUs in my history: one from HPF2 and another from TCEP that's running now on my other computer. All the rest have disappeared already.

This is what the last HPF2 result shows:

<core_client_version>6.2.19</core_client_version>
<![CDATA[
<stderr_txt>
called boinc_finish
called boinc_finish

</stderr_txt>
]]>

There seems to be no error here, but maybe you see things differently?

Most HPF2 results were fine in the past, but I do not regularly check the results, so to be honest I can't tell how many were invalid. I don't remember there being many errors when I did check. What I do know is that of the last 3, two were invalid. I think they were all returned this month.
On my other computer (WinXP SP3 on AMD-K6 64) I currently run TCEP, and I remember having had only one failure/invalid result, the first one. The rest were OK, but there is no history of those anymore either, so I can't be 100% sure.

I'm sorry I can't give much more information, but the history page is not very extensive, and because I don't check the result page very often I just don't remember them all.

Bart
[Jan 22, 2009 11:57:12 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Invalid result after 100% completion

Hi dividedbymyself,
I will compare with my HPF2 result logs once the statistics finish updating. I don't remember for sure, but I think that a double call to boinc_finish is unusual. It may turn out to be a problem with switching after 100%.

Lawrence
[Jan 23, 2009 12:26:02 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Invalid result after 100% completion

Just checked. All 3 of my HPF2 results call boinc_finish just once. So I am going to guess that the problem occurred because BOINC switched to a different project after reaching 100%. I remember somebody else who had a problem at the same point, but I cannot remember what project he was running or just what the problem was.
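The comparison above can be done mechanically. The following is only a rough sketch in Python, assuming the Result Log text has been copied from the Results Status page into a string; the function name `count_boinc_finish` is made up for this illustration and is not part of BOINC or WCG.

```python
import re

def count_boinc_finish(result_log: str) -> int:
    """Count 'called boinc_finish' lines inside the <stderr_txt> section.

    Hypothetical helper for illustration; falls back to scanning the whole
    text if no <stderr_txt> section is found.
    """
    match = re.search(r"<stderr_txt>(.*?)</stderr_txt>", result_log, re.DOTALL)
    stderr = match.group(1) if match else result_log
    return sum(
        1 for line in stderr.splitlines()
        if line.strip() == "called boinc_finish"
    )

# The log posted earlier in this thread.
sample_log = """<core_client_version>6.2.19</core_client_version>
<![CDATA[
<stderr_txt>
called boinc_finish
called boinc_finish

</stderr_txt>
]]>"""

calls = count_boinc_finish(sample_log)
print(calls, "unusual" if calls > 1 else "normal")
```

A count above one only marks the log as worth a closer look; it does not by itself prove why the result was marked invalid.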

I don't have a solution. Just from curiosity, have you checked 'Leave in memory' in your profile?

Lawrence
[Jan 23, 2009 1:15:56 AM]
dividedbymyself
Cruncher
Joined: Aug 10, 2008
Post Count: 43
Status: Offline
Re: Invalid result after 100% completion

Just from curiosity, have you checked 'Leave in memory' in your profile?


I don't have that option checked because I think it consumes too much memory: I'm crunching 5 to 8 projects on that computer, depending on the availability of work.

But out of curiosity... Why can't the checkpoint just be skipped at the end of a WU? Finished is finished, I suppose.
I have to assume a lot about the inner workings of the app and I'm probably oversimplifying, but I assume that there's a checkpoint at the end of some sort of loop so that intermediate results are written to file and the app can restart from there the next time it loads. But when it knows the WU is finished, it could also be told to skip the 100% checkpoint, write to file by loop-independent means, and then send the results back to base.
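To make the idea concrete, here is a toy sketch in Python. It is purely an assumption about how such a loop might look, not the actual HPF2 or BOINC code; `run_work_unit`, `total_steps`, and `checkpoint_every` are invented names.

```python
def run_work_unit(total_steps: int, checkpoint_every: int) -> list:
    """Toy work loop that skips the checkpoint on the final step.

    Returns a log of (event, step) tuples instead of writing real files.
    """
    log = []
    for step in range(1, total_steps + 1):
        # ... the actual computation for this step would go here ...
        is_last_step = step == total_steps
        if step % checkpoint_every == 0 and not is_last_step:
            # Intermediate state saved so the app can resume from here.
            log.append(("checkpoint", step))
    # Loop-independent final write: results are written once, at the end.
    log.append(("finish", total_steps))
    return log

print(run_work_unit(6, 2))
```

In this sketch the last step never produces a checkpoint, so there is no pause point between reaching 100% and writing the final result.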

But do you think that leaving the app in memory could resolve the issue? Wouldn't it affect my available memory, as there are going to be at least 5 apps in memory, depending on available work and how often I restart BOINC or reboot?
Btw, I reboot daily, so in the end it could still affect HPF2.

Hmm, lots of questions here... And as far as I can see there's not a fail-safe solution.

Bart
[Jan 23, 2009 10:33:39 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: Invalid result after 100% completion

The project default switch interval is 60 minutes. You're much better served with e.g. 240 minutes. The chance of the scheduler allowing the task to finish and pack up is then much greater. I think the overly eager pre-emptive scheduling is a flaw.

I proposed a test along the lines of: if a task is > 99.5% done, let it complete and pack up for transmission. I doubt it was heard by the developers, though. They have very selective hearing, actually, because this was reported multiple times on their forums, and possibly a Trac ticket has existed for even longer.
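The proposed test could be sketched roughly as follows in Python. This is only an illustration of the rule, not BOINC's scheduler code; `may_preempt` and its parameters are hypothetical names, and the 99.5% threshold and 60-minute default are the figures mentioned in this post.

```python
def may_preempt(fraction_done: float, ran_minutes: float,
                switch_minutes: float = 60.0,
                finish_threshold: float = 0.995) -> bool:
    """Decide whether a running task may be switched out.

    Hypothetical rule: a task that is more than 99.5% done is always
    allowed to finish; otherwise it may be pre-empted once it has used
    up its switch interval.
    """
    if fraction_done > finish_threshold:
        return False  # almost finished: let it complete and pack up
    return ran_minutes >= switch_minutes

print(may_preempt(1.0, 120))   # False: task at 100% is left to finish
print(may_preempt(0.5, 120))   # True: interval used up, far from done
print(may_preempt(0.5, 30))    # False: interval not yet used up
```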
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
[Jan 23, 2009 1:01:09 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Invalid result after 100% completion

Hi dividedbymyself,
In your situation, I would not leave anything in memory either. The correct way to handle this problem is to change BOINC so that it does not switch projects when a task is so near the end. I do not expect this correction to be made in the near future with so much else being done to BOINC. So it is just another extremely low-occurrence bug to put up with.

Lawrence
[Jan 23, 2009 2:45:35 PM]