Total posts in this thread: 26
This topic has been viewed 2391 times and has 25 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: New to BOINC - Error or Invalid status codes?

Not so, RickH. Due to the potential for inconsistencies, WCG use a different work unit pool for each platform: Windows, Linux and Mac.

So, any Invalid results that you get now for new work units are interesting to the tech team. This project really does seem to be a baptism of fire for Rosetta. As far as I know, never before have so many high resolution proteins been folded. A few issues are inevitable. The quantity we have been experiencing is unfortunate, but it should settle down soon. I hope.
[Jul 8, 2006 8:39:00 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: New to BOINC - Error or Invalid status codes?

RickH, my HT machine is not extreme, and it alternates between FAAH and HPF2, whatever WCG sends. My log shows 29 HPF2 completed, of which 2 were invalid (5.06, from the early days). I have not seen a single one bombing or hanging permanently. They sometimes stick at one percentage spot, but the CPU counter runs uninterrupted until 100%. Sometimes a unit will just sit there for a longer time at the end, probably doing some indexing of the result and prepping it for transmittal while barely using CPU time. In the last few days I even figured out how to optimize for HT, running UD+BOINC simultaneously... just happy as a clam, so I guess I'm fortunate.

On your high-probability observation with the XP platform: I believe I have read that the WUs sent and the results returned for different platforms are not mixed, i.e. Macs go to Macs only, Linux to Linux, Win to Win. I stand to be corrected on that. ***

The coming BOINC version has CPU instruction set recognition, so it will allow even more precise distribution and result matching.
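To illustrate the idea of not mixing platforms, here is a rough sketch of how results could be compared only against others from the same platform. It is an assumption-laden illustration, not how WCG or BOINC actually validates: the record fields (`id`, `wu`, `platform`, `output`), the `quorum` value and the status rule are all hypothetical.

```python
from collections import Counter, defaultdict


def group_by_platform(results):
    """Bucket results by (work unit, platform) so platforms are never mixed."""
    buckets = defaultdict(list)
    for r in results:
        buckets[(r["wu"], r["platform"])].append(r)
    return dict(buckets)


def validate(results, quorum=2):
    """Return {result id: status} using a hypothetical same-platform quorum rule."""
    status = {}
    for group in group_by_platform(results).values():
        counts = Counter(r["output"] for r in group)
        winner, votes = counts.most_common(1)[0]
        for r in group:
            if votes < quorum:
                # Not enough agreeing results from this platform yet.
                status[r["id"]] = "inconclusive"
            elif r["output"] == winner:
                status[r["id"]] = "valid"
            else:
                status[r["id"]] = "invalid"
    return status
```

In this sketch a lone Linux result stays inconclusive even if it happens to match a Windows result, because it is only ever compared within its own platform bucket.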

Errata: As I was writing, I now see Didactylos was faster to hit the send button... at least it's not a factor in the inconclusive mix.
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 2 times, last edit by Sekerob at Jul 8, 2006 9:29:52 PM]
[Jul 8, 2006 8:44:53 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: New to BOINC - Error or Invalid status codes?

So, any Invalid results that you get now for new work units are interesting to the tech team.


Whoa, so I should be reporting every Invalid result that was generated with 5.07 (say, all WUs first issued in July, to be safe)? That's an awful lot of results; about a third of my results end up Invalid, and out of the ones that end up Valid, I see something like half have an Invalid or two from someone else logged.

Instead of my reporting all those, you may as well just watch for my Host ID and check every unit I crunch, since most of them are apparently "interesting" (in the Chinese curse sense).

Blecch. Is this just me, then? I thought everyone was still seeing a lot of Invalids.
[Jul 8, 2006 9:01:58 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: New to BOINC - Error or Invalid status codes?

One of the techs did mention that some issues were caused by problems on the client, not with the work unit. You should be prepared to learn that it's something you have to fix yourself.

Might not be, though. We're not likely to get much further until after the weekend.

And you're right - the techs can easily mine the database for invalid results. What they are most interested in are stalled or abortive work units, and what happened when they died.
[Jul 8, 2006 9:17:31 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: New to BOINC - Error or Invalid status codes?

Let me qualify: I'm seeing invalids from others on WUs I've crunched, but not many, and as said, I have only been marked with 2, which were certainly 5.06 when I crunched them.

Yesterday I got 2 sent that already had 3 done, marked inconclusive. I put them ahead in my queue and had immediate validation, so it looks to me like things are stabilising.

PS: only WCG techs can monitor your Host ID. Meantime, I don't know if you run more machines, but I'd be worried about your machine.

PS: Didactylos, found where you come from... it's not from 'behind' a Stargate dancing

μηνιν άειδε θεά Πηληϊάδεω άχιληος ουλομένην, η μυρί' άχαιοις άλγε'
[Edit 2 times, last edit by Sekerob at Jul 8, 2006 9:46:49 PM]
[Jul 8, 2006 9:25:17 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: New to BOINC - Error or Invalid status codes?

Well, standard defensive "it can't be ME" reaction aside, this machine is rock solid. Prime95 stable for 24+ hours, never got Invalids or Errors from HPF1, FAAH or other DC work.

I would argue that HPF2 should be expected to work on any system which can crunch other BOINC and similar DC projects for weeks or months without errors.

If HPF2 has managed to find some obscure thing that affects only it, on systems that work flawlessly otherwise, then pragmatically speaking, it's still going to end up as HPF2's problem to cope with.

If it turns out that on AMD X2 systems with socket 939 and PC4000 RAM the floating point FUBAR instruction (used only in HPF2) gives results which are wrong in the 15th decimal place, HPF2 will have to find a way to work around it.

Of course, if it turns out that only my system has such an error, then I'll just have to switch back to FAAH or something. There's no way I can find or fix such an obscure thing, when the machine runs perfectly otherwise and I have no other clues.
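As a small illustration of how a result can differ around the 15th decimal place without anything being "broken", the sketch below adds the same three numbers in two different orders. It uses ordinary IEEE 754 double arithmetic in Python and says nothing about what HPF2 or Rosetta actually compute. The `results_agree` helper and its tolerance are hypothetical.

```python
import math

values = [0.1, 0.2, 0.3]

# Same numbers, different grouping: floating-point addition is not associative.
forward = (values[0] + values[1]) + values[2]
backward = values[0] + (values[1] + values[2])


def results_agree(a, b, rel_tol=1e-12):
    """Hypothetical comparison that tolerates last-digit rounding differences."""
    return math.isclose(a, b, rel_tol=rel_tol)


print(forward == backward)               # exact comparison fails
print(results_agree(forward, backward))  # tolerant comparison passes
```

Whether a tolerance like this is appropriate depends on how sensitive the downstream calculation is. In a chaotic simulation a last-digit difference can grow into a visibly different result.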
----------------------------------------
[Edit 1 times, last edit by Former Member at Jul 9, 2006 1:18:02 AM]
[Jul 8, 2006 10:05:48 PM]
olympic
Senior Cruncher
Joined: Jun 12, 2005
Post Count: 156
Status: Offline
Re: New to BOINC - Error or Invalid status codes?

I have 2 dual-core AMD Opteron 939 machines crunching with BOINC and they continued to throw out invalids with Rosetta 5.07. I'm guessing about 1/3 of all results returned turned out invalid. They are both overclocked but passed all the standard stability tests and never had a problem with HPF1, FAAH, etc. So what I have done is switch to FAAH only for a while, until all the inconclusives grind their way through the system. At that point I'll crunch some more HPF2s (maybe 10-20 WUs) and see what happens. Maybe by then the bugs will be found and squashed. ;)
[Jul 9, 2006 7:08:43 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: New to BOINC - Error or Invalid status codes?

Three AMD socket 939 machines is not a great statistical sample, but if WCG have the capability to analyse which processors cause the large fallout, they will. I think WCG knows in reasonable detail what CPU it sends to or receives from. Soon it will optimise further, as it will also be looking at the instruction sets of CPUs (BOINC only?).

My P4 2.53 HT has now done 30-odd HPF2, of which 2 were invalid on 5.06 and the rest valid, all on 5.07x; 1 is pending validation. Not many, but it confirms my platform stability to me, i.e. I got 100% on HPF2 5.07x.

Since my last re-image I boot only when required by software updates. That's now well over 7x24 ago, maybe again come 'critical patch' Tuesday. I used to have a BSOD every 3rd day, due to an intermittent fault at a very high memory address.

PS: FUBAR. I'm not too familiar with American acronyms, but I recollect something that sounded like foobar ;)
[Edit 1 times, last edit by Sekerob at Jul 9, 2006 12:46:23 PM]
[Jul 9, 2006 9:24:25 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: New to BOINC - Error or Invalid status codes?

Well done, Sekerob.

It is of course the first few words of the Iliad (lacking a couple of pothooks). The reference is to our beloved leader, J. D. "Illiad" Frazer - creator of UF.
[Jul 9, 2006 9:57:41 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: New to BOINC - Error or Invalid status codes?

I got one of those errors recently and the result was just returned to WCG as follows:

Device Name: BlackBart00   Team ID: Boinc 10296   Acct Nr: 205644

Project  Applic     Name           Report deadline  Status
WCG      hpf2 5.07  za110_00549_0  7/15/2006 10:    Computation Error

7/9/2006 4:59:13AM Unrecoverable error for result za110_00549_0 (The environment is incorrect. (0xa)-exit code 10 (0xa))

hope this helps
PaulT
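For anyone collecting these, a message line like the one above can be picked apart with a short script. This is only a sketch: the regular expression is fitted to this one line as pasted, and the `parse_error_line` name and the assumption that other BOINC messages follow the same layout are mine.

```python
import re

LINE = (
    "7/9/2006 4:59:13AM Unrecoverable error for result za110_00549_0 "
    "(The environment is incorrect. (0xa)-exit code 10 (0xa))"
)

# Pattern fitted to the single sample line above; other messages may differ.
PATTERN = re.compile(
    r"Unrecoverable error for result (?P<result>\S+) "
    r"\((?P<message>.*?)\s*\(0x(?P<hex>[0-9a-fA-F]+)\)\s*-\s*"
    r"exit code (?P<code>-?\d+)"
)


def parse_error_line(line):
    """Return result name, message and exit code, or None if the line does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return {
        "result": m.group("result"),
        "message": m.group("message"),
        "exit_code": int(m.group("code")),
        "hex_code": int(m.group("hex"), 16),
    }


parsed = parse_error_line(LINE)
print(parsed)
```

The hex value and the decimal exit code are parsed separately so they can be cross-checked against each other.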
[Jul 9, 2006 12:39:47 PM]