Thread Status: Active
Total posts in this thread: 51
This topic has been viewed 274746 times and has 50 replies.
Hypernova
Master Cruncher
Audaces Fortuna Juvat ! Vaud - Switzerland
Joined: Dec 16, 2008
Post Count: 1908
Re: Not Funny

I just finished crunching HPFP2, and yes, it is frustrating, but I accept that it is difficult for the techs to test across every combination of PC platform, component hardware, and software that exists around the globe. Only Microsoft has the capability to do that.
Of my ten devices, only one crunches without errors, apart from one here and there out of a hundred WUs. The other devices have error rates of 40% to 60% or more. All my CPUs are Intel quad-cores, all mainboards are Asus X58, all memory is Patriot DDR3, and every OS is Win7. So what the f@#*.!"?.
It is true that all the errors happen between one and two minutes into crunching. Over my whole solar system that is (looking at the results page) 2580 errors; at an average of 1.5 minutes each, that is 3870 minutes, or about 64.5 hours of runtime, or nearly 3 days lost.
But again, there are many systems in WCG that are less error-prone, and those should be crunching HPFP2. wink
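
A quick check of the lost-runtime arithmetic above, as a minimal sketch. The variable names are illustrative only, and the 1.5-minute average is the poster's own estimate rather than a measured value.

```python
# Figures quoted in the post above; the names are illustrative.
error_count = 2580        # errored results across all devices
minutes_per_error = 1.5   # poster's estimated average runtime before failure

lost_minutes = error_count * minutes_per_error
lost_hours = lost_minutes / 60
lost_days = lost_hours / 24

print(f"{lost_minutes:.0f} minutes = {lost_hours:.1f} hours = {lost_days:.1f} days")
```

Under these figures the total comes to a little under three days of runtime, which matches the estimate in the post.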
----------------------------------------

[Mar 14, 2010 10:47:50 AM]
pramo
Veteran Cruncher
USA
Joined: Dec 14, 2005
Post Count: 704
confused Re: Not Funny

I've been fortunate?

For HPF, I can see 589 results that are valid, pending validation (pv), or in progress, and one error. The box with the single error has 90 valid and pv results, so who knows what happened there. confused

Across the board, that's the only error out of the last 930 results turned in (running DDD2, HCMD, HCC and HPF).

I'm running everything from a Windows 2000 900 MHz Athlon desktop to XP to a Windows Server 2003 Xeon at 3.0 GHz.
----------------------------------------

[Mar 14, 2010 3:35:14 PM]
Hypernova
Master Cruncher
Audaces Fortuna Juvat ! Vaud - Switzerland
Joined: Dec 16, 2008
Post Count: 1908
Re: Not Funny

pramo, you're the perfect cruncher for HPFP2. applause Don't try to understand why. It is like talent. It's a gift from nature. You have got the HPFP2 crunching talent. wink

So please, crunch crunch crunch and never stop. biggrin
----------------------------------------

[Mar 15, 2010 6:59:44 AM]
kateiacy
Veteran Cruncher
USA
Joined: Jan 23, 2010
Post Count: 1027
shock Re: Not Funny

I just stumbled on this thread today -- decided to check the HPF2 forum as I've been receiving mostly HPF2 WUs the last few days. I guess I've been lucky too -- no errors so far on my first 18 or so WUs.

I'm running two machines -- one Core 2 Duo under Ubuntu 9.10, and an old single-processor Windows XP machine. Neither one seems to choke on HPF2. The WUs do take forever to validate with the required quorum of 15, but I guess that's needed if there are so many problems with WUs.
----------------------------------------

[Mar 21, 2010 2:27:30 AM]
rilian
Veteran Cruncher
Ukraine - we rule!
Joined: Jun 17, 2007
Post Count: 1452
Re: Not Funny

Result: mp784_ 00047_ 9-- | Device: hostname | Status: Error | Sent: 4/2/10 03:58:52 | Returned: 4/5/10 23:11:30 | CPU time: 81.78 hours | Credit claimed / granted: 663.9 / 0.0

Yet another wasted 80 hours.


<core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
Maximum elapsed time exceeded
</message>
<stderr_txt>


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x77B70004

Engaging BOINC Windows Runtime Debugger...


After the SAME ERRORS A YEAR AGO, this past month was my second attempt to keep crunching this project. Now I'm off.

uplinger, if you put together a test WU app with debugging enabled, I can run it on this host as many times as you need. By the way, you could also release these WUs in Beta, distributing a few to each OS type and OS version (knreed recently posted some info saying this is possible).
----------------------------------------
[Edit 4 times, last edit by rilian at Apr 6, 2010 1:47:44 AM]
[Apr 6, 2010 1:35:14 AM]
RT
Master Cruncher
USA - Texas - DFW
Joined: Dec 22, 2004
Post Count: 2636
angry Re: Not Funny

Truths about this project:
1) These errors are years old with no resolution.
2) None of the other projects have anywhere near the same number of errors.


I, like many others, will not run it any more. It is too much of a waste of my resources. I suspect that many people have been soured by the project, so it is now run mostly by people who do not have the affected configurations (e.g., quads on Windows) or by newbies who suffer under the delusion that this project will support them in their effort to actually make a contribution.

My limited perspective suggests to me that this project comes in second in 'Projects That Discourage Distributed Computing'. Second only to the United Devices mess of years past.

It seems that the project masters are just trying to run it to the end without putting any more effort into it. Hopefully that end is not too far into the future and will not send very many more would-be contributors to other non-WCG projects or out of the distributed computing world altogether. sad
----------------------------------------
One of your friends in Texas cowboy
RT Website Hosting

[Apr 6, 2010 3:55:17 PM]
Randzo
Senior Cruncher
Slovakia
Joined: Jan 10, 2008
Post Count: 339
Re: Not Funny

These are really specific problems, so do not blame the techs or scientists.
If you look at the result status page, you can sometimes see one error in 19 (the HPF quorum). So these are not bad WUs but really specific problems, which are hard to detect and solve.
[Apr 6, 2010 5:10:53 PM]
rilian
Veteran Cruncher
Ukraine - we rule!
Joined: Jun 17, 2007
Post Count: 1452
Status: Offline
Project Badges:
Reply to this Post  Reply with Quote 
Re: Not Funny

Randzo wrote:
These are really specific problems, so do not blame the techs or scientists.
If you look at the result status page, you can sometimes see one error in 19 (the HPF quorum). So these are not bad WUs but really specific problems, which are hard to detect and solve.

That specific problem has:
1) a known WU name
2) a known host config and OS (the current error was on Microsoft Windows Server 2008 Web Server x64 Edition, Service Pack 1, (06.00.6001.00))
3) a known BOINC version

One 80-hour waste per 19 results is a biggie.
----------------------------------------
[Edit 2 times, last edit by rilian at Apr 6, 2010 7:39:13 PM]
[Apr 6, 2010 7:34:43 PM]
Randzo
Senior Cruncher
Slovakia
Joined: Jan 10, 2008
Post Count: 339
Re: Not Funny

It's not so easy, either.
There are many more circumstances to consider (AV settings, BOINC installation type, OS patch version, and so on).
And why did one HPF2 WU run for more than 80 hours?
[Apr 6, 2010 8:40:54 PM]
gb009761
Master Cruncher
Scotland
Joined: Apr 6, 2005
Post Count: 2979
Re: Not Funny

rilian, although I can certainly understand your concerns at losing over 80 hrs of crunching time to a WU that eventually errored out, I'd have been querying this long-running WU after 12. Especially as you tried HPF2 last year and apparently had issues with it running on your machine(s) back then.

Okay, you may not have been able to be physically next to the machine during all of this time, but under the circumstances, I'd have run a test HPF2 WU at a time when you were available to monitor it.

As I'm sure you're aware, the only WUs currently on WCG that go over 24 hrs are the DDDT2 "A" types, and thus I CERTAINLY would have queried this WU at that point. Perhaps this WU was one of those that, very occasionally, gets stuck in some sort of loop and, after a suspend/restart, completes normally. It certainly falls outside the normal HPF2 failures, which fail within a matter of seconds of starting.

Thus, as expressed above, although I am sorry to hear that you've lost so much crunching time, not all of your anger can be directed at WCG.
----------------------------------------

[Apr 6, 2010 9:04:47 PM]