World Community Grid Forums
Thread Status: Active Total posts in this thread: 51
Hypernova
Master Cruncher Audaces Fortuna Juvat ! Vaud - Switzerland Joined: Dec 16, 2008 Post Count: 1908 Status: Offline
I just finished crunching HPFP2 and yes, it is frustrating, but I consider that it is difficult for the techs to really check across all possible PC platforms and hardware and software combinations existing around the globe. Only Microsoft has the capability to do that.
Of my ten devices, only one crunches without generating errors, apart from one here and there out of a hundred WUs. The other devices have error rates between 40% and 60% or more. All my CPUs are Intel quad-cores, all mainboards are Asus X58, all memory is Patriot DDR3 and every OS is Win7. So what the f@#*.!"? It is true that all errors happen between one and two minutes into crunching. Over my whole solar system that is (looking at the results page) 2580 errors; at an average of 1.5 minutes each, that is 3870 minutes, or about 64 hours of runtime, or nearly 3 days lost. But again, there are many systems in WCG that are less error prone, and those should be crunching HPFP2.
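As a rough check of the lost-time estimate above, here is a minimal sketch. The variable names are illustrative, and the inputs are simply the figures quoted in the post (the 1.5-minute average is the poster's own estimate).

```python
# Figures quoted in the post above; the names are illustrative.
error_count = 2580        # errored results across all devices
minutes_per_error = 1.5   # assumed average runtime before each failure

lost_minutes = error_count * minutes_per_error
lost_hours = lost_minutes / 60
lost_days = lost_hours / 24

print(f"{lost_minutes:.0f} minutes = {lost_hours:.1f} hours = {lost_days:.2f} days")
```

That comes to roughly 64.5 hours, or a little under 3 days.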
||
|
pramo
Veteran Cruncher USA Joined: Dec 14, 2005 Post Count: 704 Status: Offline
I've been fortunate?
For HPF, I can see 589 results that are valid, PV or in progress, and one error. The box with the single error has 90 valid and PV results, so who knows what happened there. Across the board, that's the only error out of the last 930 results turned in (running DDD2, HCMD, HCC and HPF), on everything from a Windows 2000 900 MHz Athlon desktop to XP to a Windows Server 2003 Xeon 3.0 GHz.
||
|
Hypernova
Master Cruncher Audaces Fortuna Juvat ! Vaud - Switzerland Joined: Dec 16, 2008 Post Count: 1908 Status: Offline
pramo, you're the perfect cruncher for HPFP2.
So please, crunch crunch crunch and never stop.
||
|
kateiacy
Veteran Cruncher USA Joined: Jan 23, 2010 Post Count: 1027 Status: Offline
I just stumbled on this thread today -- decided to check the HPF2 forum as I've been receiving mostly HPF2 WUs the last few days. I guess I've been lucky too -- no errors so far on my first 18 or so WUs.
I'm running two machines -- one Core 2 Duo under Ubuntu 9.10, and an old single-processor Windows XP machine. Neither one seems to choke on HPF2. The WUs do take forever to validate with the required quorum of 15, but I guess that's needed if there are so many problems with WUs.
||
|
rilian
Veteran Cruncher Ukraine - we rule! Joined: Jun 17, 2007 Post Count: 1452 Status: Offline
mp784_ 00047_ 9-- hostname Error 4/2/10 03:58:52 4/5/10 23:11:30 81.78 663.9 / 0.0
Yet another wasted 80 hours.
<core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
Maximum elapsed time exceeded
</message>
<stderr_txt>
Unhandled Exception Detected...
- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x77B70004
Engaging BOINC Windows Runtime Debugger...
After the SAME ERRORS A YEAR AGO, this (last month) was my second attempt to continue crunching this project. Now I'm off.
uplinger, if you'll make up a test WU app with debug on, I can run it on this host as many times as you need. Btw, you can also release these WUs in Beta, distributing a few per OS type and OS version (knreed recently posted some info that this is possible).
----------------------------------------
[Edit 4 times, last edit by rilian at Apr 6, 2010 1:47:44 AM]
||
|
RT
Master Cruncher USA - Texas - DFW Joined: Dec 22, 2004 Post Count: 2636 Status: Offline
Truths about this project:
1) These errors are years old with no resolution.
2) None of the other projects have anywhere near the same number of errors.
I, like many others, will not run it any more. It is too much of a waste of my resources. I suspect that many people have been soured by the project, so it is now run mostly by people who do not have the affected configurations (e.g. quads on Windows) or by newbies who suffer under the delusion that this project will support them in their effort to actually make a contribution.
My limited perspective suggests to me that this project comes in second in 'Projects That Discourage Distributed Computing', second only to the United Devices mess of years past. It seems that the project masters are just trying to run it to the end without putting any more effort into it. Hopefully that end is not too far into the future and will not send very many more would-be contributors to other non-WCG projects or out of the distributed computing world altogether.
||
|
Randzo
Senior Cruncher Slovakia Joined: Jan 10, 2008 Post Count: 339 Status: Offline
These are really specific problems, so do not blame the techs or scientists.
If you look at the result status page, you can sometimes see one error in 19 (the HPF quorum). So these are not bad WUs but really specific problems, which are hard to detect and solve.
||
|
rilian
Veteran Cruncher Ukraine - we rule! Joined: Jun 17, 2007 Post Count: 1452 Status: Offline
"These are really specific problems, so do not blame the techs or scientists. If you look at the result status page, you can sometimes see one error in 19 (the HPF quorum). So these are not bad WUs but really specific problems, which are hard to detect and solve."
That specific problem has:
1) a known WU name
2) a known host config and OS (the current error was on Microsoft Windows Server 2008 Web Server x64 Edition, Service Pack 1, (06.00.6001.00))
3) a known BOINC version
One 80-hour waste per 19 results is a biggie.
----------------------------------------
[Edit 2 times, last edit by rilian at Apr 6, 2010 7:39:13 PM]
||
|
Randzo
Senior Cruncher Slovakia Joined: Jan 10, 2008 Post Count: 339 Status: Offline
Not so easy either.
There are many more circumstances (AV settings, BOINC installation type, OS patch version and so on). And why did one HPF2 WU run for more than 80 hours?
||
|
gb009761
Master Cruncher Scotland Joined: Apr 6, 2005 Post Count: 2979 Status: Offline
rilian, although I can certainly understand your concerns at losing over 80 hrs of crunching time to a WU that eventually errored out, I'd have been querying this long-running WU after 12 hrs, especially as you tried HPF2 last year and apparently had issues with it running on your machine(s) back then.
Okay, you may not have been able to be physically next to the machine during all of this time, although under the circumstances, I'd have run a test HPF2 WU when you were available to monitor it. As I'm sure you're aware, the only project currently on WCG with WUs that go over 24 hrs is DDDT2 (the "A" types), and thus I CERTAINLY would have queried this WU at that point. Perhaps this WU was one of those that, very occasionally, gets stuck in some sort of loop and, after a suspend/restart, completes normally. It certainly falls outside of the normal HPF2 failures, where WUs fail within a matter of seconds of starting. Thus, as expressed above, I am sorry to hear that you've lost so much crunching time, but not all of your anger can be directed at WCG.
||