| World Community Grid Forums
|
|
Thread Status: Active Total posts in this thread: 129
|
|
| Author |
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello jahman,
I advise you to terminate Wcgrid_Rosetta using Task Manager to download a new work unit. Your present HPF2 work unit is running into a bug that acts like an endless loop. We are trying to locate the bug, but until it is fixed, use Dagorath's Definitive Guide to HPF2 Dumping here: http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=7876#64047
Lawrence |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Thank you, sir. |
||
|
|
BobCat13
Senior Cruncher Joined: Oct 29, 2005 Post Count: 295 Status: Offline |
BOINC User 114748, Host ID= 47910
Ubuntu 6.06 LTS
WU za095_00863_11 checkpoints to 76.923%, then runs approximately 16 minutes before CPU usage drops to 0%. I have stopped and restarted the boinc-client 4 times with the same result each time. I have suspended the WU and backed up the directory in case the techs would like to see any of the files, e.g. the stderr.txt file. |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
BobCat13, I think if you abort the unit, it will report automatically, and the result log will tell which segment it got stuck on. You can see the result log once you send it back: look in the Results detail page and click on error. If you don't, another copy might get sent out. But prior to doing that, see how many errors that WU has accumulated already; sending stops after 4 errors have been collected and copies 5/6 have been distributed.
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
teletran
Senior Cruncher Joined: Jul 27, 2005 Post Count: 378 Status: Offline |
I tried running HPF2 again and sent in three results: 1 valid, 1 invalid, 1 inconclusive. |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
Teletran, you should be ....all 3 give points as long as HPF2 is in the bug fix period. What's questionable is where six (6), the max turned in on an 'error only' quorum result, all running dead at the same point in the WU, grants no CPU time or canonical credit. At least I'd expect the CPU time used. As for points recognition, the lowest of quorum.....oh well, when I smell out 4 with 2 open 'in progress' I just hit the abort button. Why would 5 and 6 succeed? A statistical non-event on the many of these already seen. The 2 I saw mentioned above did exactly as per the RickH predictions.
----------------------------------------
WCG
Please help to make the Forums an enjoyable experience for All! |
||
|
|
BobCat13
Senior Cruncher Joined: Oct 29, 2005 Post Count: 295 Status: Offline |
"But prior to doing that, see how many errors that WU has accumulated already....sending stops after 4 errors were collected and copy 5/6 were distributed."
Only 1 error so far on that WU, but there have been 8 No Reply results, as the WU does not finish. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
I've got a workunit that's been running for 198 hrs (downloaded ~July 12) but shows as 0% complete. I've never seen this before, and workunits on this computer are usually completed in less than a day (device #149620). I'm using the UD platform agent 3.0(2844).
What do I do with it? (Thank you, lawrencehardin) [Edit 1 times, last edit by Former Member at Jul 21, 2006 11:05:53 AM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
BOINC User ID 225561, Host ID 41341.
Error #14: za120_00856, returned 07/20/2006 15:34:22, aborted after 2.3 hours of normal checkpoints with exit code 10, Exception code: 0xc0000005, Exception address: 0x00488EB6. 3 other copies still in progress. This is the first error I've gotten in 6 days. Since I was getting about 1 a day before, and the science app hasn't been updated, it looks like recent WUs have learned how to avoid triggering crashes as often. Or I've been really lucky. |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello UpstateLabs,
You have run into a bug in HPF2, so bring up Task Manager and terminate Wcgrid_hpf2_rosetta. This will cause a new work unit to download. Lawrence |
||
|
|
|