| World Community Grid Forums
Thread Status: Active Total posts in this thread: 29
bieberj
Senior Cruncher United States Joined: Dec 2, 2004 Post Count: 406 Status: Offline Project Badges:
|
Thanks for the updates - I see that it eliminated that pesky 29 / 1d bug. Great job!
|
||
|
|
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline Project Badges:
|
Copy _3 of the WU mentioned in my post above has completed with a clean log. However, copy _2 seems to demonstrate a flaw in the system.
It has run for too little time (0.63h) and its log shows "No heartbeat from core client for 30 sec - exiting". Why is it a candidate for validation? Would it previously have quit with Exit code 29? Do we have a problem, Houston?

E000510_276A_003d6y014_3 | Pending Validation | 23/04/09 00:37:51 | 23/04/09 13:51:47 | 7.80 | 120.4 / 0.0 | Mine, clean log, CEP v6.31
E000510_276A_003d6y014_2 | Pending Validation | 22/04/09 17:16:52 | 23/04/09 13:56:16 | 0.63 | 11.9 / 0.0 | Suspicious candidate for validation
E000510_276A_003d6y014_1 | Error | 20/04/09 05:34:26 | 23/04/09 00:33:22 | 1.13 | 13.1 / 0.0 | Exit code 29
E000510_276A_003d6y014_0 | Error | 20/04/09 05:34:22 | 22/04/09 17:15:54 | 1.58 | 33.5 / 0.0 | Exit code 29

[Update]: Copies _2 and _3 were next declared Inconclusive and copy _4 issued.

E000510_276A_003d6y014_4 | Valid | 23/04/09 16:27:27 | 23/04/09 21:38:52 | 0.49 | 8.7 / 8.7 | Clean log. Validated against _2. Copy _3 ruled Invalid.

[Edit 1 times, last edit by Rickjb at Apr 24, 2009 12:27:56 AM] |
||
|
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline Project Badges:
|
Rickjb,
This was a good test case and actually uncovered a flaw in our validation. Your result might be valid. The result is likely invalid. We have cleaned up how we handle this case, and things have progressed to the next step, where the results are marked INCONCLUSIVE and an additional copy is sent out to a reliable computer so that the system can determine the correct result. I would expect this additional result to be returned in the next 24-36 hours. Please post what you see happen.

thanks, Kevin |
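The INCONCLUSIVE step described above can be illustrated with a minimal sketch of a quorum-style check. This is not World Community Grid's actual validator: the function name `validate`, the `quorum` parameter, and the idea of comparing an opaque "fingerprint" of each result are assumptions made purely for illustration. Results that errored out carry no fingerprint, results stay inconclusive until enough copies agree, and once a quorum agrees the non-matching copies are marked invalid.

```python
from collections import defaultdict

def validate(results, quorum=2):
    """Hypothetical quorum check.

    results maps a result name to an output fingerprint (any hashable
    value), or to None if that copy ended in an error.
    """
    # Group the successfully returned copies by their fingerprint.
    groups = defaultdict(list)
    for name, fingerprint in results.items():
        if fingerprint is not None:
            groups[fingerprint].append(name)

    # The largest group of agreeing copies (empty if nothing was returned).
    winners = max(groups.values(), key=len, default=[])
    agreed = len(winners) >= quorum

    status = {}
    for name, fingerprint in results.items():
        if fingerprint is None:
            status[name] = "Error"
        elif not agreed:
            status[name] = "Inconclusive"  # another copy would be sent out
        elif name in winners:
            status[name] = "Valid"
        else:
            status[name] = "Invalid"
    return status
```

With two errored copies and two returned copies that disagree, both returned copies come back as "Inconclusive"; adding a further copy that matches one of them makes that pair "Valid" and the odd one out "Invalid".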
||
|
|
softstag
Cruncher Joined: Feb 26, 2009 Post Count: 16 Status: Offline Project Badges:
|
I don't know if it's just me, but since the upgrade, I'm seeing smaller volumes of page faults. They are still there, but about half what I was getting!
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"I don't know if it's just me, but since the upgrade, I'm seeing smaller volumes of page faults. They are still there, but about half what I was getting!"

I'm seeing the same thing. I have two tasks currently running on one of my dual cores; each process's page fault delta is below 2000 (about 1700 on average), whereas it used to be in the high 6000s. |
||
|
|
Rickjb
Veteran Cruncher Australia Joined: Sep 17, 2006 Post Count: 666 Status: Offline Project Badges:
|
@knreed: Please note the update to my post above re. suspicious candidate for validation.
Thanks for your interest in the events. - Rick - |
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
To sum it up:

_4 and _2 matched; both were short and thought suspicious, but were declared valid (probably version 6.31)
_0 and _1 were short but had Exit Code 29, ending in the error state (surely version 6.30)
_3, the only one that ran longer, did not match and was thus declared invalid (version 6.31)

Results that logged loss of heartbeat are not necessarily bad. It merely signals that the core client and the science application were not able to communicate for 30 seconds or longer. On older client versions the core client would kill the science application, ending in an error. The cause is often an overzealous firewall/AV, mostly also when the BOINC Manager was loaded.
WCG
Please help to make the Forums an enjoyable experience for All! |
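The heartbeat timeout described in the post above can be sketched roughly as follows. This is an illustration only, not BOINC's implementation: the class `HeartbeatMonitor` and its methods are hypothetical names, and the 30-second figure is taken from the log message quoted earlier in this thread. The idea is simply that one side records the time of the last heartbeat it received and treats the link as lost once that timestamp is 30 seconds old or more.

```python
import time

# Timeout in seconds, taken from the "No heartbeat from core client
# for 30 sec" log message quoted in this thread.
HEARTBEAT_TIMEOUT = 30.0

class HeartbeatMonitor:
    """Hypothetical helper that tracks when the last heartbeat arrived."""

    def __init__(self, timeout=HEARTBEAT_TIMEOUT, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock          # injectable so the logic can be tested
        self.last_beat = clock()    # treat start-up as the first heartbeat

    def beat(self):
        """Record that a heartbeat was just received."""
        self.last_beat = self.clock()

    def expired(self):
        """True once no heartbeat has been seen for `timeout` seconds or longer."""
        return self.clock() - self.last_beat >= self.timeout
```

A science application built this way would call `beat()` whenever it hears from the client and check `expired()` periodically; what it does on expiry (exit cleanly, as with the result discussed here, or be killed with an error) is a separate policy decision.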
||
|
|
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline |
"I don't know if it's just me, but since the upgrade, I'm seeing smaller volumes of page faults. They are still there, but about half what I was getting!"

You don't say which Type (A/B), but a new look in Process Explorer at Type A data, after 3 hours of CPU time, showed a mean of about 1.6k PF delta, as opposed to 6/7/8k. The other main measure is the kernel time, and that showed a mere 47 seconds. Having stopped micromanaging, I also noted from the Result Status pages that FAAH/DDDT/HFCC seemed to be much less affected and are getting credit closer to claim, some above, some below.

Curiously, for these E000526's the peak RAM after 3:20 hours was just 24 MB. VM 321 MB.

The forces work in silence and hardly celebrate the achievements, they really don't, averse to disappointing someone who is the exception and jumps up to report "but still for me".

[Edit 1 times, last edit by Sekerob at Apr 24, 2009 10:23:13 AM] |
||
|
|
teletran
Senior Cruncher Joined: Jul 27, 2005 Post Count: 378 Status: Offline |
Just posting back to say that I've had no errors and all results have been valid since the software update.
[Edit 1 times, last edit by teletran at Apr 24, 2009 12:38:22 PM] |
||
|
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline Project Badges:
|
"@knreed: Please note the update to my post above re. suspicious candidate for validation. Thanks for your interest in the events. - Rick -"

We are taking a deeper look at what is going on. Something caused the other two results to exit early while yours was able to finish. We will let you know what we find. Thanks for following up. |
||
|
|
|