| World Community Grid Forums
Thread Status: Active Total posts in this thread: 16
|
novakjara
Cruncher Joined: Oct 18, 2008 Post Count: 4 Status: Offline Project Badges:
|
These WUs got stuck at about 3-4%:
faah36560_ZINC09819018_xh2_xtal_01_1
faah36622_ZINC58246698_xh2_xtal_02 |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Try Suspend, then Resume. If that does not work, then try a reboot.
Lawrence |
||
|
|
novakjara
Cruncher Joined: Oct 18, 2008 Post Count: 4 Status: Offline Project Badges:
|
Well, I already aborted these. Other WUs seem to be running just fine. Let's wait and see what happens on resend. If it happens again, I'll try your advice. Thx.
|
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Before doing the "Suspend" procedure to unstick tasks that are not making progress [see the Start Here FAQ index for an expanded explanation], the LAIM function [Leave Application in Memory, when suspended] has to be switched off! Otherwise, the task will *not* unload from memory.
FAAH getting stuck is very rarely reported. If anything gets stuck, it's HPF2. A permanent cure was never found and the problem was not reproducible in the labs, but a long time ago I did see a possible correlation between HPF2 failing and FAAH/HFCC running simultaneously. That was when HPF2 still showed the 401 error. I have never seen it with the ten times less frequent 711 error. |
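As a rough sketch of the Suspend/Resume step from the command line: `boinccmd` accepts `--task <project_url> <task_name> <op>` with `suspend` and `resume` as operations. The helper names, the `PROJECT_URL` and `TASK_NAME` placeholders, and the pause length below are assumptions for illustration only. The sketch does not switch LAIM off; that still has to be done in the computing preferences first, otherwise the task stays in memory.

```python
import subprocess
import time


def task_command(project_url, task_name, op):
    """Build the boinccmd argument list for one operation on one task."""
    if op not in ("suspend", "resume"):
        raise ValueError("op must be 'suspend' or 'resume'")
    return ["boinccmd", "--task", project_url, task_name, op]


def suspend_then_resume(project_url, task_name, run=False, pause_seconds=30):
    """Return the suspend and resume commands; execute them only if run=True.

    pause_seconds is an assumed waiting time so the client can unload the
    task before it is resumed (LAIM must already be off for that to happen).
    """
    commands = [task_command(project_url, task_name, op)
                for op in ("suspend", "resume")]
    if run:
        subprocess.run(commands[0], check=True)
        time.sleep(pause_seconds)
        subprocess.run(commands[1], check=True)
    return commands


# Dry run: PROJECT_URL and TASK_NAME are placeholders, not real values.
for cmd in suspend_then_resume("PROJECT_URL", "TASK_NAME"):
    print(" ".join(cmd))
```

With `run=False` nothing is executed, so the commands can be checked before being pointed at a real client.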
||
|
|
novakjara
Cruncher Joined: Oct 18, 2008 Post Count: 4 Status: Offline Project Badges:
|
Not rare for me :-(. Most of my latest FAAH tasks ended this way. The latest one, faah36669_ZINC19902838_xh2_xtal_03_1, is currently jumping between 4.7% and 5.0%. Neither suspending nor rebooting helps; after a couple of minutes it's the same again. No way will I babysit every task. I'll try reattaching to the project after all my running WUs finish. If that doesn't help - bye.
EDIT: As I wrote this, it went up to 6.3% and is still climbing. Is this roughly 10-minute period normal? I guess it needs patience, and I shouldn't watch the progress bar too closely. We'll see.
[Edit 1 times, last edit by novakjara at Nov 22, 2012 8:43:09 AM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
The oscillation near the conclusion of a successful docking point is normal [it is searching for the lowest-energy docking position]. Open the FAAH graphics screen to watch the C-Energy, as displayed in this sample screenshot for the old and new [BOINC] agent: http://i137.photobucket.com/albums/q210/Sekerob/FAAHBestEnergygraphic.png . On a fast computer it will be quick; on a slow computer it can take longer... watching it takes forever, so please allow it time ;>) (That said, if it oscillates a hundred times, then it could indeed be stuck, and a lot more system/client info is needed.)
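The "a hundred oscillations" rule of thumb above can be sketched as a small check over progress readings noted down by hand. This is only an illustration: the function names are made up, the readings are assumed to be percentages copied from the progress bar, and nothing here reads data from the BOINC client.

```python
def count_reversals(readings):
    """Count how often the progress changes direction (up->down or down->up)."""
    reversals = 0
    last_direction = 0  # 0 = no movement seen yet, 1 = rising, -1 = falling
    for previous, current in zip(readings, readings[1:]):
        if current == previous:
            continue  # no movement, direction unchanged
        direction = 1 if current > previous else -1
        if last_direction != 0 and direction != last_direction:
            reversals += 1
        last_direction = direction
    return reversals


def looks_stuck(readings, max_reversals=100):
    """Apply the rule of thumb: about a hundred oscillations suggests a stuck task."""
    return count_reversals(readings) >= max_reversals


# The values reported earlier in this thread: a few swings, then progress.
sample = [4.7, 5.0, 4.7, 5.0, 6.3]
print(count_reversals(sample), looks_stuck(sample))
```

A handful of reversals followed by rising progress, as in the sample, stays far below the threshold, which matches the advice to simply give the task time.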
|
||
|
|
novakjara
Cruncher Joined: Oct 18, 2008 Post Count: 4 Status: Offline Project Badges:
|
Thank you, SekeRob, for a good answer as always. I guess I'll exercise a little more patience. Good job.
|
||
|
|
LAZA74
Advanced Cruncher Germany Joined: Sep 28, 2008 Post Count: 56 Status: Offline Project Badges:
|
Same problem here (but posted in the wrong thread):
https://secure.worldcommunitygrid.org/forums/wcg/viewpostinthread?post=414071
Stop/Start and a reboot didn't help, so I aborted these two. Edit: As I now remember, I didn't reboot my machine; it crashed while crunching, and I think this could be the problem with the two WUs (save point?).
----------------------------------------
NAS - self-built
Xiaomi Mi 10T
[Edit 2 times, last edit by LAZA74 at Mar 1, 2013 4:27:05 PM] |
||
|
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Hello LAZA74 and novakjara,
With my AMD 1090T processor, my experience with the 'oscillation' SekeRob described in a FAAH WU is that the WU ends up in a computer-error status (after some stuck loops) if the load-line calibration parameter in the BIOS is set to 'Auto'. Setting that parameter to 'Enabled' (which increases the core voltage and/or the 'stiffness' of that voltage to the 1090T CPU) made the error go away. In other words, try increasing the core voltage. The load-line calibration parameter in the BIOS should be able to do that; otherwise you may have to explore increasing the core voltage manually.
andzgridPost#976
[Edit 1 times, last edit by Former Member at Mar 1, 2013 8:11:09 PM] |
||
|
|
adriverhoef
Master Cruncher The Netherlands Joined: Apr 3, 2009 Post Count: 2346 Status: Offline Project Badges:
|
Well, I had a FAAH WU stuck at more than 97%, restarting every 8s. I guess it had been running for 20 or 30 hours before I noticed the problem with that task. 450 restarts an hour, that's at least 9000 restarts.
It's faah43166_ZINC14190771_xPR_wC6_11_1ref9_02_0 using faah version 715. I tried suspending (without leaving the application in memory) and resuming, and I tried rebooting; unfortunately, neither helped. After waiting another day and then 2 more hours of running in vain, I decided to abort that FAAH WU.
$ rpm -qa boinc* |
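The restart estimate in this post can be checked with a few lines of arithmetic. The function names are only for illustration; the 8-second interval and the 20 to 30 hours are the figures quoted above.

```python
SECONDS_PER_HOUR = 3600


def restarts_per_hour(interval_seconds):
    """Number of restarts in one hour if the task restarts every interval_seconds."""
    return SECONDS_PER_HOUR // interval_seconds


def total_restarts(interval_seconds, hours):
    """Total restarts over the given number of hours."""
    return restarts_per_hour(interval_seconds) * hours


print(restarts_per_hour(8))   # 450
print(total_restarts(8, 20))  # 9000
print(total_restarts(8, 30))  # 13500
```

So "at least 9000" corresponds to the 20-hour lower bound; at 30 hours the count would be 13500.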
||
|
|
|