Thread Status: Active
Total posts in this thread: 6
Dataman
Ace Cruncher
Joined: Nov 16, 2004
Post Count: 4865
Stuck GFAM Work Units

I have two WUs that have stopped making any CPU progress. They look very similar to the stuck WUs I sometimes get in the Lesh. project. Both are still running, but the CPU time is not advancing.

DFAM_1df7_TBdhfrDry_0000246_0655_1
Running almost 17 hours with CPU stuck at 06:00:20

DFAM_1df7_TBdhfrDry_0000231_0014_1
Running almost 4 hours with CPU stuck at 00:29:50

I tried suspending/resuming the task, stopping/restarting BOINC, and rebooting the machine, all with no effect. I will look at them again in about an hour; if there is no progress I will blow them away.

Win7, i7 970 processor, BOINC 6.12.34, GPUs are idle.
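
For anyone who wants to check this less by eye: "stuck" here means the elapsed time keeps growing while the CPU time does not. Below is a minimal Python sketch of that check, assuming you note the elapsed and CPU times from the BOINC Manager task list by hand at two or more moments. The function names, the 30-minute window and the 5-second tolerance are my own assumptions, not anything BOINC provides.

```python
from datetime import timedelta


def parse_hms(text):
    """Convert an 'HH:MM:SS' string, as shown in the task list, to a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def looks_stuck(samples,
                min_window=timedelta(minutes=30),
                tolerance=timedelta(seconds=5)):
    """Return True if CPU time barely moved over at least `min_window` of elapsed time.

    samples: list of (elapsed, cpu_time) 'HH:MM:SS' string pairs, oldest first.
    min_window and tolerance are arbitrary illustrative thresholds.
    """
    if len(samples) < 2:
        return False  # a single reading cannot show a stall
    first_elapsed, first_cpu = (parse_hms(t) for t in samples[0])
    last_elapsed, last_cpu = (parse_hms(t) for t in samples[-1])
    window = last_elapsed - first_elapsed
    cpu_gain = last_cpu - first_cpu
    return window >= min_window and cpu_gain <= tolerance


# Hypothetical readings: the elapsed values are made up, the CPU value is the one reported above.
readings = [("16:00:00", "06:00:20"), ("16:55:00", "06:00:20")]
print(looks_stuck(readings))
```

A task whose CPU time advances roughly in step with elapsed time returns False, so a slow but healthy work unit is not flagged.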

----------------------------------------


[Nov 15, 2011 3:05:02 PM]
gb009761
Master Cruncher
Scotland
Joined: Apr 6, 2005
Post Count: 3007
Re: Stuck GFAM Work Units

Dataman, before you 'blow them away', it might be worthwhile taking a copy of all the data within the particular slots and letting the techs have a copy. You never know, it may help them in some way...
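
A minimal sketch of how such a copy could be scripted in Python, in case it saves someone some clicking. It assumes the usual BOINC layout of numbered folders under `slots` in the data directory and a client log named `stdoutdae.txt`; the function name, the example path and the slot number are placeholders, so check them against your own install. Suspending the task first is probably wise, since files held open by a running task may not copy cleanly on Windows.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_slot(data_dir, slot, dest_root):
    """Copy one BOINC slot folder (and the client log, if present) to a timestamped folder.

    data_dir:  BOINC data directory (assumed layout: <data_dir>/slots/<n>)
    slot:      slot number of the stuck task
    dest_root: folder in which the backup folder is created
    """
    data_dir = Path(data_dir)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    dest = Path(dest_root) / f"slot{slot}_{stamp}"
    # copytree creates the destination folder, including missing parents
    shutil.copytree(data_dir / "slots" / str(slot), dest / "slot")
    log = data_dir / "stdoutdae.txt"  # assumed name of the client message log
    if log.exists():
        shutil.copy2(log, dest / log.name)
    return dest


# Example call (hypothetical path and slot number, adjust to your machine):
# backup_slot(r"C:\ProgramData\BOINC", 3, r"C:\stuck_wu_backups")
```

The returned path is the folder to zip up and send to the techs.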
----------------------------------------

[Nov 15, 2011 3:36:02 PM]
armstrdj
Former World Community Grid Tech
Joined: Oct 21, 2004
Post Count: 695
Re: Stuck GFAM Work Units

Dataman,

Are you familiar with the Process Explorer application, and do you have it? If so, can you launch it the next time you get a stuck vina workunit and look at something? In Process Explorer, find the process listing for vina; it should have a name like wcg_gfam_vina_6.08_windows_intel86. Right-click on it and select Properties from the pop-up menu. In the Properties window that opens there is a Threads tab. Click on it and there should be 1 or 2 threads listed. Highlight each and look at the state value below. What state do the threads show?

Also, do you run CEP2 on a machine that has stuck workunits for DSFL or GFAM, and if so, does CEP2 ever have stuck workunits?

Thanks,
armstrdj
[Nov 16, 2011 11:25:15 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Re: Stuck GFAM Work Units

Who's going to draft a best-practices "Stuck WU" actions FAQ (Windows-oriented, as that's the platform that suffers these events)?

Include items such as:

A) Info needed for debug:

- Capturing the result info in the stuck state (as armstrdj asks for with PE).
- References to useful help topics on the internet, such as these for PE:
  Ten Best Practices / Troubleshooting Tips with Process Explorer Tool
  Process Explorer, Part 2 (On Thread Analysis)
- Link to the Process Explorer tool, of course.
- Slot information and message log copy.
- Debug flags to set.

B) Actions / workarounds / tips:

- Stop the client; switch LAIM (Leave Applications In Memory) on/off to force a memory unload when the troubled task is suspended.
- Set the BOINC throttle to 100% and use TThrottle instead if the throttling is motivated by temperature control (its *only* design purpose).
- Use of "While the processor usage is greater than XX suspend computing" (works a dream on me Linux box btw to prevent heartbeat issues on the heavy duty CEP2).
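
To make the three settings above concrete, here is a minimal Python sketch that builds the text of a local preferences override file. The tag names (`leave_apps_in_memory`, `cpu_usage_limit`, `suspend_cpu_usage`) and the `global_prefs_override.xml` file name are as I understand them from the BOINC client documentation, so verify them against your client version before relying on this. The function name is made up, and 50 is only a placeholder for XX.

```python
import xml.etree.ElementTree as ET


def build_prefs_override(leave_in_memory, cpu_usage_limit, suspend_cpu_usage):
    """Return the text of a preferences override file with the three settings.

    leave_in_memory:   False unloads suspended tasks from memory (LAIM off)
    cpu_usage_limit:   BOINC throttle in percent (100 = no throttling)
    suspend_cpu_usage: suspend when non-BOINC CPU usage exceeds this percent
    """
    root = ET.Element("global_preferences")
    ET.SubElement(root, "leave_apps_in_memory").text = "1" if leave_in_memory else "0"
    ET.SubElement(root, "cpu_usage_limit").text = f"{cpu_usage_limit:.1f}"
    ET.SubElement(root, "suspend_cpu_usage").text = f"{suspend_cpu_usage:.1f}"
    return ET.tostring(root, encoding="unicode")


# LAIM off, throttle at 100%, suspend above 50% other CPU usage (50 is a placeholder for XX)
xml_text = build_prefs_override(False, 100, 50)
print(xml_text)
```

The same settings can be made from the BOINC Manager preferences dialog; the sketch only shows which knobs the bullets refer to.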

Open a new forum thread, polish live in iterations or stick it in WCG WIKI so it can be collaboratively tweaked. Then we'll include it in the Start Here FAQ's and you'll be added to the FAQ Hall of Fame (full credit with honors).

Who's Game? Don't be shy and share your skills to further the cause!

--//--
[Nov 17, 2011 10:57:22 AM]
pcwr
Ace Cruncher
England
Joined: Sep 17, 2005
Post Count: 10903
Re: Stuck GFAM Work Units

One of mine got stuck at 95%; after a laptop reboot, it completed.

Patrick
----------------------------------------

[Nov 17, 2011 11:16:20 AM]
Dataman
Ace Cruncher
Joined: Nov 16, 2004
Post Count: 4865
Re: Stuck GFAM Work Units

Hi armstrdj
Normally I would post everything, but we are currently in an RV ~2,000 miles from where my servers are. I have been managing them remotely using some software and the help of my house sitters. I only have a connection when we stop somewhere with WiFi that lets me connect to my CA network.
I can say I am running only GFAM and have had only one recurrence. I had a similar problem with the DSFL Beta. I don't run CEP2 and have not run DSFL for a week or more.
I hope someone else can post the information you need. I cannot get into the files you need.
Beautiful morning in San Antonio ... cheers!
----------------------------------------


[Nov 17, 2011 2:46:17 PM]