World Community Grid Forums
Category: Active Research Forum: Smash Childhood Cancer Thread: SCC WU Length |
Thread Status: Active Total posts in this thread: 25
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Yup, I'm known to have penned a Community FAQ back in 2007 (https://www.worldcommunitygrid.org/forums/wcg...ead,16378_offset,0#128672), long since lost from the collective memory.
|
||
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7581 Status: Offline Project Badges: |
I agree with SekeRob in his notes on suspending and then restarting. I had not really thought about rebooting on an 80 core machine (boy I wish I had one). At one point long ago I had this problem and eventually found out it was caused somehow by some inconsistent communication problems which were solved by upgrading my range extender. When I had the problem I just took the easy way out and rebooted when necessary on an 8 core machine.
----------------------------------------
Cheers
Sgt. Joe
*Minnesota Crunchers* |
||
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 1932 Status: Offline Project Badges: |
So that's what that glare was... all those badges... Well, that's one of the problems you face living in Antarctica... Here, point them all over there, my house plants need some more sun... Though here in SoCal, we have to try and stay dry recently, had the first day with sunshine from sunrise to sunset for the first time in weeks... ;-) Ralf |
||
|
TPCBF
Master Cruncher USA Joined: Jan 2, 2011 Post Count: 1932 Status: Offline Project Badges: |
Unless all SCC jobs display these ever-extending runtimes [and lack of progress on the percentage, and lack of checkpointing], I'd refrain from rebooting 8-16-32-80 core devices, and first explore disabling LAIM (leave application in memory when suspended **), and then suspend the problem job(s) in question for 30 seconds [after first suspending all the 'ready to start' jobs], then restart them and see if crunching resumes with the normal progress of percentage and checkpointing. The problem has been that only one (or maybe two) WUs at a time show this behavior; all others, running at the same time, on the same machine, with the same settings, are working just fine. So that rules out any setup issues... ** New speak: Leave non-GPU task in memory when suspended. Old speak, how is BOINC installed... service/user, what security software could be taking a grip on the science app? Done 700 now on different OSes and never a blip/burp/belch. Ralf |
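For anyone who would rather script the suspend-and-restart routine quoted above than click through the manager, here is a rough sketch. It assumes `boinccmd` is on the PATH and uses its `--task <project_url> <task_name> suspend|resume` operation. The function names, the project URL, and the task names are placeholders made up for illustration; copy the real values from `boinccmd --get_tasks`. Resuming the 'ready to start' jobs at the end is my own addition, and disabling LAIM is still done in the preferences, not here.

```python
import subprocess
import time


def build_commands(project_url, stuck_task, ready_tasks):
    """Build the boinccmd calls, split into 'before the pause' and 'after the pause'."""
    def task_op(name, op):
        return ["boinccmd", "--task", project_url, name, op]

    # Suspend the 'ready to start' jobs first so nothing new starts
    # while the stuck job is suspended.
    before = [task_op(name, "suspend") for name in ready_tasks]
    before.append(task_op(stuck_task, "suspend"))

    # Restart the stuck job, then release the queued jobs again
    # (the release step is an assumption, not part of the quoted advice).
    after = [task_op(stuck_task, "resume")]
    after += [task_op(name, "resume") for name in ready_tasks]
    return before, after


def kick_task(project_url, stuck_task, ready_tasks,
              pause_seconds=30, run=subprocess.run):
    """Suspend the stuck job for pause_seconds, then resume everything."""
    before, after = build_commands(project_url, stuck_task, ready_tasks)
    for cmd in before:
        run(cmd, check=True)
    time.sleep(pause_seconds)
    for cmd in after:
        run(cmd, check=True)
```

Usage would look like `kick_task(url, "name_of_stuck_wu", ["queued_wu_1", "queued_wu_2"])`, followed by a look at whether the percentage and checkpoints start moving again.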
||
|
Yavanius
Senior Cruncher Antarctica Joined: Jan 21, 2015 Post Count: 191 Status: Offline Project Badges: |
Here, point them all over there, my house plants need some more sun... Well, that's one of the problems you face living in Antarctica... Though here in SoCal, we have to try and stay dry recently, had the first day with sunshine from sunrise to sunset for the first time in weeks... ;-) That's only half the year, the other half... now if we could just find some way to keep the sunshine bouncing back and forth. The good thing is no salesfolk and no JW folks (although they mean well). I'm waiting for the day the first Girl Scout shows up. She'd better fill the plane up with cookies though as there's gonna be a serious run on them. And I know there's been a few days of SoCal sun from up to down recently, not just one. ;) ~Y |
||
|
|