World Community Grid Forums
Category: Completed Research Forum: Computing for Sustainable Water Forum Thread: Known Issue with Linux stuck workunits [Resolved] |
Thread Status: Active Total posts in this thread: 49
|
uplinger
Former World Community Grid Tech Joined: May 23, 2005 Post Count: 3952 Status: Offline Project Badges: |
"I think that users should be allowed to make up their own mind about this. They can always opt out of cfsw until the issue is resolved if they don't want to run the risk of having part of their rig running idle. Leaving the users the freedom to choose what they want to do has always been part of the Linux philosophy. Also, stuck work units are nothing new, e.g. Rosetta@home had a similar problem for a long time and I recently had a stuck work unit with Poem@Home as well. There is always some risk that work units go awry, it is foolish to expect that they never do."
pvh513, I agree with you that members should be allowed to make this decision for themselves. But there is more to consider than "give us the choice":
1. Some members may not even know CFSW was released (they are on vacation but were opted into the project because their preferences allowed for it).
2. The BOINC client does not properly report that there is an issue, just that a work unit has been running for a very long time.
3. Creating a mechanism that lets Linux users choose how many CFSW tasks they want would take time and divert energy away from actually solving the issue at hand.
Also, please note that we are not these other BOINC projects; we have our own set of project rules that we try to follow. A major one is to have little impact on a member's machine. This means we strive to have BOINC run on your machine without you knowing it is there. Think of the normal computer user who only uses the machine for email and surfing the internet. They may have installed World Community Grid months ago, but have not checked on it since. We know there is a risk involved with all work units, but we try to keep it to a minimum. Since we know there is a potential issue with multi-core machines running CFSW only, we took an action that minimizes this risk while still allowing members to contribute towards CFSW.
I know that for the badge hunters and people trying to get to the top level fast this puts a roadblock in their plans. Many of my machines (90%) are Linux, so it would slow me down quite a bit. On a side note, it has slowed me down to 0 because all of the devices are testing within alpha for a potential fix. I'm sorry if this is a bit long-winded, but I am just trying to point out the other side of the coin, which in good faith is the best option. Thanks, -Uplinger (edit: fixed some grammar) [Edit 1 times, last edit by uplinger at Apr 21, 2012 3:35:51 AM] |
||
|
Crystal Pellet
Veteran Cruncher Joined: May 21, 2008 Post Count: 1316 Status: Offline Project Badges: |
"You would like to see that... but you also have to view it from my standpoint, I don't want anyone else with Sapphire beta :P ... Thanks, -Uplinger"
Too late, Keith! Am I now banned and if so, where to go? |
||
|
JmBoullier
Former Community Advisor Normandy - France Joined: Jan 26, 2007 Post Count: 3715 Status: Offline Project Badges: |
Keith,
For 3 hours now I have been trying to get this 1 unit per device, without success. As it says, there are only tasks "committed to other platforms". And now there is not even any HCC available to pass the time in the meantime. What is going wrong? |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Windows is empty too, and the validators seem to be shut down or something else.
No word as to why or what. |
||
|
Jason1478963
Senior Cruncher United States Joined: Sep 18, 2005 Post Count: 295 Status: Offline Project Badges: |
79 linux cores on this project that are Idle and waiting for the solution :P
----------------------------------------[Edit 1 times, last edit by Jason1478963 at Apr 21, 2012 10:39:48 PM] |
||
|
Dark Angel
Veteran Cruncher Australia Joined: Nov 11, 2005 Post Count: 721 Status: Offline Project Badges: |
I hope the fix gets beta'd soon as I lose 40 cores on the 9th. I'll still be here after then regardless but it would be nice to level those on the project while I have them.
----------------------------------------Currently being moderated under false pretences |
||
|
sk..
Master Cruncher Joined: Mar 22, 2007 Post Count: 2324 Status: Offline Project Badges: |
!
----------------------------------------[Edit 1 times, last edit by skgiven at Jul 18, 2012 9:12:35 PM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
Follow the instructions of this member post on how to make your 'emergency' profile: http://www.worldcommunitygrid.org/forums/wcg/printpost_post,233436 ++
Maybe you would like to work out the details and propose additional locations at the developers A-list. Till then, the BOINC world only knows Default, Work, School, Home for a client to work across multiple projects.
@Uplinger Had one stuck, caught early in the act, running on Ubuntu 11.10 (1 of 4 CFSW concurrent). CPU Q6600 at stock.
World Community Grid 6.05 cfsw cfsw_0088_00088667_0 04:12:47 (03:44:52) 88,95 64,722 02:18:24 06d,15:36:25 19-4-2012 12:18:07 Running [6] 00:00:52
Did the LAIM off / suspend task trick and, in the course of that, also set Write to Disk ** to 30 seconds, so that interval saves are shortest [in practice they occur about every ~0.5% progress, or under 2 minutes]. When resumed manually, it kept running properly. Will watch whether it validates, since if these go invalid there's no real point in resuming stuck units and forcing them to the end.
** Known under newest clients as "Tasks checkpoint to disk at most every xx seconds"
--//--
edit: ++ WCG allows these private profiles, but does not support them formally, as they conflict with the BOINC framework. Utilize at own risk and expect the unexpected if WCG is not the only active project on a host!
edit: correct link per moonian post. [Edit 3 times, last edit by Former Member at Apr 24, 2012 1:29:24 PM] |
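For anyone who wants to watch for this by script rather than by eye, here is a rough sketch in Python. It takes the two times from the task row in the post above (04:12:47 elapsed, 03:44:52 in brackets, assumed here to be CPU time) and computes their ratio, which comes out close to the 88,95 figure in the same row. The function names are made up for illustration, and the "stalled" rule (progress not moving across a few samples) is only an assumption about how a stuck unit shows up, not something WCG or BOINC defines.

```python
def hms_to_seconds(text: str) -> int:
    """Convert an 'HH:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def cpu_efficiency(elapsed: str, cpu: str) -> float:
    """Return CPU time divided by elapsed time (0.0 if nothing has elapsed)."""
    elapsed_s = hms_to_seconds(elapsed)
    if elapsed_s == 0:
        return 0.0
    return hms_to_seconds(cpu) / elapsed_s


def progress_stalled(samples: list[float], min_gain: float = 0.01) -> bool:
    """Hypothetical rule: the task looks stalled if the progress percentage
    gained less than min_gain between the first and last sample."""
    return len(samples) >= 2 and (samples[-1] - samples[0]) < min_gain


if __name__ == "__main__":
    # Times copied from the task row quoted in the post above.
    ratio = cpu_efficiency("04:12:47", "03:44:52")
    print(f"CPU efficiency: {ratio:.2%}")
    # Progress readings taken a few minutes apart (made-up sample values).
    print("stalled:", progress_stalled([64.722, 64.722, 64.722]))
```

The progress readings would have to be collected by hand or from whatever tool you use to view tasks; the sketch does not talk to the BOINC client itself.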
||
|
Jason1478963
Senior Cruncher United States Joined: Sep 18, 2005 Post Count: 295 Status: Offline Project Badges: |
"3. To create a mechanism to allow for linux users to choose how many they want for cfsw would take time and divert energy away from actually solving the issue at hand. Thanks, -Uplinger"
I think with the memory requirements of this project on both Windows and Linux this may be very useful, like it was for Clean Energy 2. I have a Windows 7 machine running 6 of these and it is making it difficult to use that machine. I would like to see something like this implemented, as it may be hard to keep the right balance of these running even with a mix of projects. It also seems that on a Windows XP machine with 12 cores (dual 6-core Opterons), too many cores starting this project at once may be causing work units to error. Thanks for the help and info on the stuck work units. I'll be interested to see the fix for this, or whether it's an issue with 64-bit Ubuntu. Thanks again, Jason [Edit 1 times, last edit by Jason1478963 at Apr 22, 2012 8:31:19 PM] |
||
|
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline |
"Follow the instructions of this member post how to make your 'emergency' profile: https://secure.worldcommunitygrid.org/forums/wcg/addpost?parent=374422 ++ Maybe you like to work out the details and propose additional locations at the developers A-list. Till then, BOINC world only knows Default, Work, School, Home for a client to work across multiple projects. <snip> --//-- edit: ++ WCG allows these private profiles, but does not support them formally as in conflict with the BOINC framework. Utilize at own risk and expect the unexpected if WCG is not the only active project on a host!"
I think the correct link is: http://www.worldcommunitygrid.org/forums/wcg/...ad,25839_offset,20#233436
P.S. I tried to update the post to replace the img links, but I can't edit the post again (not edited for over 120 days). So, Admin, please help me update that post by replacing the imgs with the following URLs (I don't think the full-res imgs can be directly embedded here now unless another free hosting service supports this):
Step 1.2: http://imageshack.us/photo/my-images/10/54106074.png/
Step 1.3: http://imageshack.us/photo/my-images/406/45652452.png/
Step 2.10: http://imageshack.us/photo/my-images/407/210zi.png/
Step 2.11: http://imageshack.us/photo/my-images/215/211c.png/
[Edit 2 times, last edit by Former Member at Apr 23, 2012 5:45:41 AM] |
||
|
|