Total posts in this thread: 146
This topic has been viewed 7291 times and has 145 replies
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
An OSX bug

Howdy,

It seems that the rest of the OSX users on my squad and I have a bit of a problem, and we're hoping someone here might be able to provide a solution.

The problem is that one or more of the running tasks stalls with a frozen timer/progress and a running status. It will stay this way indefinitely until we intervene by rebooting, closing/restarting BOINC, or using the trusty abort. The timing is seemingly random: a stall can occur within minutes or take as long as a day, regardless of which BOINC version is employed (5.4.9, 5.4.11, 5.8.11, 5.10.x).

I first noticed this bug in February/March or so with FAAH. It might be important to note that FCG DID NOT have this problem (and I crunched it exclusively as a result of this), but it is now apparent that both FAAH and HPF2 DO and it persists.

No error msgs are generated.

Occurs on both Intel and PPC Macs running 10.4.x with memory equal to 1G/core.

Any help you all might be able to provide would most definitely assist in lowering my BP, reducing my dependency on antacids, and otherwise assure my longevity for many years of future crunching. ;)

Thanks.

-SS
[Jul 21, 2007 5:06:14 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: An OSX bug

We did have a report of this earlier, but we were unable to pin down the exact cause of the problem. As I remember, it only affected FAAH according to that report. Since a new version of the FAAH code is about to be released, I advised waiting for that to see if the problem existed in the new code.

This is the first I've heard of this particular behaviour with HPF2, although there is a similar (very rare) bug that sometimes stalls HPF2 on any platform (Windows, definitely).

If/when this happens again, please will you try suspending and resuming computation, instead of the more brute-force methods you have tried successfully?

I will bring this to the attention of the techs, and perhaps they will be able to do an extended beta test of the new FAAH code on the Mac platform (normal beta tests only involve a few hundred work units, and very rare problems aren't always picked up on).

The big difference between GC on one side and HPF2/FAAH on the other is that GC uses a comparison algorithm (SSEARCH) - it is doing memory manipulation and integer arithmetic. HPF2 and FAAH both do extraordinary amounts of floating point mathematics. Read this: http://www.parashift.com/c++-faq-lite/newbie.html#faq-29.18 for a particularly scary example of the pitfalls awaiting the WCG techs.
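As a minimal illustration of the kind of floating-point pitfall meant here (this is not the content of the linked FAQ entry, and not a diagnosis of the stall itself), the sketch below shows how a loop that waits for an exact floating-point value can fail to stop. The function names, the 0.1 step, and the `max_steps` guard are made up for the example.

```python
import math


def count_steps_exact(step=0.1, target=1.0, max_steps=1000):
    """Add `step` repeatedly and stop when the total equals `target` exactly.

    Returns the number of steps taken, or None if the guard trips.
    Without the `max_steps` guard this loop would spin forever.
    """
    total = 0.0
    for n in range(1, max_steps + 1):
        total += step
        if total == target:  # exact comparison: the pitfall
            return n
    return None


def count_steps_tolerant(step=0.1, target=1.0, max_steps=1000, rel_tol=1e-9):
    """Same loop, but compare within a relative tolerance instead."""
    total = 0.0
    for n in range(1, max_steps + 1):
        total += step
        if math.isclose(total, target, rel_tol=rel_tol):
            return n
    return None


if __name__ == "__main__":
    # 0.1 has no exact binary representation, so ten additions
    # land just below 1.0 and the exact test never succeeds.
    print(count_steps_exact())     # None
    print(count_steps_tolerant())  # 10
```

Real science applications are far more complicated than this, so treat it only as a picture of why heavy floating-point code can behave differently from integer code.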
[Jul 21, 2007 7:04:11 AM]
Diana G.
Master Cruncher
Joined: Apr 6, 2005
Post Count: 3003
Status: Offline
Re: An OSX bug

The big difference between GC on one side and HPF2/FAAH on the other is that GC uses a comparison algorithm (SSEARCH) - it is doing memory manipulation and integer arithmetic. HPF2 and FAAH both do extraordinary amounts of floating point mathematics. Read this: http://www.parashift.com/c++-faq-lite/newbie.html#faq-29.18 for a particularly scary example of the pitfalls awaiting the WCG techs.


Didactylos, the link is not connecting.

.
----------------------------------------

[Jul 21, 2007 10:42:24 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: An OSX bug

Works for me in Firefox 2.0. But just in case IE or some other browser is screwing it up, here's a different way of making a link in BBCode: http://www.parashift.com/c++-faq-lite/newbie.html#faq-29.18

edit: the site may be having DNS problems. Try again later.
----------------------------------------
[Edit 1 times, last edit by Former Member at Jul 21, 2007 10:57:44 AM]
[Jul 21, 2007 10:55:53 AM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: An OSX bug

Brink,

Thanks for keeping us wayward CAs (me) on the straight and narrow... moved content to the other thread.

cheers
----------------------------------------
WCG Global & Research > Make Proposal Help: Start Here!
Please help to make the Forums an enjoyable experience for All!
----------------------------------------
[Edit 1 times, last edit by Sekerob at Jul 21, 2007 5:50:28 PM]
[Jul 21, 2007 1:37:21 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: An OSX bug

Brink,

I looked yesterday and saw 4 add-ins from the Redmondians in my Firefox plug-in folder.

unicows.dll (which seems rather new and love the name ;>)
npwmsdrm.dll
npdrmv2.dll (also very new)
npdsplay.dll

I expect them to work for Netscape. I did have to permit scripting for your site, as it gave a DLL error earlier when it tried loading the WM player.

Hey Sek, isn't this regarding this thread?
http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=15380
I'm sure it's just a copy-and-paste error. D'oh!

As to the OSX problem: I have this same problem. Suspending and resuming the computation does not help. You have to shut BOINC down and restart it to get it working again. I am trying one thing: under my processor preferences I unchecked "Allow Nap". I've noticed it gets checked without my doing anything, and I wonder if that might be the problem.
Edit: To get the processor prefs you need to have the developer tools loaded. I think the tools are called CHUD or something like that. It's on the install DVD.
----------------------------------------
[Edit 1 times, last edit by Former Member at Jul 21, 2007 5:34:09 PM]
[Jul 21, 2007 3:23:05 PM]
Sekerob
Ace Cruncher
Joined: Jul 24, 2005
Post Count: 20043
Status: Offline
Re: An OSX bug

Brink,

Thanks for catching me on that one... I use the "It's All Text" add-in, and somehow it linked it to a second thread I was writing something for. The add-in allows the use of any word processor. Normally, hitting save pastes the text back into the thread post box... not this time.

Regarding the Suspend/Resume: the proper way is *not* suspending the job (I've seen this repeated a few times). The finesse is to suspend the *project* altogether and wait 30 seconds before resuming. olympic wrote this tip a while ago, and it has never failed me in the 3 times I have had this.

I only know of HPF2 suffering the infinite looping at full use of idle CPU time... the FAAH issue is new to me too.
[Jul 21, 2007 6:21:17 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: An OSX bug

We've tried suspending/resuming the task, the project, and BOINC from the top menu, to no avail. The only methods that appear to work are the aforementioned brute-force ones.

TBH, waiting 30 seconds doesn't make things much better from the user side. I run 16 dedicated Macs of various flavors, and it's easier to just restart the herd than to go to each machine to see if it's stalled.
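One way to avoid walking to each machine would be a small watchdog that flags a task reported as running whose progress has not moved for a while, which matches the symptom described at the top of the thread. The sketch below covers only that decision logic. How the samples get collected from each BOINC client is not shown, and the `Sample` type, the `is_stalled` function, and the 30-minute threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """One observation of a task (hypothetical structure)."""
    timestamp: float      # seconds since an arbitrary epoch
    fraction_done: float  # progress in the range 0.0 to 1.0
    running: bool         # True if the client reports the task as running


def is_stalled(samples: List[Sample], min_stall_seconds: float = 1800.0) -> bool:
    """Return True if the task has been running with unchanged progress
    for at least `min_stall_seconds`, judged from samples in time order."""
    if not samples:
        return False
    latest = samples[-1]
    if not latest.running:
        return False  # a suspended task is not counted as stalled
    # Walk backwards to find when the current frozen stretch began.
    stall_start = latest.timestamp
    for sample in reversed(samples):
        if not sample.running or sample.fraction_done != latest.fraction_done:
            break
        stall_start = sample.timestamp
    return latest.timestamp - stall_start >= min_stall_seconds


if __name__ == "__main__":
    frozen = [Sample(t, 0.42, True) for t in (0, 600, 1200, 1800)]
    moving = [Sample(t, t / 10000, True) for t in (0, 600, 1200, 1800)]
    print(is_stalled(frozen))  # True
    print(is_stalled(moving))  # False
```

The threshold would need tuning, since some work units legitimately report progress in coarse steps.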

Not to be combative, but based on my own experience and on the small number of people on my team who run OSX (all of whom report the problem), I wouldn't characterize the problem as "rare". It is present in both of the remaining projects (FAAH and HPF2), though these might be different problems with the same symptoms. The strange part is that it rarely, if ever, affects more than 50% of the cores available; often it's only one. Some also have the impression that HPF2 has this problem "to a lesser degree" than FAAH, though I personally can't verify this. It's still a royal PITA with 16 machines. ;)

If I can be of any assistance in hunting this thing down, please LMK here. In the meantime, I suppose my only option is to bootcamp Windows on these things and get them back to work, as things are quite untenable as they currently stand. <sigh> More licensing revenue for MSFT... :(

-SS

P.S. This CHUD is on the OSX install DVD?
[Jul 21, 2007 7:05:11 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: An OSX bug

P.S. This CHUD is on the OSX install DVD?

Yes, on the first DVD look in: Xcode Tools -->packages-->chud. Once it is installed, do a software update. You should then have a "Processor" icon in your System Preferences.
I've been told you can download it off the Apple web site... if you can find it.
I will post it here if I can find the URL.
edit: Found it... http://developer.apple.com/tools/download/
----------------------------------------
[Edit 1 times, last edit by Former Member at Jul 21, 2007 9:21:22 PM]
[Jul 21, 2007 9:16:28 PM]
Diana G.
Master Cruncher
Joined: Apr 6, 2005
Post Count: 3003
Status: Offline
Re: An OSX bug

Diana
Didactylos
The big difference between GC on one side and HPF2/FAAH on the other is that GC uses a comparison algorithm (SSEARCH) - it is doing memory manipulation and integer arithmetic. HPF2 and FAAH both do extraordinary amounts of floating point mathematics. Read this: http://www.parashift.com/c++-faq-lite/newbie.html#faq-29.18 for a particularly scary example of the pitfalls awaiting the WCG techs.


Didactylos, the link is not connecting.

.


Kudos to all Techies everywhere who deal with this stuff!
----------------------------------------

[Jul 22, 2007 12:35:06 AM]