Thread Status: Active
Total posts in this thread: 117
This topic has been viewed 497598 times and has 116 replies
andgra
Senior Cruncher
Sweden
Joined: Mar 15, 2014
Post Count: 195
Status: Offline
Re: significant credit drop - only for me or did someone else see this?

I must say I'm disappointed with the Linux implementation of MIP.
I'm pulling my Linux cores out of MIP to work on other projects. The Windows cores are OK here and can stay.
Hopefully we will get some response from the techs or scientists on the progress of this issue.
----------------------------------------
/andgra



[Jan 8, 2018 6:51:30 AM]
KerSamson
Master Cruncher
Switzerland
Joined: Jan 29, 2007
Post Count: 1684
Status: Offline

Even on Windows, MIP1 does not run optimally.
I've noticed strange behaviour on a Win7 Pro SP1 x64 host that looks very much like a memory leak in the MIP1 implementation. In other words, even after the host stops computing MIP1 WUs, a large amount of RAM is not properly released, even several days after the last MIP1 WU has been computed. After a host reboot, everything runs well again. I managed to reproduce this observation twice last November.
Since nobody on the tech and scientist side seems to pay attention to members' remarks and observations about MIP1, I did not feel the need to report this issue until now.
Cheers,
Yves
----------------------------------------
[Jan 8, 2018 2:57:51 PM]
RTS48
Veteran Cruncher
Bolivia
Joined: Aug 2, 2009
Post Count: 1353
Status: Offline

Well - the saga continues. I am having to micromanage my WUs because 100% MIP is a disaster (50% is pretty bad too). I have 20 threads in 3 Macs. My oldest Mac, which has only 4 threads, is running SCC exclusively as it seizes up with MIP. The other two quad-core (8-thread) Macs are set to run a maximum of 4 threads each of MIP when I make WUs available through device manager. I am still getting less than 20 points per hour of CPU for MIP compared with nearer 25 points/h for SCC. I think that when I get my MIP Sapphire I will abandon the project entirely. Big shame!
----------------------------------------
Rod Peel
Santa Cruz
Bolivia
South America

[Feb 28, 2018 1:24:50 PM]
JimWork
Cruncher
Canada
Joined: Oct 11, 2005
Post Count: 35
Status: Offline

I agree - this is one stingy stinker of a project. I'll stick with it until I get to 5yrs. I'm at 4:266:00 now and hope to cross the finish line in about ten days.
[Feb 28, 2018 11:38:35 PM]
KerSamson
Master Cruncher
Switzerland
Joined: Jan 29, 2007
Post Count: 1684
Status: Offline

It is comforting to notice that I am not alone with my observations.
However, why has nobody on the scientist side taken these issues into account ... and provided some feedback?
Cheers,
Yves
----------------------------------------
[Mar 1, 2018 7:56:32 AM]
andgra
Senior Cruncher
Sweden
Joined: Mar 15, 2014
Post Count: 195
Status: Offline

I agree with you, KerSamson!
----------------------------------------
/andgra



[Mar 2, 2018 10:54:12 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline

Sorry it's taken us a while to respond. Please don't leave, we appreciate the time ALL of you have invested in the project. It has taken us a while to reproduce and then really pin down what causes the problem.

The short version is that Rosetta, the program being used by the MIP to fold the proteins on all of your computers*, is pretty hungry when it comes to cache. A single instance of the program fits well into a small cache. However, when you begin to run multiple instances there is more contention for that cache. This results in L3 cache misses, and the CPU sits idle while we make a long trip to main memory to get the data we need. This behavior is common for programs that have larger memory requirements. It's also not something that we as developers often notice; we typically run on large clusters and use hundreds to thousands of cores in parallel on machines. Nothing seemed slower for us because we are always running in that regime.

I don't know all of the details about how the points are assigned, and I don't know if/how the credit assignment will be modified. But I believe that issue stems from the fact that a single instance of Rosetta is well behaved (very few cache misses) on most consumer chips, but on machines with smaller caches and few memory channels a second (or third or fourth) instance cannot fit into the caches, and you see the run time scaling issues which result in fewer points/hour (i.e. if a single instance of Rosetta had these cache issues, the scaling from one to multiple instances would not be as dramatic, nor would the change in points/hour).**
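
To make the scaling argument above a bit more concrete, here is a rough toy model in Python. It is only a sketch under invented assumptions: the working-set size, the cache size, the miss penalty and the function names are all hypothetical, nothing here was measured on Rosetta, and real caches behave in far more complicated ways. It only illustrates why speed can fall off once the combined working sets of several instances no longer fit in a shared cache.

```python
def per_instance_speed(instances, working_set_mb, cache_mb, miss_penalty=3.0):
    """Toy estimate of one instance's speed, relative to a fully cache-resident run.

    All parameters are hypothetical illustration values, not Rosetta measurements.
    miss_penalty is how many times slower an access is when it misses the cache.
    """
    demand_mb = instances * working_set_mb
    if demand_mb <= cache_mb:
        # Everything fits in the shared cache: no slowdown in this model.
        return 1.0
    # Assume the share of accesses that miss grows with the overflow.
    miss_fraction = 1.0 - cache_mb / demand_mb
    # Average cost per access rises from 1 to 1 + miss_fraction * (penalty - 1).
    return 1.0 / (1.0 + miss_fraction * (miss_penalty - 1.0))


def total_throughput(instances, working_set_mb, cache_mb, miss_penalty=3.0):
    """Combined throughput of all instances, in units of one ideal instance."""
    return instances * per_instance_speed(
        instances, working_set_mb, cache_mb, miss_penalty
    )


if __name__ == "__main__":
    # Hypothetical machine: 8 MB shared cache, 4 MB working set per instance.
    for n in (1, 2, 4, 8):
        speed = per_instance_speed(n, working_set_mb=4, cache_mb=8)
        total = total_throughput(n, working_set_mb=4, cache_mb=8)
        print(f"{n} instance(s): per-instance {speed:.2f}, total {total:.2f}")
```

In this model a machine whose cache is already too small for one instance starts out slow and loses less, proportionally, as instances are added, which is the situation the second footnote describes.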

We are looking to see if we can improve the cache behavior. Rosetta is ~2 million lines of C++ and improving the cache performance might involve changing some pretty fundamental parts. We have some ideas of where to start digging, but I can't make any promises.

Long term, identifying these issues may end up improving Rosetta for everyone that uses it so pat yourselves on the back for that!

Doug

* a newer version of the program used in HPF1 & 2
** this might be the case for machines with very small (less than 4MB) caches, where it's just always slow
[Mar 7, 2018 10:08:00 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline

Doug,

First off, I just see you flagged as "Cruncher", whereas I suspect you're actually a "Project Scientist" or similar, no? Maybe one of the WCG techs can get you properly identified within the forum system?

Second, a big pat on the back from me for (a) looking into the problem and (b) taking the trouble to post your findings. The fact that you're even considering changing the code is wonderful news! Also, knowing more accurately what the problem is may help people to work out what a reasonable number of parallel tasks is for their kit in the meantime.

Thank you!
[Mar 7, 2018 10:37:03 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline

Thanks for the update. Over the next 3 weeks I will be finishing up some current work and then will be moving about 280 threads to MIP. The plan is to stay there for at least 90 days. I'm not contributing for the points, so the credit drop doesn't bother me in the least. Good luck with the optimization activities.
[Mar 7, 2018 11:01:36 PM]
Dayle Diamond
Senior Cruncher
Joined: Jan 31, 2013
Post Count: 452
Status: Offline

It's not really about a credit drop, Doneske, it's an efficiency drop.
The default mix of projects works out fine for MIP, but when you specialize all cores, your total throughput drops.

Every four hours your core spends doing MIP might be two hours for another cruncher.
[Mar 8, 2018 2:23:44 AM]