Thread Status: Active | Total posts in this thread: 15
Jim1348
Veteran Cruncher | USA | Joined: Jul 13, 2009 | Post Count: 1066 | Status: Offline
Quote:
"The first thing I saw was that two ARP1 WUs were 'waiting to run'. Then I realised that five others were running. That meant that SEVEN lots of memory had been allocated for these WUs -- not leaving much for the other things I'd like to run from time to time."

You can eliminate the extra memory allotted for those "waiting to run" tasks by de-selecting "Leave non-GPU tasks in memory". It is no big deal for performance; you just have to load an application from the disk again when it is needed.
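For reference, the same setting can also be changed outside the Manager with a global_prefs_override.xml in the BOINC data directory. This is only a minimal sketch of that approach; the tag shown is the client's standard preference name as far as I know, so treat it as an assumption and check it against your own prefs file:

<global_preferences>
   <!-- 0 = do not keep non-GPU tasks in memory when they are preempted -->
   <leave_apps_in_memory>0</leave_apps_in_memory>
</global_preferences>

Save the file and have the client re-read it (Options > Read local prefs file in the Manager, if memory serves, or simply restart the client); ticking or un-ticking the checkbox in Computing preferences amounts to the same thing.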
Crystal Pellet
Veteran Cruncher | Joined: May 21, 2008 | Post Count: 1406 | Status: Offline
Quote (Jim1348):
"You can eliminate the extra memory allotted for those "waiting to run" by de-selecting "Leave non-GPU tasks in memory". It is no big deal for performance; you just have to load an application from the disk again when it is needed."

. . . and for the ARP1s you will lose hours of crunching time: an unloaded task resumes from its last checkpoint, and ARP1 checkpoints only infrequently.
Mike.Gibson
Ace Cruncher | England | Joined: Aug 23, 2007 | Post Count: 12594 | Status: Offline
Apis

I suspect from what you initially said that you have 8 threads available on your machine. That is how many I have, and I find that with 6 running arp1 it runs slower. I would suggest that you reduce the number of threads allotted to arp1 to 4; the other projects can share the other 4, and you will boost total output.

As for re-sends, if they go out soon after the initial send (say up to about 24 hours) they come with the standard 7-day return deadline, but if they go out later, they come with 3.5 days to finish. Even with the shorter deadline they don't necessarily jump the queue immediately, but they might do so later.

Mike
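A concrete way to cap arp1 at 4 is an app_config.xml in the World Community Grid project folder under the BOINC data directory. A minimal sketch, assuming arp1 is the short application name your client reports for the Africa Rainfall Project (check the task names in the Manager before using it):

<app_config>
   <app>
      <!-- run at most 4 Africa Rainfall Project tasks at a time -->
      <name>arp1</name>
      <max_concurrent>4</max_concurrent>
   </app>
</app_config>

The client picks the file up after Options > Read config files in the Manager (or a client restart), and the remaining threads are then free for the other sub-projects.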
Former Member
Cruncher | Joined: May 22, 2018 | Post Count: 0 | Status: Offline
Hi Mike,

Thanks for your feedback and comments.

I had been running MIP1 and found that I couldn't run too many (more than 4, IIRC) without impacting performance, presumably due to the well-known L3 cache 'problem'. With ARP1 running as well as MIP1, I found I needed to run even fewer MIP1 or it impacted ARP1. But now that I run (usually) 6 ARP1 and 2 MCM1, I don't think ARP1 is falling over itself. However, I tend to look at points/hr and not elapsed time, so the situation may not be straightforward.

As to the scheduling issue, I did subsequently see another situation where a seventh ARP1 WU was shown in status 'Waiting to run', but it had an elapsed time of 0 and wasn't using much memory. When one of the executing ARP1 WUs finished, I was surprised to find that it hadn't started, but another one had. Nothing I tried would get that one to start properly, and in the end I killed it. I put it down to 'Just one of those (annoying) things'.
Mike.Gibson
Ace Cruncher | England | Joined: Aug 23, 2007 | Post Count: 12594 | Status: Offline
apis

My primary purpose is to help the projects, with badge-hunting secondary, so I have been juggling combinations in Device Profiles and app_config to try to achieve the best result.

My current app_config settings when I have arp1 units are 3 arp1, 4 mcm1 & 3 mip1. That tends to allow me to run 3 arp1, 3 mcm1 & 2 mip1. If I have no arp1 units, like now, I put mcm1 up to 6 and mip1 up to 4. That gives me 5 mcm1 and 3 mip1.

My Device Profiles are set to 6 arp1 (3*2 to keep the machine 'reliable'), 6 mcm1 and 4 mip1 at all times, so that updates occur regularly to try to get more arp1 whenever possible. As the capacity problems with arp1 and mip1 are different, this maximises the throughput.

I also temporarily pause any units which get too close to the one in front, to spread the peaks. In other words, I prevent 'tailgating'.

My current problem is coming next on the other thread.

Mike
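For anyone who wants to copy this set-up, here is a minimal app_config.xml sketch of the 'arp1 units available' case described above. The short names arp1, mcm1 and mip1 are assumed to match what your client reports for the Africa Rainfall Project, Mapping Cancer Markers and the Microbiome Immunity Project, so verify them before use:

<app_config>
   <app>
      <!-- Africa Rainfall Project -->
      <name>arp1</name>
      <max_concurrent>3</max_concurrent>
   </app>
   <app>
      <!-- Mapping Cancer Markers -->
      <name>mcm1</name>
      <max_concurrent>4</max_concurrent>
   </app>
   <app>
      <!-- Microbiome Immunity Project -->
      <name>mip1</name>
      <max_concurrent>3</max_concurrent>
   </app>
</app_config>

The file lives in the World Community Grid project folder under the BOINC data directory and takes effect after Options > Read config files or a client restart; for the 'no arp1' case you would simply raise mcm1 to 6 and mip1 to 4 as described above.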