World Community Grid Forums
Thread Status: Active. Total posts in this thread: 83
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
With reference to the issues highlighted in the thread '[RENAMED] Some concerns regarding the HC... hosts; #cores>2)', this thread has been started so that the Techs can post updates on their investigation of the issues, and so that members can post their own observations, which may assist in finding the cause and a remedy.
Posted by knreed:

Sorry for the silence. We are investigating this problem. We are currently running some pair-wise tests to determine what is causing the slowdown. What I mean by pair-wise is that we manually send a workunit for FightAIDS@Home and a workunit for Help Conquer Cancer to the same two computers that have the characteristics we want to check. For example, here is the outcome of one of these tests:

Computer #1 is my laptop: an Intel Pentium M running at 2.0GHz with 2GB of DDR RAM.
Computer #2 is my home desktop: an AMD Athlon 64 X2 5200+ running at 2.7GHz with 2GB of DDR2 RAM.

FightAIDS@Home workunit 'faah2961_ZINC00000480_xMut_md02740_00' had the following results:
Computer #1 ran the workunit in 7.6 hours and claimed 83.238 credits.
Computer #2 ran the workunit in 6.1 hours and claimed 94.623 credits (20% faster than the Pentium M).

Help Conquer Cancer workunit 'X0000046720001200502241630' had the following results:
Computer #1 ran the workunit in 7.8 hours and claimed 85.014 credits.
Computer #2 ran the workunit in 8.2 hours and claimed 128.109 credits (5% slower than the Pentium M).

The next step is to repeat this test but force the AMD dual core to run only one workunit at a time, to see if that eliminates the drop in performance for the dual-core machine.
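For anyone checking the arithmetic, here is a minimal C sketch that reproduces those two percentages from the run times quoted above, on the assumption that they are measured against the Pentium M's time (the helper function is invented for the illustration; it is not WCG code):

```c
#include <stdio.h>

/* Percentage difference of a second machine relative to a baseline,
 * computed from wall-clock run times for the same workunit.
 * Positive means the second machine finished faster than the baseline. */
static double pct_vs_baseline(double t_baseline, double t_other)
{
    return 100.0 * (t_baseline - t_other) / t_baseline;
}

int main(void)
{
    /* Figures from knreed's post; Computer #1 (the Pentium M) is the baseline. */
    printf("FAAH: Athlon X2 vs Pentium M: %+.1f%%\n", pct_vs_baseline(7.6, 6.1));
    printf("HCC : Athlon X2 vs Pentium M: %+.1f%%\n", pct_vs_baseline(7.8, 8.2));
    return 0;
}
```

This prints roughly +19.7% and -5.1%, which round to the "20% faster" and "5% slower" figures in the post.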
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Good to see some testing is taking place, Kevin - it does seem to bear out what others are seeing, with the units taking longer to crunch on multi-core machines, and I believe the effect is exaggerated the more cores there are, so some comparisons with 4/8-core machines may be worth looking into.
Comparisons have also been made over at XtremeSystems whilst this has been going on, so I'll throw this into the mix for you to consider whilst investigating:

I've said this in a bunch of other places. Running HCC on 64-bit Vista with 64-bit BOINC version 5.10.28 has provided excellent results. Run times fall very close to an average of 2.7 hours apiece, with page faults in the thousands. This page fault count is lower than I see on DDT or FAAH. I'm running it on ten quads...
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Hey Adywebb
Keep up the work. This is really all people want to see!
courine
Master Cruncher Capt., Team In2My.Net Cmd. HQ: San Francisco Joined: Apr 26, 2007 Post Count: 1794 Status: Offline
BTW, two excellent quotes from the earlier thread:
----------------------------------------
knreed is looking into this. For the moment the best workaround is to put badly affected systems onto other projects. Lawrence

And for some background:

I suppose I'll dip my toe into the water again. Brrr... it's chilly!

A page fault occurs when the core wants to access memory that is not loaded into cache. This will slow things down because the kernel will have to load a new page from memory into the cache while the core waits. Any application with a lot of page faults will run more slowly than one with only a few.

But there is a second possible problem with performance. Multiple cores can 'queue up' a series of page faults, so that each core has to wait until its own page fault gets serviced. This is called memory contention. If a number of cores are running applications with a high number of page faults, then performance will drop even more because of this memory contention.

How can this performance inefficiency be cured? The normal way is to run a preprocessing step over the data arrays and produce a new array that clusters data together the way that the program will access it. Sometimes this is possible; unfortunately, sometimes it is not. It all depends on the algorithm. Even when it is possible, it produces unreadable data structures. This need not be a problem, but when developing a new program that has to be rapidly changed to match research needs, it is almost always a problem.

[A personal reminiscence. A generation ago I spotted a neat 15-25 line section in an image processing assembler routine that I could optimize to speed up the program by 10%-15%. Even with paperwork, this change only cost me 2 or 3 days, and we were running the program constantly on a number of computers, so I considered it time well spent. I actually congratulated myself about this. (sob..) A little more than 2 years later the new computers changed the cache organization, and I suddenly realized that my change was bound to cause problems down the road if the cache changed even more. After thinking it over for an hour, I eliminated the change. Programming to meet specific cache designs is a very dangerous practice that has to be considered very suspiciously.]

So what is my estimate of the situation? I don't think that it makes sense to reprogram the application for this. The project scientists should be concentrating on the results and overworking the programmers to change the application to produce better results. Faster should be ignored at this stage.

But how should individual members of the World Community Grid feel about this? The high page-fault count is simply an artifact of the algorithm. It will slow down the flops per second, but that will not matter as such. The CPU time spent running the kernel to load in new pages will show up as reduced credit, but for a single core the points impact should not be substantial. Memory contention will be much more substantial, so 4- and 8-core machines would show a much greater drop in points if running more than one HCC work unit. The WCG scheduler is sending out the HCC work units, so a member can eliminate HCC from these multi-core machines without slowing down progress on HCC, and could then run other projects such as FAAH and DDT that would otherwise have to run on the single-core computers that can handle HCC with the greatest efficiency.

An unrelated note: some days ago someone posted in this thread a work unit that was awarded 8.3 points. This was immediately reported to the WCG staff. I don't know what went wrong, and we have a number of more urgent issues, but it is an error unrelated to the main problem being addressed in this thread. Lawrence

Not bad, I like the cut of your jib.
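Lawrence's "preprocessing step over the data arrays" is easiest to see in code. The sketch below is not HCC code, and the array size and visiting order are invented for the illustration; it only shows the shape of the fix: a hot loop first reads a large array in a scattered order, then the same work is done after a one-off pass that packs the data in the exact order it will be read, so the loop walks memory sequentially instead of faulting all over the heap.

```c
#include <stdio.h>
#include <stdlib.h>

#define N (1 << 22)   /* 4M doubles (~32 MB), far larger than any CPU cache */

/* The hot loop as the algorithm naturally expresses it: records are visited
 * in the order given by 'order[]', so consecutive iterations touch memory
 * scattered all over the array (poor cache and TLB locality).             */
static double sum_scattered(const double *data, const int *order, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += data[order[i]];
    return s;
}

/* One-off preprocessing pass: copy the records into a new array laid out in
 * the exact order the hot loop will read them.                             */
static double *cluster_by_access_order(const double *data, const int *order, int n)
{
    double *packed = malloc((size_t)n * sizeof *packed);
    if (packed)
        for (int i = 0; i < n; i++)
            packed[i] = data[order[i]];
    return packed;
}

/* The same hot loop over the packed copy now walks memory sequentially. */
static double sum_sequential(const double *packed, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += packed[i];
    return s;
}

int main(void)
{
    double *data  = malloc((size_t)N * sizeof *data);
    int    *order = malloc((size_t)N * sizeof *order);
    if (!data || !order) return 1;

    for (int i = 0; i < N; i++) {
        data[i]  = (double)i;
        order[i] = (int)((i * 2654435761u) % N);  /* a pseudo-random permutation */
    }

    double slow = sum_scattered(data, order, N);
    double *packed = cluster_by_access_order(data, order, N);
    double fast = packed ? sum_sequential(packed, N) : 0.0;

    /* Both sums are identical; only the memory access pattern differs. */
    printf("scattered: %.0f  clustered: %.0f\n", slow, fast);

    free(packed);
    free(order);
    free(data);
    return 0;
}
```

The packed copy is also exactly the "unreadable data structure" he warns about: the layout no longer matches the natural structure of the data, which is why the cure is not always worth it for a program that changes as quickly as research code does.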
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
That description of a page fault is completely wrong.

Page faults have nothing to do with the CPU cache, nor does an OS kernel load the CPU cache. A page fault occurs when a requested piece of code or data is not in physical memory.
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Questar, thank you for your input.
Now, please go and read about the difference between hard and soft page faults. You will find the subject is less simple than you thought. |
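For anyone who wants to see that distinction on their own machine, here is a minimal POSIX sketch (Linux/Unix, not WCG code) that reads the two counters the kernel keeps per process: ru_minflt for soft (minor) faults, which are resolved without touching the disk, and ru_majflt for hard (major) faults, which require a read from disk or swap.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

int main(void)
{
    struct rusage before, after;
    getrusage(RUSAGE_SELF, &before);

    /* Touch 256 MB of freshly allocated memory. Every new page the kernel
     * wires up for us is a soft (minor) fault: no disk I/O is involved.
     * Hard (major) faults would only appear under memory pressure, when
     * pages have to be read back from swap or from a mapped file.        */
    size_t bytes = 256u * 1024 * 1024;
    char *buf = malloc(bytes);
    if (!buf) return 1;
    memset(buf, 1, bytes);

    getrusage(RUSAGE_SELF, &after);
    printf("soft (minor) page faults: %ld\n", after.ru_minflt - before.ru_minflt);
    printf("hard (major) page faults: %ld\n", after.ru_majflt - before.ru_majflt);

    free(buf);
    return 0;
}
```

On a machine with plenty of free RAM this reports thousands of soft faults and no hard ones, which is the case the earlier posts describe for HCC when a single work unit is running.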
Former Member
Cruncher Joined: May 22, 2018 Post Count: 0 Status: Offline
Bravo!
courine
Master Cruncher Capt., Team In2My.Net Cmd. HQ: San Francisco Joined: Apr 26, 2007 Post Count: 1794 Status: Offline
I am running an exchange program in the suggestion box. If you have a single core machine and are not currently running HCC, I will exchange for some nice dual core time for your project.
----------------------------------------
The Exchange
Sekerob
Ace Cruncher Joined: Jul 24, 2005 Post Count: 20043 Status: Offline
Here's an interesting stat: there were 829 W98 'All time' registrations and they average 5.64 RAC... that's 39.48 WCG points per day. Seeing that some here are still actively running this OS, the word should be spread: switch it off. The electricity consumption is not helping any, and Berkeley is *NOT* testing for backward compatibility, so upgrading BOINC is not advisable.
----------------------------------------
I propose that WCG consider removing W98 and ME from the System Requirements list, at least for HCC & AC@H. http://boincstats.com/stats/host_os_stats.php?pr=wcg&st=0
WCG
Please help to make the Forums an enjoyable experience for All!
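The arithmetic behind those figures, as a small sketch. It assumes the usual 7:1 ratio between WCG points and BOINC credits, which is consistent with the numbers quoted (5.64 x 7 = 39.48), and it reads the 5.64 as an average RAC per registered W98 host:

```c
#include <stdio.h>

int main(void)
{
    const double hosts         = 829;   /* 'All time' W98 registrations        */
    const double rac_per_host  = 5.64;  /* average BOINC credits/day per host  */
    const double wcg_per_boinc = 7.0;   /* assumed WCG-points-per-credit ratio */

    printf("WCG points per day, per host : %.2f\n", rac_per_host * wcg_per_boinc);
    printf("BOINC credits per day, fleet : %.2f\n", rac_per_host * hosts);
    printf("WCG points per day, fleet    : %.2f\n", rac_per_host * hosts * wcg_per_boinc);
    return 0;
}
```

On that reading, the 39.48 is per host per day, and the whole W98 fleet would amount to roughly 4,676 BOINC credits a day, however many of the 829 registrations are actually still active.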
jal2
Senior Cruncher USA Joined: Apr 28, 2007 Post Count: 422 Status: Offline
I'm not sure I understand this statistic, Sek. Is this 829 active W98 machines, or one active W98 machine and 828 inactive machines, producing a total of 4,673 RAC? I suspect the number is somewhere in between.
----------------------------------------
As for W98, it's still a good gaming platform, and it would run on my AMD64 3000+ if I wanted it to. I think the focus should be on the CPU, not the OS.

Average credit per CPU:
2,301.88 Dual-Core AMD Opteron(tm) Processor 8216 HE
227.88 AMD Athlon(tm) 64 FX-74 Processor

Looks like I need to upgrade.
----------------------------------------
[Edit 1 times, last edit by jal2 at Feb 1, 2008 12:19:12 PM]