Total posts in this thread: 19
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Problem on Multi-CPU Windows 2003 Server

I had problems running the WCG Agent on a Windows 2003 Server with 2 CPUs with hyperthreading (4 virtual CPUs). After the WCG Agent had run for some time, the machine would become very slow, even though Task Manager showed only 25% of the CPU in use (meaning the WCG Agent was using the equivalent of one CPU at 100%, spread across the 4 CPUs).

I think the problem arises from the huge number of context switches that the WCG Agent causes across the CPUs, which the Windows OS does not seem able to handle.

I tried running a script that pins the WCG Agent to one CPU by setting its affinity (see below), but every time the WCG Agent started a new task it would restart UD.exe, which then ran across all CPUs again, and the problem came back. Even running the script every 15 minutes only delayed the problem.

I eventually had to uninstall the Agent on that machine, which is a shame since it is idle 99.99% of the time. The Agent on Windows needs an affinity setting so it can safely run on Windows machines with more than one processor.

If anyone has any other suggestions to solve this problem, please help out.

************ DOS Script ****************

REM Lower the priority of each agent process; log both stdout and stderr.
ProcessMgr.exe -p WCGrid_Rosetta.exe Low >> %LOG_FILE% 2>&1
ProcessMgr.exe -p UD.exe Low >> %LOG_FILE% 2>&1
ProcessMgr.exe -p ud_1582756.exe Low >> %LOG_FILE% 2>&1

REM Set the affinity of each agent process (value 1000); log both stdout and stderr.
ProcessMgr.exe -a WCGrid_Rosetta.exe 1000 >> %LOG_FILE% 2>&1
ProcessMgr.exe -a UD.exe 1000 >> %LOG_FILE% 2>&1
ProcessMgr.exe -a ud_1582756.exe 1000 >> %LOG_FILE% 2>&1
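
For readers who want to see the affinity idea spelled out, here is a minimal Python sketch. It is not how the WCG Agent or ProcessMgr.exe works internally: the helper names `affinity_mask` and `pids_to_pin` are hypothetical, the process table is passed in as plain data so nothing touches a live system, and reading the `1000` argument above as a binary mask for the fourth logical CPU is only an assumption. Because the restarted UD.exe ran across all CPUs again, any real tool built on this would have to re-apply the mask after every restart.

```python
from typing import Dict, Iterable, List

# Process names taken from the DOS script in this post.
AGENT_NAMES = {"wcgrid_rosetta.exe", "ud.exe", "ud_1582756.exe"}


def affinity_mask(cpus: Iterable[int]) -> int:
    """Return a bitmask with bit n set for every logical CPU index n."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative")
        mask |= 1 << cpu
    return mask


def pids_to_pin(processes: Dict[int, str]) -> List[int]:
    """Return the sorted PIDs whose executable name is an agent process.

    The comparison is case-insensitive, as Windows file names are.
    """
    return sorted(
        pid for pid, name in processes.items() if name.lower() in AGENT_NAMES
    )


if __name__ == "__main__":
    # Hypothetical snapshot of a process table (PID -> executable name).
    snapshot = {1200: "UD.exe", 1340: "sqlservr.exe", 1404: "WCGrid_Rosetta.exe"}
    mask = affinity_mask([3])  # fourth logical CPU only (assumed meaning of 1000)
    for pid in pids_to_pin(snapshot):
        print(f"would set affinity of PID {pid} to mask {mask:04b}")
```

Keeping the mask calculation separate from the process lookup means the same two helpers could be called from a scheduled job that runs whenever the agent starts a new task.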
[Nov 3, 2005 3:58:40 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Problem on Multi-CPU Windows 2003 Server

Hello dkrawczynski,

Our Windows Agent is supposed to run only one instance. If a second instance starts running, then UD.exe shuts down one process. How are you monitoring your system? Task Manager should show one thread running (of a possible 4, hence the 25% utilization reported).

When you say things start slowing down, what do you mean? Have you tried running Throttle Watch (listed in the Useful Utilities thread at http://www.worldcommunitygrid.org/forums/wcg/viewthread?thread=2490 )? Perhaps your server is dust-filled and starts to overheat when running intensively?

mycrofth
[Nov 3, 2005 5:11:25 PM]
xterminatordedust
Cruncher
Joined: Nov 17, 2004
Post Count: 12
Status: Offline
Re: Problem on Multi-CPU Windows 2003 Server

BOINC supports virtual computers.
If you can, try World Community Grid on Linux.
Otherwise run only one Windows agent.

You can use BOINC too, hoping that WCG uses BOINC in the future.
[Nov 3, 2005 5:59:10 PM]
RT
Master Cruncher
USA - Texas - DFW
Joined: Dec 22, 2004
Post Count: 2636
Status: Offline
Re: Problem on Multi-CPU Windows 2003 Server

I cannot imagine what is causing your system to slow down. The system you are running is not that uncommon, so I suspect we have several, if not many, like it working. I wonder if heat is an issue (where the P4's self-protection mechanism kicks in and slows down the processor to keep the heat in check). I would suggest that you take that machine apart, blow the dust out, and see if that makes any difference.
----------------------------------------
One of your friends in Texas
RT Website Hosting

[Nov 5, 2005 5:44:47 AM]
Viktors
Former World Community Grid Tech
Joined: Sep 20, 2004
Post Count: 653
Status: Offline
Re: Problem on Multi-CPU Windows 2003 Server

"... hoping that WCG uses BOINC in the future."

We do support BOINC on Linux now. See: http://www.worldcommunitygrid.org/viewJoinNow.do
We don't support it on Windows at this time.

As for slowing down, do you have other applications running at the same time that are using quite a bit of virtual memory? If the working set sizes of those other applications add up to nearly your real memory size, then the additional 25 MB working set needed by the Rosetta program might cause extra paging, which would slow things down somewhat.
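
The working-set arithmetic can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: the function name `headroom_mb` and every number in the example are hypothetical, and real paging behaviour depends on much more than a simple sum (the OS's own memory, file cache, and so on). Only the 25 MB figure for Rosetta comes from this post.

```python
from typing import Iterable


def headroom_mb(physical_mb: float,
                working_sets_mb: Iterable[float],
                extra_mb: float = 25.0) -> float:
    """Return the physical memory (in MB) left after adding one more working set.

    A negative result means the combined working sets no longer fit in real
    memory, so extra paging becomes likely.
    """
    return physical_mb - (sum(working_sets_mb) + extra_mb)


if __name__ == "__main__":
    # Hypothetical server: 1024 MB of RAM and three busy applications.
    other_apps = [600, 250, 160]
    left = headroom_mb(1024, other_apps)
    if left < 0:
        print(f"Over by {-left:.0f} MB: expect extra paging")
    else:
        print(f"{left:.0f} MB of headroom remains")
```

Comparing the result with and without `extra_mb` shows whether the Rosetta working set is what tips the machine over the edge.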

Another source of mysterious slow-downs is bad sectors on a hard drive. If the software runs into these, the system spends a lot of time retrying the read or write operation, which can seem like long pauses in response time.

What is the exact symptom of your slow down? Are other applications not running? Is the start menu very slow?
[Nov 5, 2005 3:55:30 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Problem on Multi-CPU Windows 2003 Server

I have almost exactly the same issue here.

Dual 2.8GHz Xeon system with HyperThreading enabled (I know it slows things down overall, I just haven't been bothered turning it off yet), running Windows XP Professional.

Utilisation sits on 25% permanently even though it should be able to use 50% without even context switching.

No actual system slowdown here, though. Just inexplicably low CPU utilisation.
----------------------------------------
[Edit 1 times, last edit by Former Member at Nov 8, 2005 5:17:51 AM]
[Nov 8, 2005 5:16:05 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Problem on Multi-CPU Windows 2003 Server

I had WCG running on Windows Server 2003 for a while. I never experienced any slowdowns, though... Dual 3GHz w/ HT, so 4 procs in Taskmgr. I never ended up setting the affinity, for the same reason as you: it will not stay there. 25% CPU utilization sounds right, though; the same thing was going on with my server. I'm unsure what problems you are running into. The only problem I ran into was DEP... DEP stopped my server from processing the WUs, although it would still download and upload them. Try setting DEP to only monitor essential Windows programs and see if that does anything.

Apart from that, I'm not too sure... The servers really kick butt though on crunching, tons of WUs get done in little time.
[Nov 9, 2005 1:42:27 AM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Problem on Multi-CPU Windows 2003 Server

Thanks for the many suggestions. I am actually running Small Business Server 2003, and I noticed it doesn't have Server 2003 SP1, which means it doesn't have DEP either. I'll download and install SP1, make sure the BIOS is up to date for the machine (DELL PowerEdge 1800), and give the Agent another chance. I'll post a new message to let everyone know the results.

BTW. The machine is new and ran cool with the agent running (I used the Dell server monitor tool to check the component temps).

By running slow, I mean the SQL Server database basically stopped responding to clients, and even working directly on the console was slow. As soon as I stopped the agent, the problem went away immediately.
[Nov 10, 2005 5:55:27 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Problem on Multi-CPU Windows 2003 Server

Hello dkrawczynski,

All the context switches? Are you sure that you do not have a memory leak? Are there any resources that are increasingly utilized after you run WCG for a while?

mycrofth
[Nov 10, 2005 6:03:42 PM]
Former Member
Cruncher
Joined: May 22, 2018
Post Count: 0
Status: Offline
Re: Problem on Multi-CPU Windows 2003 Server

Hello,
When Windows Task Manager displays a dual-CPU system with Hyper-Threading, each thread that is being used 100% shows up as 25%. So if you see 25%, you have one CPU fully utilized. If you had two CPUs maxed out, Task Manager would show 50%. Over 50% would mean that you are actually utilizing Hyper-Threading.
From the standpoint of the Grid, on a per-process basis you will only ever see WCGrid use 25%. When I get my machine really busy with other processes, the full machine will get to the 50% mark. When running something that can actually use Hyper-Threading, you will see it go over the 50% mark, which is roughly equal to 2 CPUs maxed out with Hyper-Threading.
This does not mean that Hyper-Threading is not used at the lower percentages; it is just that you can only tell for sure when you go over the 50% mark. I am not sure that I really trust Windows Task Manager to truly show the thread usage in each window.
Also, a lot of programs just don't have threading built into them. Unless you are running something that can use more than one thread at the same time, Hyper-Threading can't be utilized.
I saw a comment here about Hyper-Threading slowing things down. You really can't say that across the board. If a program running multiple threads concentrates on one data set, it certainly does go faster. You only get into trouble with Hyper-Threading when the program is doing many different things and effectively splits the cache between threads, so that each thread has what amounts to half the cache size. That is on a per-CPU basis. If the program starts other threads to get completely different tasks done, the chances of cache thrashing go up.
Wayne
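
The percentages above follow from simple division, sketched here in Python. This is a simplified model that assumes Task Manager's overall figure is just the share of fully busy logical CPUs; the function name `overall_cpu_percent` is hypothetical.

```python
def overall_cpu_percent(busy_logical_cpus: int, total_logical_cpus: int) -> float:
    """Return overall CPU usage (%) when some logical CPUs are 100% busy."""
    if total_logical_cpus <= 0:
        raise ValueError("total_logical_cpus must be positive")
    if not 0 <= busy_logical_cpus <= total_logical_cpus:
        raise ValueError("busy_logical_cpus must be between 0 and the total")
    return 100.0 * busy_logical_cpus / total_logical_cpus


if __name__ == "__main__":
    # Two physical CPUs with Hyper-Threading appear as four logical CPUs.
    logical = 2 * 2
    for busy in range(logical + 1):
        print(f"{busy} busy thread(s): {overall_cpu_percent(busy, logical):.0f}%")
```

Under this model a single-threaded process such as the agent can never push the overall figure past 25% on a four-logical-CPU machine.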
[Nov 12, 2005 1:57:10 PM]