CPU utilisation - user procs disappeared?
Guys
The graph below is from an Oracle database on a PE 6850 that serves as a backend for Siebel.
Look at the area between 8am and 10am. We had a major problem with some queries that were causing the clients to hang.
The problem is this: usually you get a high count of user procs since all Oracle procs run under the user 'oracle'.
Assuming that "system procs" are the ones running under root, I don't understand the graph.
When I logged on at around 8.30 and ran top I saw tons of Oracle procs using up most of the CPU.
Q1 - Why do I see an increase in system procs instead of user procs?
Q2 - Why are all the user procs completely gone in that time period (no blue graph at all)?
How can there be no user procs at all within this time period?
I have seen this only once before, 2 years ago, on 2 nodes of an Oracle RAC system. Exact same thing - after a reboot the blue came back.
I'm trying to interpret this for the DBAs but I'm not sure I can.
Can anyone help me?
The DB runs Oracle 10g on RHEL4-U5 and is polled via net-snmp. The PE 6850 has 4 sockets, each dual-core. HT per core is turned off so Linux sees 8 procs.
Thanks
--stucky
Attachments:
- cpustats.JPG (35.37 KiB), CPU util
Gandalf
D'oh - I totally forgot about Cacti's zooming capabilities - thx for reminding me.
Attached are 2 zooms:
1. The weird one.
2. A normal one from a day later.
I'm not sure it reveals anything more, other than that something was very different on day one.
Could it be that if system procs were higher than user procs the red would overshadow the blue?
Not that this could ever happen on a busy database with 300-plus Oracle procs running, but if it did I wonder how Cacti would graph it.
I've been poking through other graphs and I can't find any that don't show at least a little blue, except the one we're talking about.
Attachments:
- cacti_sbl.JPG (79.97 KiB)
Ok, I looked at the values and they don't make sense to me.
According to my last post the system procs were in fact higher than the user procs.
1. I don't see how that could be when top showed Oracle totally hogging the CPU.
2. I attached another graph from when we had bad queries before, and it looks exactly like I'd expect it to look.
What I don't understand here though is that all 3 values for system and user procs are nearly identical - yet the red graph sits just under 40% and the blue graph is in the 80% range.
I'm probably not reading this right. Hope this info helps.
Attachments:
- cacti_sbl_high_normal.JPG (45.97 KiB)
RRDTool Command:
/usr/bin/rrdtool graph - \
--imgformat=PNG \
--start=-86400 \
--end=-300 \
--title="SBLPRD 2 - CPU Usage" \
--rigid \
--base=1000 \
--height=120 \
--width=500 \
--alt-autoscale-max \
--lower-limit=0 \
--vertical-label="percent" \
--slope-mode \
DEF:a="/var/www/html/cacti-0.8.6j/rra/sblprd_2_cpu_system_875.rrd":cpu_system:AVERAGE \
DEF:b="/var/www/html/cacti-0.8.6j/rra/sblprd_2_cpu_user_876.rrd":cpu_user:AVERAGE \
DEF:c="/var/www/html/cacti-0.8.6j/rra/sblprd_2_cpu_nice_874.rrd":cpu_nice:AVERAGE \
CDEF:cdefbc=TIME,1196707863,GT,a,a,UN,0,a,IF,IF,TIME,1196707863,GT,b,b,UN,0,b,IF,IF,TIME,1196707863,GT,c,c,UN,0,c,IF,IF,+,+ \
AREA:a#FF0000:"System" \
GPRINT:a:LAST:"Current\:%8.2lf %s" \
GPRINT:a:AVERAGE:"Average\:%8.2lf %s" \
GPRINT:a:MAX:"Maximum\:%8.2lf %s\n" \
AREA:b#0000FF:"User":STACK \
GPRINT:b:LAST:" Current\:%8.2lf %s" \
GPRINT:b:AVERAGE:"Average\:%8.2lf %s" \
GPRINT:b:MAX:"Maximum\:%8.2lf %s\n" \
AREA:c#00FF00:"Nice":STACK \
GPRINT:c:LAST:" Current\:%8.2lf %s" \
GPRINT:c:AVERAGE:"Average\:%8.2lf %s" \
GPRINT:c:MAX:"Maximum\:%8.2lf %s\n" \
LINE1:cdefbc#000000:"Total" \
GPRINT:cdefbc:LAST:" Current\:%8.2lf %s" \
GPRINT:cdefbc:AVERAGE:"Average\:%8.2lf %s" \
GPRINT:cdefbc:MAX:"Maximum\:%8.2lf %s"
RRDTool Says:
OK
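A note on reading the command above: `AREA:a` is drawn from zero and each `STACK` area is drawn on top of the previous one, so the upper edge of the blue band is system + user, not user alone. A minimal sketch of that arithmetic follows; the helper name `stack_tops` is hypothetical (not part of rrdtool) and the sample numbers are made up for illustration.

```shell
# Print the upper edge of each band for AREA:a, AREA:b:STACK, AREA:c:STACK.
# Arguments: system, user, nice (percent). Sample numbers are illustrative only.
stack_tops() {
    awk -v s="$1" -v u="$2" -v n="$3" 'BEGIN {
        print s + 0, s + u, s + u + n   # red top, blue top, green top
    }'
}

# Example: stack_tops 38 40 0
# The red band ends at 38 and the blue band ends at 78,
# even though "user" itself is only 40.
```

This would explain near-identical printed values for system and user alongside a red band just under 40% and a blue edge around 80%.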
- gandalf
- Developer
- Posts: 22383
- Joined: Thu Dec 02, 2004 2:46 am
- Location: Muenster, Germany
Ok, those are AREA/STACKs, as they should be. No error here. You can of course dump the rrd file's contents (e.g. using dataquery, or else rrdtool fetch) to make sure the rrd file does not contain other numbers.
Aah, stop, wait, ...
Is this a multi-core CPU? In that case, the "user" proc value may have exceeded 100, which is the (wrong) default MAXIMUM of the proc data source. Raise the MAXIMUM to "number of cores * 100" and apply the same change to all existing rrd files of this type using "rrdtool tune".
Reinhard
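A hedged sketch of the suggestion above. RRDtool stores any sample outside a data source's MIN/MAX range as unknown, which leaves a gap in the graph, and `rrdtool tune --maximum` raises the limit on an existing file. The helper names, and the choice to print the command rather than run it, are assumptions made for illustration. The data source name `cpu_user` and the file path come from the graph command earlier in the thread.

```shell
# What RRDtool keeps for a sample given the data source MAXIMUM:
# values above the maximum are stored as unknown ("U").
stored_value() {   # usage: stored_value <sample> <maximum>
    awk -v v="$1" -v max="$2" 'BEGIN { if (v > max) print "U"; else print v }'
}

# Print (not run) the tune command that raises MAXIMUM to ncpu * 100.
tune_cmd() {       # usage: tune_cmd <rrd file> <ds name> <ncpu>
    echo "rrdtool tune $1 --maximum $2:$(( $3 * 100 ))"
}

# Example for an 8-CPU box:
#   stored_value 350 100   -> U    (sample lost with the default maximum)
#   stored_value 350 800   -> 350
#   tune_cmd /var/www/html/cacti-0.8.6j/rra/sblprd_2_cpu_user_876.rrd cpu_user 8
```

Review the printed command before running it, and repeat it for each rrd file of this type.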
- fmangeant
- Cacti Guru User
- Posts: 2345
- Joined: Fri Sep 19, 2003 8:36 am
- Location: Sophia-Antipolis, France
Hi
gandalf wrote:
Aah, stop, wait, ...
Is this a multi-core CPU? In that case, the "user" proc value may have exceeded 100, which is the (wrong) default MAXIMUM of the proc data source. Raise the MAXIMUM to "number of cores * 100" and apply the same change to all existing rrd files of this type using "rrdtool tune".

Reinhard is right: with Net-SNMP < 5.4 on Linux boxes, CPU usage goes from 0 to 100 x number of procs.
You can use the template in my signature for 2, 4 and 8 CPU systems.
[size=84]
[color=green]HOWTOs[/color] :
[list][*][url=http://forums.cacti.net/viewtopic.php?t=15353]Install and configure the Net-SNMP agent for Unix[/url]
[*][url=http://forums.cacti.net/viewtopic.php?t=26151]Install and configure the Net-SNMP agent for Windows[/url]
[*][url=http://forums.cacti.net/viewtopic.php?t=28175]Graph multiple servers using an SNMP proxy[/url][/list]
[color=green]Templates[/color] :
[list][*][url=http://forums.cacti.net/viewtopic.php?t=15412]Multiple CPU usage for Linux[/url]
[*][url=http://forums.cacti.net/viewtopic.php?p=125152]Memory & swap usage for Unix[/url][/list][/size]
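If the raw value really does run from 0 to 100 x the number of CPUs, as described in the post above, dividing by the CPU count gives a percentage of the whole box. A small sketch; `box_percent` is a hypothetical helper and the sample numbers are made up.

```shell
# Convert a raw CPU value (0 .. 100 * ncpu) into percent of the whole machine.
box_percent() {    # usage: box_percent <raw value> <ncpu>
    awk -v raw="$1" -v n="$2" 'BEGIN { printf "%.1f\n", raw / n }'
}

# Example: box_percent 600 8   -> 75.0
```

In an rrdtool graph the same division could be written as a CDEF such as `CDEF:user_pct=b,8,/`, where `b` is the user DEF from the graph command earlier in the thread and `user_pct` is a made-up name.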
Thanks everybody - I'm trying the 2/4/8-way templates right now.
The PE 6850 has 8 cores, but HT can be turned on per core, so Linux sees 16 CPUs. Your templates refer to cores only, right?
I know HT is not the same thing as a core, but then again Linux doesn't make a distinction. It sees it all as a CPU.
Thoughts?
PS: Time to update the default templates for Cacti maybe? I mean, who runs single-core systems these days?
- fmangeant
- Cacti Guru User
To find out how many CPUs your Linux system sees, run this:
Code:
$ grep -c ^processor /proc/cpuinfo
Well, if you really go by what Linux sees as a CPU then it's 16. I already knew that because when you run 'top' and press '1' it shows 16 CPUs (your test confirms it too), but I thought you were referring to cores only, and this box has 8. I always thought a core is real whereas HT is not considered quite the same as a core. Then again Linux can't seem to distinguish.
This means I need another template then right ?
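On the core versus HT question: `grep -c ^processor` counts every logical CPU, hyperthreads included. Where the kernel exposes `physical id` and `core id` fields in /proc/cpuinfo (an assumption; older kernels may not list them), physical cores can be counted separately. A rough sketch with hypothetical function names:

```shell
# Count logical CPUs, i.e. every "processor" entry the kernel lists.
count_logical() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Count distinct (physical id, core id) pairs, i.e. physical cores.
# Prints 0 if the kernel does not expose these fields.
count_cores() {
    awk -F': *' '
        /^physical id/ { pkg = $2 }
        /^core id/     { seen[pkg "/" $2] = 1 }
        END { n = 0; for (k in seen) n++; print n }
    ' "${1:-/proc/cpuinfo}"
}

# Usage: count_logical; count_cores
# Both functions accept an alternative file path as their first argument.
```

For sizing the template, the logical count is the one that matters, since that is the number the thread says the raw CPU value scales with.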
- fmangeant
- Cacti Guru User
stucky101 wrote:
This means I need another template then right ?

Yes
I'll try to post a template tomorrow.
Guys, what do you think about this thread?
http://forums.cacti.net/viewtopic.php?t ... &start=120
Sure tempting to get each cpu graphed separately as well...
Guys
Any news on the 16-way template ?
Also I have found that the new templates don't graph quite the way the old ones did.
The old graph used to adjust the max based on the average of the utilisation.
Since I applied the new graphs it always shows the 100% mark, even if utilisation is very low.
Doesn't make for as good a graph in my opinion. Can I adjust that so it graphs like before, except with the correct number of cores?
Attachments:
- cpu-8way.png (136.22 KiB)
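On the scaling question, one possible cause (an assumption, since the new template's rrdtool command is not shown in the thread) is a fixed `--upper-limit`. In rrdtool graph the y-axis always covers at least lower-limit to upper-limit, while `--alt-autoscale-max` without an upper limit lets the top of the axis follow the data, as in the original command posted earlier. A sketch of the two option sets; `scale_opts` is a hypothetical helper:

```shell
# Print the rrdtool graph scaling options for a given mode.
#   fixed <limit> : axis always reaches <limit>, even when utilisation is low
#   auto          : top of the axis follows the data (options of the old graph)
scale_opts() {
    case "$1" in
        fixed) echo "--rigid --lower-limit=0 --upper-limit=$2" ;;
        auto)  echo "--rigid --alt-autoscale-max --lower-limit=0" ;;
        *)     echo "usage: scale_opts fixed <limit> | auto" >&2; return 1 ;;
    esac
}

# Example: scale_opts auto
#          scale_opts fixed 800
```

In Cacti these options are normally set in the graph template rather than typed by hand, so the equivalent change would be made there.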