RHEL and HPasm problem with load graphs

Gorbachov · Post by **Gorbachov** » Tue Aug 11, 2009 4:15 am

Hello,

I have a Cacti installation that monitors several HP ProLiant servers running Red Hat Enterprise 5.1.

All of them have the same OS and the same HPASM package (the latest) installed.

I am monitoring the CPU load on the servers with the standard Linux template, and the 1, 5 and 30 minute load for every separate CPU core with the help of the HP agents.

Three days ago the graphs for 2 of the servers went to 100% load, which is not true. I tried restarting snmpd and hpasm, but nothing helped. snmpwalk gives me -1 for 1 minute and 100 for 5 and 30 minutes.

Both servers have an uptime of 520 days, which caused a restart of the Uptime graph, as described in another topic in this forum.
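
For context on the Uptime graph restart: SNMP reports uptime as a 32-bit TimeTicks counter in hundredths of a second, so it wraps after roughly 497 days. The sketch below just works through that arithmetic for the 520 days mentioned above. The function names are made up for illustration, and whether the wrap has anything to do with the bogus load values is not established in this thread.

```python
# Arithmetic sketch: a 32-bit counter of 1/100 s ticks wraps after ~497 days.
TICKS_PER_SECOND = 100      # TimeTicks are hundredths of a second
COUNTER_MODULUS = 2 ** 32   # 32-bit unsigned counter
SECONDS_PER_DAY = 86400


def wrap_period_days():
    """Days until a 32-bit TimeTicks counter rolls over to zero."""
    return COUNTER_MODULUS / TICKS_PER_SECOND / SECONDS_PER_DAY


def reported_uptime_days(true_days):
    """Uptime (in days) that the wrapped counter would show."""
    ticks = int(true_days * SECONDS_PER_DAY * TICKS_PER_SECOND)
    return (ticks % COUNTER_MODULUS) / TICKS_PER_SECOND / SECONDS_PER_DAY


print(round(wrap_period_days(), 1))         # wrap period in days
print(round(reported_uptime_days(520), 1))  # what 520 real days would show
```

So a box that has really been up 520 days would show only about 23 days on an uptime graph fed from such a counter.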

Hope someone can help?

Gorbachov · Post by **Gorbachov** » Tue Aug 11, 2009 4:25 am

It appears to be random: sometimes the 1min value is -1, sometimes ~100 or ~95.


08/11/2009 12:05:16 PM - CMDPHP: Poller[0] Host[17] DS[752] SNMP: v2: 192.168.2.102, dsname: cmpq_cpu5min, oid: .1.3.6.1.4.1.232.11.2.3.1.1.4.1, output: 95
08/11/2009 12:05:16 PM - CMDPHP: Poller[0] Host[17] DS[751] SNMP: v2: 192.168.2.102, dsname: cmpq_cpu30min, oid: .1.3.6.1.4.1.232.11.2.3.1.1.3.1, output: 100
08/11/2009 12:05:15 PM - CMDPHP: Poller[0] Host[17] DS[750] SNMP: v2: 192.168.2.102, dsname: cmpq_cpu1min, oid: .1.3.6.1.4.1.232.11.2.3.1.1.2.1, output: 100
08/11/2009 12:05:15 PM - CMDPHP: Poller[0] Host[17] DS[749] SNMP: v2: 192.168.2.102, dsname: cmpq_cpu5min, oid: .1.3.6.1.4.1.232.11.2.3.1.1.4.0, output: 97
08/11/2009 12:05:15 PM - CMDPHP: Poller[0] Host[17] DS[748] SNMP: v2: 192.168.2.102, dsname: cmpq_cpu30min, oid: .1.3.6.1.4.1.232.11.2.3.1.1.3.0, output: 100
08/11/2009 12:05:15 PM - CMDPHP: Poller[0] Host[17] DS[747] SNMP: v2: 192.168.2.102, dsname: cmpq_cpu1min, oid: .1.3.6.1.4.1.232.11.2.3.1.1.2.0, output: 100
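
To check many poller lines like these without reading them by eye, something along these lines could pull out the data source name, the OID and the value, and flag anything that is not a plausible percentage. This is only a sketch: the regular expression is written against the log lines pasted above, the function names are hypothetical, and treating the last OID component as the core index and 0 to 100 as the valid range are assumptions.

```python
import re

# Matches the tail of a Cacti poller log line as pasted in this thread.
LOG_LINE = re.compile(
    r"dsname: (?P<dsname>[^,\s]+), oid: (?P<oid>[.\d]+), output: (?P<value>-?\d+)"
)


def parse_poller_line(line):
    """Return dsname, oid, trailing OID index and value, or None if no match."""
    match = LOG_LINE.search(line)
    if match is None:
        return None
    oid = match.group("oid")
    return {
        "dsname": match.group("dsname"),
        "oid": oid,
        # Assumption: the last OID component identifies the core.
        "index": int(oid.rsplit(".", 1)[1]),
        "value": int(match.group("value")),
    }


def is_out_of_range(value, low=0, high=100):
    """Assumption: a utilisation percentage should lie between 0 and 100."""
    return not (low <= value <= high)


sample = (
    "08/11/2009 12:05:16 PM - CMDPHP: Poller[0] Host[17] DS[752] SNMP: v2: "
    "192.168.2.102, dsname: cmpq_cpu5min, oid: .1.3.6.1.4.1.232.11.2.3.1.1.4.1, "
    "output: 95"
)
record = parse_poller_line(sample)
print(record["dsname"], record["index"], record["value"],
      is_out_of_range(record["value"]))
```

A -1 reading would be flagged by `is_out_of_range`, while the ~95 to 100 readings pass the range check and have to be compared against top by hand.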

cert-eh-fiable · Post by **cert-eh-fiable** » Tue Aug 11, 2009 7:33 am

Is this a single core box? Which core are you monitoring if it is multi-core? If it's multi-core, wouldn't an average of all the cores be better? In a multi-core environment it is possible to have one core with high usage while the box remains stable. So when you say the core isn't really at 100%, how do you know for sure? Have you looked to see if you have a process that's causing this?
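
As a rough illustration of averaging across cores, the sketch below groups per-core readings by data source name and takes the mean of each group. The readings are the ones from the poller log earlier in the thread; the function name and the (dsname, value) input shape are assumptions made for the example.

```python
def mean_per_metric(samples):
    """Average per-core readings, grouped by data source name."""
    grouped = {}
    for dsname, value in samples:
        grouped.setdefault(dsname, []).append(value)
    return {name: sum(values) / len(values) for name, values in grouped.items()}


# (dsname, value) pairs for both cores, taken from the log above.
samples = [
    ("cmpq_cpu1min", 100), ("cmpq_cpu1min", 100),
    ("cmpq_cpu5min", 97), ("cmpq_cpu5min", 95),
    ("cmpq_cpu30min", 100), ("cmpq_cpu30min", 100),
]

averages = mean_per_metric(samples)
for name, value in averages.items():
    print(name, value)
```

With these particular readings the average stays close to 100 on every interval, so averaging smooths out a single busy core but would not hide a case where every core reports the same high value.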

Gorbachov · Post by **Gorbachov** » Sat Aug 22, 2009 2:37 pm

It is a dual-core CPU, and the graphs for both cores show 100% load...

With top/htop I see that the load is 1-2% on both cores.
On another graph that also tracks the CPU load, I see the correct load.

I still have this problem

Post by **gandalf** » Sun Aug 23, 2009 1:54 pm

In general, the 2nd link in my sig would be a good start for debugging. But your second post already shows that the OID (which seemingly represents the CPU) returns values around 100. You find those values in the graphs, so Cacti is graphing what it is fetching.
Your question is really "why does this OID sometimes give values around 100 when the load seen by top is around 2?". To be honest, I can't answer that. If it is a private MIB of the manufacturer, the question should be asked there.
Of course, people in the forums may have seen this. But again, to me it does not seem to be a Cacti problem.
Reinhard

Gorbachov · Post by **Gorbachov** » Mon Aug 24, 2009 3:56 am

Yes, it is not a Cacti issue; I just wanted to see if someone has the same problem.

I will post this topic in the HP forums to see if someone has the same issue...

Enjoy
