Hello,
I have a Cacti installation that monitors several HP ProLiant servers running Red Hat Enterprise Linux 5.1.
All of them run the same OS and the same HPASM package (the latest).
I am monitoring the CPU load on the servers with the standard Linux template, and the 1, 5 and 30 minute load for every separate CPU core with the help of the HP agents.
Three days ago the graphs for 2 of the servers went to 100% load, which is not true. I tried restarting snmpd and hpasm, but nothing helped. snmpwalk gives me -1 for 1 minute and 100 for 5 and 30 minutes.
Both servers have an uptime of 520 days, which caused a reset of the Uptime graph, as described in a topic in this forum.
I hope someone can help.
RHEL and HPasm problem with load graphs
It appears to be random: sometimes the 1 min value is -1, sometimes it is ~100 or ~95.
08/11/2009 12:05:16 PM - CMDPHP: Poller[0] Host[17] DS[752] SNMP: v2: 192.168.2.102, dsname: cmpq_cpu5min, oid: .1.3.6.1.4.1.232.11.2.3.1.1.4.1, output: 95
08/11/2009 12:05:16 PM - CMDPHP: Poller[0] Host[17] DS[751] SNMP: v2: 192.168.2.102, dsname: cmpq_cpu30min, oid: .1.3.6.1.4.1.232.11.2.3.1.1.3.1, output: 100
08/11/2009 12:05:15 PM - CMDPHP: Poller[0] Host[17] DS[750] SNMP: v2: 192.168.2.102, dsname: cmpq_cpu1min, oid: .1.3.6.1.4.1.232.11.2.3.1.1.2.1, output: 100
08/11/2009 12:05:15 PM - CMDPHP: Poller[0] Host[17] DS[749] SNMP: v2: 192.168.2.102, dsname: cmpq_cpu5min, oid: .1.3.6.1.4.1.232.11.2.3.1.1.4.0, output: 97
08/11/2009 12:05:15 PM - CMDPHP: Poller[0] Host[17] DS[748] SNMP: v2: 192.168.2.102, dsname: cmpq_cpu30min, oid: .1.3.6.1.4.1.232.11.2.3.1.1.3.0, output: 100
08/11/2009 12:05:15 PM - CMDPHP: Poller[0] Host[17] DS[747] SNMP: v2: 192.168.2.102, dsname: cmpq_cpu1min, oid: .1.3.6.1.4.1.232.11.2.3.1.1.2.0, output: 100
Attachment: hpload.JPG (46.83 KiB)
- Cacti User
- Posts: 51
- Joined: Thu Aug 06, 2009 9:18 pm
Is this a single-core box? If it is multi-core, which core are you monitoring? Wouldn't a mean of all the cores be better? In a multi-core environment it is possible for one core to show high usage while the box remains stable. So when you say the core isn't really at 100%, how do you know for sure? Have you looked to see whether a process is causing this?
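To make the idea of a mean across cores concrete, here is a minimal Python sketch. It assumes poller log lines in the same format as the ones posted above, and it treats any value outside 0..100 (such as the -1 that snmpwalk returned) as invalid. The function name `mean_per_dsname` and the validity range are my own assumptions, not anything Cacti or the HP agents provide.

```python
import re
from collections import defaultdict

# Pulls the data source name and the returned value out of a poller log line.
LINE_RE = re.compile(r"dsname: (\w+), oid: [\d.]+, output: (-?\d+)")


def mean_per_dsname(log_lines):
    """Average each data source across cores, skipping values outside 0..100."""
    samples = defaultdict(list)
    for line in log_lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # not an SNMP result line
        dsname, value = match.group(1), int(match.group(2))
        if 0 <= value <= 100:  # assumption: anything else (e.g. -1) is bogus
            samples[dsname].append(value)
    return {name: sum(vals) / len(vals) for name, vals in samples.items()}


if __name__ == "__main__":
    log = [
        "SNMP: v2: 192.168.2.102, dsname: cmpq_cpu5min, "
        "oid: .1.3.6.1.4.1.232.11.2.3.1.1.4.1, output: 95",
        "SNMP: v2: 192.168.2.102, dsname: cmpq_cpu5min, "
        "oid: .1.3.6.1.4.1.232.11.2.3.1.1.4.0, output: 97",
    ]
    print(mean_per_dsname(log))
```

This only averages whatever the agent reports, so if every core returns ~100 the mean will be ~100 as well. It helps tell a single busy core apart from a reading that is wrong on all cores.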
- gandalf
- Developer
- Posts: 22383
- Joined: Thu Dec 02, 2004 2:46 am
- Location: Muenster, Germany
In general, the 2nd link in my sig would be a good start for debugging. But your second post already shows that the OID (which seemingly represents the CPU) returns values around 100. You find those values in the graphs. So Cacti is graphing what it is fetching.
But your question is more related to "why does this OID sometimes give values around 100 when the load seen by top is around 2". To be honest, I can't answer that. If it is a private MIB of the manufacturer, the question should be asked there.
Of course people in the forums may have seen this. But again, to me it does not seem to be a Cacti problem.
Reinhard