ucd/net - CPU Usage - User -> NaN!

mlnospam
Posts: 3
Joined: Mon Mar 27, 2006 1:58 pm

Post by mlnospam »

Hello,

I am trying to graph the CPU usage of various hosts on our network. Unfortunately, everything works fine except for the "User" value, which just returns NaN. So User doesn't get graphed, while System and Nice are OK. To be sure it's not a problem with the server itself, I tried this with other servers running other versions of net-snmp, but the problem is the same.

I am using the latest version of Cacti and RRDTool on Solaris 10 (SPARC), that's our monitoring host.

Does anyone have an idea what the problem is here? Any help would be greatly appreciated.

Many thanks in advance

Best regards

PS: I have attached a sample graph so that you can see what it looks like with the missing "user" value.
Attachments
cpu_usage.png (31.74 KiB)
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

Please check

Code: Select all

snmpwalk -c xxxx -v 1  <target host> .1.3.6.1.4.1.2021.11
UCD-SNMP-MIB::ssIndex.0 = INTEGER: 1
UCD-SNMP-MIB::ssErrorName.0 = STRING: systemStats
UCD-SNMP-MIB::ssSwapIn.0 = INTEGER: 0
UCD-SNMP-MIB::ssSwapOut.0 = INTEGER: 0
UCD-SNMP-MIB::ssIOSent.0 = INTEGER: 3
UCD-SNMP-MIB::ssIOReceive.0 = INTEGER: 5
UCD-SNMP-MIB::ssSysInterrupts.0 = INTEGER: 4
UCD-SNMP-MIB::ssSysContext.0 = INTEGER: 4
UCD-SNMP-MIB::ssCpuUser.0 = INTEGER: 5
UCD-SNMP-MIB::ssCpuSystem.0 = INTEGER: 2
UCD-SNMP-MIB::ssCpuIdle.0 = INTEGER: 91
UCD-SNMP-MIB::ssCpuRawUser.0 = Counter32: 49101958
UCD-SNMP-MIB::ssCpuRawNice.0 = Counter32: 0
UCD-SNMP-MIB::ssCpuRawSystem.0 = Counter32: 21404990
UCD-SNMP-MIB::ssCpuRawIdle.0 = Counter32: 749970670
UCD-SNMP-MIB::ssCpuRawWait.0 = Counter32: 55037420
UCD-SNMP-MIB::ssCpuRawKernel.0 = Counter32: 21220636
UCD-SNMP-MIB::ssCpuRawInterrupt.0 = Counter32: 184354
UCD-SNMP-MIB::ssIORawSent.0 = Counter32: 300669432
UCD-SNMP-MIB::ssIORawReceived.0 = Counter32: 2145935664
UCD-SNMP-MIB::ssRawInterrupts.0 = Counter32: 2697872243
UCD-SNMP-MIB::ssRawContexts.0 = Counter32: 1533510369
UCD-SNMP-MIB::ssCpuRawSoftIRQ.0 = Counter32: 0
UCD-SNMP-MIB::ssRawSwapIn.0 = Counter32: 28
UCD-SNMP-MIB::ssRawSwapOut.0 = Counter32: 52
to verify that this host responds to the wanted OID UCD-SNMP-MIB::ssCpuRawUser.0.
Reinhard
mlnospam
Posts: 3
Joined: Mon Mar 27, 2006 1:58 pm

Post by mlnospam »

Well, it does respond, yes. I think the problem may be that this server uses multiple Pentium IV Xeon CPUs, so the value may be too high because it reflects not just one CPU but two or more. What do you think?
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

You may be correct. Check the MAXIMUM value of the corresponding data source in the Data Template. If you change it, remember that existing rrd files must be adjusted with rrdtool tune as well.
Reinhard
NoogiE
Posts: 2
Joined: Mon Jun 12, 2006 5:43 pm

Post by NoogiE »

I've been experiencing these same problems on a dual Xeon machine.

Mlnospam - I think you are probably right in saying that a dual cpu machine can have a load up to 200% - which will break the graphing. Currently a server I manage is reporting 150% user cpu load.

To fix it, I changed the maximum value of the data template, as lvm suggested, and also ran rrdtool tune on the existing rrd file.

Graphs are all working again now and have been for the last few weeks. I'm not sure how hyperthreading could affect these values, but I have yet to see load above 200%.
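
For anyone wondering why values above 100 turn into NaN, here is a rough sketch of the arithmetic. It assumes the usual net-snmp behaviour, where ssCpuRawUser counts ticks of 1/100 s summed over all CPUs, and that rrdtool treats a COUNTER rate outside the min/max range as unknown. The function name counter_rate and the sample numbers are made up for illustration, and counter wrap is ignored.

```shell
# counter_rate OLD NEW SECONDS MAX
# Prints the per-second rate between two counter samples,
# or "U" (unknown) when the rate falls outside 0..MAX.
# Hypothetical helper; it only mimics the range check, not rrdtool itself.
counter_rate() {
    awk -v o="$1" -v n="$2" -v s="$3" -v m="$4" 'BEGIN {
        rate = (n - o) / s
        if (rate < 0 || rate > m) print "U"
        else printf "%.2f\n", rate
    }'
}

# Two samples 300 s apart, 45000 ticks of user time in between (150 ticks/s):
counter_rate 49101958 49146958 300 100   # prints U (clipped with max 100)
counter_rate 49101958 49146958 300 200   # prints 150.00 (fits with max 200)
```

On a dual-CPU box the rate can approach 200 ticks per second, which is why a maximum of 100 silently drops the busy intervals.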

Perhaps the associated data template should be adjusted in future cacti releases? (I don't see the problem with having a higher maximum cpu figure in the data template.)

Greg.
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

All of you may want to follow the discussion/FAQ on CPU reporting at net-snmp.org. But be prepared to get frustrated :cry:
Reinhard
foobie
Posts: 3
Joined: Tue Jun 27, 2006 12:24 pm

Post by foobie »

Thanks for this discussion - my cpu graphs were looking really odd before :-)

The commands I used, in case anyone is stuck:

Code: Select all

rrdtool tune HOST_cpu_nice_46.rrd --maximum cpu_nice:200
rrdtool tune HOST_cpu_system_47.rrd --maximum cpu_system:200
rrdtool tune HOST_cpu_user_48.rrd --maximum cpu_user:200
Obviously your HOST and the numbers will differ.
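
If there are many files to tune, a small dry-run helper along these lines might save some typing. It is only a sketch: print_tune_cmds is a made-up name, and it assumes the files follow the HOST_cpu_xxx_NN.rrd naming shown above, with the data source named cpu_xxx. It only prints the commands, it does not run them.

```shell
# print_tune_cmds MAX FILE...
# Prints one "rrdtool tune" command per rrd file (dry run only).
# Assumes names like HOST_cpu_user_48.rrd, where the data source is "cpu_user".
# Hypothetical helper; check the printed commands before running them.
print_tune_cmds() {
    max=$1
    shift
    for f in "$@"; do
        # Strip the host prefix and the trailing _NN.rrd to get the data source name.
        ds=$(basename "$f" | sed 's/^.*_\(cpu_[a-z]*\)_[0-9]*\.rrd$/\1/')
        printf 'rrdtool tune %s --maximum %s:%s\n' "$f" "$ds" "$max"
    done
}

print_tune_cmds 200 HOST_cpu_nice_46.rrd HOST_cpu_system_47.rrd HOST_cpu_user_48.rrd
```

If a file name does not match the assumed pattern, the data source name will come out wrong, so compare it against rrdtool info before applying anything.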
timvandijk
Posts: 3
Joined: Thu Jul 20, 2006 9:20 am

Post by timvandijk »

Hi,

I am experiencing the same problem with the raw CPU user graph. We have a system with two Xeon processors and Hyper-Threading disabled. Suddenly the user portion of the graph disappeared for no obvious reason. The maximum value in the data source was set to automatic. However, we did have a performance problem during the period when the user portion of the graph disappeared.

I'm absolutely sure that there must have been CPU user activity. Does anyone have an idea what causes this strange behaviour?

Kindest regards,

Tim
Attachments
Example of Cacti graph with missing cpu user graph
cacti_no_cpu_user.png (8.66 KiB)
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

I'm quite sure this is due to clipping of high user CPU values. Run rrdtool info on that rrd file and you should see something similar to:
rrdtool info gandalf_cpu_user_10.rrd |more
filename = "gandalf_cpu_user_10.rrd"
rrd_version = "0001"
step = 300
last_update = 1153321503
ds[cpu_user].type = "COUNTER"
ds[cpu_user].minimal_heartbeat = 600
ds[cpu_user].min = 0.0000000000e+00
ds[cpu_user].max = 1.0000000000e+02
ds[cpu_user].last_ds = "268239"
ds[cpu_user].value = 4.1720000000e+01
ds[cpu_user].unknown_sec = 0
Here the max value is set to 100 (the default). This may clip data on multi-CPU systems. Change it using rrdtool tune.
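
To pull just the maximum out of that output, something like the following should work. It is a sketch only: ds_max is a made-up helper name, and it assumes the "ds[name].max = value" line format shown in the output above.

```shell
# ds_max DS_NAME
# Reads "rrdtool info" output on stdin and prints the raw max value
# of the given data source, exactly as rrdtool reports it.
# Hypothetical helper; assumes lines of the form "ds[NAME].max = VALUE".
ds_max() {
    awk -v key="ds[$1].max" -F' = ' '$1 == key { print $2 }'
}

# Typical use (file name taken from the example above):
#   rrdtool info gandalf_cpu_user_10.rrd | ds_max cpu_user
```

The value is printed unchanged (for example in exponent notation), so an unexpected result such as NaN shows up as it is.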
Reinhard
timvandijk
Posts: 3
Joined: Thu Jul 20, 2006 9:20 am

Post by timvandijk »

Hi Ivm,

The values for cpu_user (and of course cpu_nice and cpu_system) were indeed set to a maximum of 100%:

ds[cpu_user].type = "COUNTER"
ds[cpu_user].minimal_heartbeat = 600
ds[cpu_user].min = 0.0000000000e+00
ds[cpu_user].max = 1.0000000000e+02
ds[cpu_user].last_ds = "30306547"
ds[cpu_user].value = 4.9091000000e+02
ds[cpu_user].unknown_sec = 0

I've changed this value using rrdtool tune. Fortunately the performance problems have been solved so I don't expect to see any abnormalities in the near future. I'll just have to wait and see.

Thanks for your input.

Kindest regards,

Tim
timvandijk
Posts: 3
Joined: Thu Jul 20, 2006 9:20 am

Post by timvandijk »

Hi Ivm,

Earlier than I expected, I can report that the results are great. I can now see the "true" performance graphs. On the other hand, I'll need to do a thorough performance analysis because I suspect a structural performance problem.

Thanks again!

Tim
pheezy
Cacti User
Posts: 61
Joined: Thu Oct 26, 2006 5:30 pm

Post by pheezy »

I'm having the same problem. However, after I tuned the rrd, the following is now reported, and even setting it back to 100 gives the same result. :(
ds[cpu_user].max = NaN
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

Please post the exact command you ran to set it back to 100, along with the resulting rrdtool info output.
Reinhard