What confuses me are the high values in the result. But the other Linux boxes seem to deliver them as well, and they show the correct value in Cacti...
Wow, that's nice. Never saw this before.
Please first use snmpwalk to walk against those OIDs for CPU metrics (find the correct OIDs from Settings -> View Poller Cache -> Filter for the host).
Please be aware that net-snmp (at least) has some trouble with CPU data. The latest net-snmp (5.4) changed the CPU measures.
Reinhard
gandalf wrote:Wow, that's nice. Never saw this before.
Please first use snmpwalk to walk against those OIDs for CPU metrics (find the correct OIDs from Settings -> View Poller Cache -> Filter for the host).
Ehm, how would that differ from the snmpwalk above? Also, I still have an older version of Cacti running (another topic, can't upgrade) where I don't see these log entries. What exactly are you looking for?
Please be aware that net-snmp (at least) has some trouble with CPU data. The latest net-snmp (5.4) changed the CPU measures.
What could this mean for my issue? I'm using net-snmp-5.3.1-24.el5_2.2 from CentOS 5.
I also made some modifications to my ucd/net CPU template to monitor IO stats and, like the screenshot above, my graph shows a total value of much more than 100%. So I guess the vertical label "Percent" is wrong in the original graph too.
But what am I actually looking at here?
In my understanding, 100% should be the total CPU bandwidth, which is split into
user/nice/system/iowait/hard-irq/soft-irq/stolen/idle.
The original graph template and data sources don't have any percent calculation; they just read and show the following SNMP values (counters):
ssCpuRawUser.0
ssCpuRawNice.0
ssCpuRawSystem.0
I added the following as a stack to the graph:
ssCpuRawWait.0
and see what I got. How can I now get the "real percentage", which is naturally limited to a maximum of 100%? I looked at the multi-CPU templates as well and there is no percentage calculation in there either.
I'd like to graph values that can be compared to the %-values I get when I run the top-command.
thnx
peter
PS: My graph shows a lot of I/O wait because I copied from a virtual disk to an NFS share, both on the same hard disk.
OK, I solved my question. It shows more than 100% because I've got a quad-CPU machine and the raw values are simply added up, so it reaches 400%. And the graph template doesn't need a percent calculation, because the counters are based on seconds and so is the Cacti poller, so things turn out fine.
To the OP: it seems you have a quad-CPU system as well, and after the reboot you seem to have only a dual-CPU system. Maybe there are some wrong calculations in your template, or maybe something changed in your system when you rebooted. New kernel, new VM...?
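For anyone who wants numbers comparable to top, here is a minimal sketch of one way to turn the stacked per-second rates into percentages that add up to 100% regardless of the number of CPUs. The function name and the sample rates are made up for illustration, and it assumes every CPU state (including idle) is being collected; if a state is missing, the shares will be overstated.

```python
def to_percent(rates):
    """Convert per-second tick rates per CPU state into percentages of the total."""
    total = sum(rates.values())
    if total == 0:
        # No ticks observed in the interval; avoid dividing by zero.
        return {name: 0.0 for name in rates}
    return {name: 100.0 * value / total for name, value in rates.items()}


# Hypothetical rates from a quad-CPU box (they add up to about 400).
rates = {"user": 120.0, "nice": 0.0, "system": 40.0, "wait": 80.0, "idle": 160.0}
percent = to_percent(rates)
print(percent["user"], percent["wait"], percent["idle"])  # 30.0 20.0 40.0
```

Dividing by the sum of all states, rather than by a fixed CPU count, means the result still adds up to 100% if the number of CPUs changes.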
gruad23 wrote:
To the OP: it seems you have a quad-CPU system as well, and after the reboot you seem to have only a dual-CPU system. Maybe there are some wrong calculations in your template, or maybe something changed in your system when you rebooted. New kernel, new VM...?
I don't know what happened here. The monitored machine above is a physical one and has been rebooted. As far as I can see no change in kernel appeared (yum is started automatically).
My template shouldn't be wrong because it monitors all other machines (even ones with multiple CPUs) correctly. And when the issue appeared, Cacti had been neither touched nor rebooted.
So it's really confusing. I tried to figure out what's going wrong, but snmpwalk shows raw values, and I don't know how they are converted in Cacti internally.
knebb wrote:
I don't know what happened here. The monitored machine above is a physical one and has been rebooted. As far as I can see no change in kernel appeared (yum is started automatically).
...
So it's really confusing. I tried to figure out what's going wrong, but snmpwalk shows raw values, and I don't know how they are converted in Cacti internally.
First I would check if your system really still sees 4 CPUs by checking /proc/cpuinfo. Maybe two cores got disabled in the BIOS or whatever. (My Dell PowerEdge can do that.)
As for snmpwalk and Cacti: Cacti reads the counters every 5 minutes.
The types are Counter32, so Cacti calculates the difference from the previous value (5 minutes earlier) and divides it by 300 (5 minutes = 300 seconds) to get the per-second value, which it displays. No further calculation should be done.
The calculation would be wrong if you messed with settings in Cacti like the heartbeat value of your data source, but I doubt you did this ....
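The calculation described above can be sketched roughly like this. It is only an illustration of the arithmetic, not Cacti's actual code; the function name and readings are made up, and it assumes the counter wraps at most once between two polls.

```python
COUNTER32_MODULUS = 2 ** 32  # a Counter32 wraps back to 0 after 2^32 - 1


def counter_rate(previous, current, interval_seconds=300):
    """Per-second rate between two Counter32 readings taken one poll apart."""
    delta = current - previous
    if delta < 0:
        # The counter wrapped around between the two polls.
        delta += COUNTER32_MODULUS
    return delta / interval_seconds


# Hypothetical readings taken 5 minutes apart.
print(counter_rate(1_000_000, 1_030_000))  # 100.0
print(counter_rate(2 ** 32 - 100, 200))    # 1.0 (wrapped counter)
```

According to the UCD-SNMP-MIB descriptions, the ssCpuRaw* counters count ticks (typically 1/100 s), so a rate of about 100 per second corresponds to one fully accounted CPU.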
knebb wrote:
First I would check if your system really still sees 4 cpus by checking /proc/cpuinfo.
This was the first thing I checked. And yes, it still reports four CPUs.
The calculation would be wrong if you messed with settings in Cacti like the heartbeat value of your data source, but I doubt you did this ....
You're right, I didn't change anything. And all other SMP boxes still report the right values according to their number of CPUs.
Meanwhile I deleted the whole CPU graph and rebuilt it from scratch. Still the same result: only approx. 200 is shown. Interestingly enough, my CDEF function used to calculate almost exactly 400 as the sum. Now it is always just above 200 (ca. 215).
gandalf wrote:Wow, that's nice. Never saw this before.
Me neither
Please be aware that net-snmp (at least) has some trouble with CPU data. The latest net-snmp (5.4) changed the CPU measures.
Funny enough that it changes from time to time. I checked my yum.log: the latest snmp update was installed approx. a month before the issue appeared. And there have been reboots after that, so why does it change suddenly?
Anyway: I checked some output from snmpwalk and it looks like you're on the right track. When I calculate all the differences over the 5-minute period and divide them by the 300 seconds, I get a sum of ~214, which seems to match the graph value.
So it looks like Cacti is doing fine, but the net-snmp stuff is bogus.
As you recommend I'll see if I can upgrade to 5.4 through rpmforge. Additional question: what can cause this issue?
gandalf wrote:Wow, that's nice. Never saw this before.
Please be aware that net-snmp (at least) has some trouble with CPU data. The latest net-snmp (5.4) changed the CPU measures.
Reinhard
I upgraded to net-snmp-5.4.x.svn200812050230-1.1.i386.rpm.
Talking to myself now, hoping someone can help me.
As already stated, I installed net-snmp-5.4.x but the issue remains.
Now I performed an snmpwalk every five minutes and checked the output.
So I calculated the difference between each pair of values and added them up. The sum was then divided by 300 (secs). The result always stays at approx. 210. Sometimes it may be 215 or so. But I'd expect the sum to be 400.
So again, it doesn't seem to be Cacti. Is there another place where I can ask about the net-snmp issue?
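The manual check described above could be scripted along these lines. This is only a sketch: the snmpwalk lines are invented sample data in the usual "UCD-SNMP-MIB::name.0 = Counter32: value" format, and the function names are hypothetical.

```python
import re

# Matches lines such as "UCD-SNMP-MIB::ssCpuRawUser.0 = Counter32: 1000".
LINE_PATTERN = re.compile(r"::(ssCpuRaw\w+)\.0 = Counter32: (\d+)")


def parse_walk(text):
    """Return {counter name: value} for every ssCpuRaw* line in snmpwalk output."""
    return {m.group(1): int(m.group(2)) for m in LINE_PATTERN.finditer(text)}


def summed_rate(before, after, interval_seconds=300):
    """Sum the per-counter deltas (allowing one Counter32 wrap) and divide by the interval."""
    names = before.keys() & after.keys()
    total = sum((after[name] - before[name]) % 2 ** 32 for name in names)
    return total / interval_seconds


# Invented sample output from two walks taken 5 minutes apart.
walk_before = """UCD-SNMP-MIB::ssCpuRawUser.0 = Counter32: 1000
UCD-SNMP-MIB::ssCpuRawNice.0 = Counter32: 50
UCD-SNMP-MIB::ssCpuRawSystem.0 = Counter32: 500
UCD-SNMP-MIB::ssCpuRawIdle.0 = Counter32: 90000
"""
walk_after = """UCD-SNMP-MIB::ssCpuRawUser.0 = Counter32: 16000
UCD-SNMP-MIB::ssCpuRawNice.0 = Counter32: 50
UCD-SNMP-MIB::ssCpuRawSystem.0 = Counter32: 6500
UCD-SNMP-MIB::ssCpuRawIdle.0 = Counter32: 189000
"""
print(summed_rate(parse_walk(walk_before), parse_walk(walk_after)))  # 400.0
```

If the sum comes out lower than expected, it may be worth checking that every ssCpuRaw* state the agent reports (idle, wait and so on) is included in the walk, since any state left out lowers the total.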