This is the Next Generation Linux CPU Graph:
(image: CPU graph)
As a bonus you get this I/O graph as well:
(image: I/O graph)
Compare this to the original:
(image: standard Cacti CPU graph)
Data query: (install in resource/snmp_queries)
Templates: (import in Cacti)
The host above is a 4-core server performing a 16-thread mysqldump import.
Q: How did you get the core estimation as well as the graph levelled at 100%?
A: Some CDEF magic... I spent the major part of a good night getting this to work and I almost lost my sanity in the process but I made it... I was very determined and very sick of the standard Cacti graph over the past few years.
Q: Seriously, no need to specify or choose between different graph templates depending on how many cores a server has?
A: NO!!!! It's almost like magic! But if you try to figure the CDEF out you might go bananas.
No documentation, no nothing... at least not yet. But I sure know I'm a lot happier with these than with the standard Cacti CPU graph, which hasn't received much love for the past few years.
INSTALLATION
1. Copy systemstats.xml* to resource/snmp_queries/systemstats.xml
2. Import cacti_data_query_fridh_-_ucdnet_-_systemstats.xml*
3. Assign data query to a host or a host template.
4. Create new graphs.
(All needed files are attached to this post).
GOALS
1. The templates should work on all hosts, no matter how many CPUs they have.
   To get decent graphs with the existing templates, you had to create either individual templates that made it clear how many CPUs each machine had, or individual graph templates for 2-, 4-, 8- and 16-core machines with CDEFs dividing all values by the number of CPUs.
2. The graph should peak at 100% no matter how many logical CPUs there are.
   Example: Previous graphs peak at 800% for an 8-core server, and you would not know whether 100% on a graph is a 1-core server at its peak or an 8-core server at a mere 12.5% CPU usage.
3. Present as much information as possible in the graph, without cluttering it.
   This means including User, System, Wait etc.
4. The number of logical CPUs should be deducible from viewing the graph.
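The arithmetic behind the 100% ceiling and the CPU-count estimate can be sketched roughly as follows. This is only an illustration, not the CDEF from the templates. It assumes that the ssCpuRaw* counters are summed over all CPUs and tick about 100 times per second per CPU, which may not hold on every system. The function name `normalize_cpu` is made up for this example.

```python
def normalize_cpu(rates):
    """Turn per-second tick rates into (usage percent, estimated CPU count).

    `rates` maps counter names to ticks per second, summed over all CPUs.
    Assumption: each CPU contributes about 100 ticks per second.
    """
    total = sum(rates.values())
    if total <= 0:
        raise ValueError("no ticks recorded")
    busy = total - rates["ssCpuRawIdle"]   # everything except idle time
    usage_percent = 100.0 * busy / total   # always between 0 and 100
    estimated_cpus = total / 100.0         # relies on the 100 ticks/s assumption
    return usage_percent, estimated_cpus


# An 8-core server with one core's worth of work:
# the old style of graph would show 100%, this shows 12.5%.
print(normalize_cpu({"ssCpuRawUser": 100.0, "ssCpuRawIdle": 700.0}))  # (12.5, 8.0)
```

Dividing by the total instead of a fixed core count is what makes one template fit every host.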
So far, I think I have met all these goals, and I hope this post can eventually be updated with a final working version. Right now, though, creating graphs does not behave the same way across different Cacti installations. The workaround: if you have issues, fix the CDEF once, and then all graphs in that particular Cacti installation will work.
BUGS
Basically, there seems to be a functional difference between the few Cacti installations I've tried.
On Cacti installation 1 the graphs ended up with 7 DEFs. (only the AVERAGE one)
On Cacti installation 2 the graphs ended up with 14 DEFs. (AVERAGE, MAX)
On Cacti installation 3 the graphs ended up with 28 DEFs. (AVERAGE, LAST, MIN, MAX)
What this means is that the CDEF will fail to work properly, and you have to adjust the graph template before adding it to hosts.
THE PROBLEM IN MORE DETAIL:
I have a CDEF that gives me the total CPU Usage as follows:
cdef=a,b,c,d,e,f,+,+,+,+,+,ALL_DATA_SOURCES_NODUPS,100,/,/
Obviously this relies on DEF a, b, c, d, e, f being the correct ones in the EXACT correct place, which they're not in these cases...
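For readers not used to RPN, here is a small sketch that evaluates the CDEF string above on made-up numbers. It only handles `+` and `/`, and it assumes that ALL_DATA_SOURCES_NODUPS stands for the sum of all data sources on the graph (a through g here). The function `eval_rpn` and the sample values are invented for illustration; rrdtool itself does the real evaluation.

```python
def eval_rpn(expr, values):
    """Evaluate a comma-separated RPN expression that uses only + and /."""
    stack = []
    for token in expr.split(","):
        if token in ("+", "/"):
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if token == "+" else left / right)
        elif token in values:
            stack.append(float(values[token]))   # a named DEF or special source
        else:
            stack.append(float(token))           # a numeric constant such as 100
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


# Made-up tick rates for a 4-CPU host: a..f are the busy counters, g is idle.
values = {"a": 40, "b": 120, "c": 0, "d": 5, "e": 30, "f": 5, "g": 200}
values["ALL_DATA_SOURCES_NODUPS"] = sum(values.values())  # assumed meaning: a+...+g

cdef = "a,b,c,d,e,f,+,+,+,+,+,ALL_DATA_SOURCES_NODUPS,100,/,/"
print(eval_rpn(cdef, values))  # 50.0
```

If b were a MAX DEF instead of the AVERAGE of the next counter, the sum on the left would be wrong, which is exactly the failure described here.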
On the original Cacti 0.8.7e installation, from which I exported the templates, graph debug shows the following DEFs:
DEF:a="/usr/local/share/cacti-0.8.7e/rra/web22_sscpurawsystem_260.rrd":ssCpuRawSystem:AVERAGE \
DEF:b="/usr/local/share/cacti-0.8.7e/rra/web22_sscpurawsystem_260.rrd":ssCpuRawUser:AVERAGE \
DEF:c="/usr/local/share/cacti-0.8.7e/rra/web22_sscpurawsystem_260.rrd":ssCpuRawNice:AVERAGE \
DEF:d="/usr/local/share/cacti-0.8.7e/rra/web22_sscpurawsystem_260.rrd":ssCpuRawInterrupt:AVERAGE \
DEF:e="/usr/local/share/cacti-0.8.7e/rra/web22_sscpurawsystem_260.rrd":ssCpuRawWait:AVERAGE \
DEF:f="/usr/local/share/cacti-0.8.7e/rra/web22_sscpurawsystem_260.rrd":ssCpuRawSoftIRQ:AVERAGE \
DEF:g="/usr/local/share/cacti-0.8.7e/rra/web22_sscpurawsystem_260.rrd":ssCpuRawIdle:AVERAGE \
On Cacti installation 2 (14 DEFs: AVERAGE, MAX), graph debug shows:
DEF:a="/var/lib/cacti/rra/snmplocal_sscpurawnice_145.rrd":ssCpuRawSystem:AVERAGE \
DEF:b="/var/lib/cacti/rra/snmplocal_sscpurawnice_145.rrd":ssCpuRawSystem:MAX \
DEF:c="/var/lib/cacti/rra/snmplocal_sscpurawnice_145.rrd":ssCpuRawUser:AVERAGE \
DEF:d="/var/lib/cacti/rra/snmplocal_sscpurawnice_145.rrd":ssCpuRawUser:MAX \
DEF:e="/var/lib/cacti/rra/snmplocal_sscpurawnice_145.rrd":ssCpuRawNice:AVERAGE \
DEF:f="/var/lib/cacti/rra/snmplocal_sscpurawnice_145.rrd":ssCpuRawNice:MAX \
DEF:g="/var/lib/cacti/rra/snmplocal_sscpurawnice_145.rrd":ssCpuRawInterrupt:AVERAGE \
DEF:h="/var/lib/cacti/rra/snmplocal_sscpurawnice_145.rrd":ssCpuRawInterrupt:MAX \
DEF:i="/var/lib/cacti/rra/snmplocal_sscpurawnice_145.rrd":ssCpuRawWait:AVERAGE \
DEF:j="/var/lib/cacti/rra/snmplocal_sscpurawnice_145.rrd":ssCpuRawWait:MAX \
DEF:ba="/var/lib/cacti/rra/snmplocal_sscpurawnice_145.rrd":ssCpuRawSoftIRQ:AVERAGE \
DEF:bb="/var/lib/cacti/rra/snmplocal_sscpurawnice_145.rrd":ssCpuRawSoftIRQ:MAX \
DEF:bc="/var/lib/cacti/rra/snmplocal_sscpurawnice_145.rrd":ssCpuRawIdle:AVERAGE \
DEF:bd="/var/lib/cacti/rra/snmplocal_sscpurawnice_145.rrd":ssCpuRawIdle:MAX \
With these DEFs, the CDEF has to pick only the AVERAGE ones:
cdef=a,c,e,g,i,ba,+,+,+,+,+,ALL_DATA_SOURCES_NODUPS,100,/,/
On Cacti installation 3 (28 DEFs: AVERAGE, LAST, MIN, MAX), graph debug shows:
DEF:a="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawSystem:AVERAGE \
DEF:b="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawSystem:LAST \
DEF:c="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawSystem:MIN \
DEF:d="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawSystem:MAX \
DEF:e="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawUser:AVERAGE \
DEF:f="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawUser:LAST \
DEF:g="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawUser:MIN \
DEF:h="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawUser:MAX \
DEF:i="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawNice:AVERAGE \
DEF:j="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawNice:LAST \
DEF:ba="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawNice:MIN \
DEF:bb="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawNice:MAX \
DEF:bc="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawInterrupt:AVERAGE \
DEF:bd="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawInterrupt:LAST \
DEF:be="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawInterrupt:MIN \
DEF:bf="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawInterrupt:MAX \
DEF:bg="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawWait:AVERAGE \
DEF:bh="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawWait:LAST \
DEF:bi="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawWait:MIN \
DEF:bj="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawWait:MAX \
DEF:ca="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawSoftIRQ:AVERAGE \
DEF:cb="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawSoftIRQ:LAST \
DEF:cc="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawSoftIRQ:MIN \
DEF:cd="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawSoftIRQ:MAX \
DEF:ce="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawIdle:AVERAGE \
DEF:cf="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawIdle:LAST \
DEF:cg="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawIdle:MIN \
DEF:ch="/opt/statistics/rra/labdb01_sscpurawsystem_229.rrd":ssCpuRawIdle:MAX \
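Judging only from the debug output above, the DEF names look like the DEF's position written in decimal with the digits replaced by letters (0 is a, 1 is b, ... 9 is j), so the 11th DEF is "ba" and the 28th is "ch". Assuming that pattern holds, which I have not checked against the Cacti source, the adjusted CDEF can be worked out instead of counted by hand. The helper names `def_name` and `total_usage_cdef` are invented for this sketch.

```python
import string

# Data source order as it appears in the debug output above.
DATA_SOURCES = [
    "ssCpuRawSystem", "ssCpuRawUser", "ssCpuRawNice", "ssCpuRawInterrupt",
    "ssCpuRawWait", "ssCpuRawSoftIRQ", "ssCpuRawIdle",
]


def def_name(index):
    """Map a zero-based DEF position to its name: 0 -> a, 10 -> ba, 27 -> ch."""
    return "".join(string.ascii_lowercase[int(digit)] for digit in str(index))


def total_usage_cdef(consolidation_functions):
    """Build the total-usage CDEF for the consolidation functions in use.

    Assumes the DEFs are emitted per data source, one per consolidation
    function, in the order given (as in the listings above).
    """
    step = len(consolidation_functions)
    offset = consolidation_functions.index("AVERAGE")
    busy = DATA_SOURCES[:-1]  # everything except ssCpuRawIdle
    names = [def_name(i * step + offset) for i in range(len(busy))]
    plus = ["+"] * (len(names) - 1)
    tail = ["ALL_DATA_SOURCES_NODUPS", "100", "/", "/"]
    return "cdef=" + ",".join(names + plus + tail)


print(total_usage_cdef(["AVERAGE", "LAST", "MIN", "MAX"]))
```

For a 14 DEF installation, `total_usage_cdef(["AVERAGE", "MAX"])` gives the a,c,e,g,i,ba variant.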
Maybe this issue does not occur for you at all.
I'm happy for any feedback on this issue or general feedback on whether or not the graphs work for you.