I've fixed up the perl script to work with clusters. I'm also making some changes to the graphs. I don't know how much effort I'm going to invest in it though. When I'm done, I'll post the results.
@um3n:
I understand what gheppner is doing, but I had to give it some thought. It's all about how you choose to represent the data. Neither approach is technically incorrect; you just need to understand what you're looking at.

Here is some documentation coming back directly from the perf-object-counter-list-info API on the filer:

'content' => 'Average latency in microseconds for the WAFL filesystem to process read request to the volume; not including request processing or network communication time'
'name' => 'properties' 'content' => 'average',
'name' => 'unit' 'content' => 'microsec',
It's somewhat complicated to understand, but the value coming back from avg_latency, read_latency and write_latency is a COUNTER of the number of microseconds of latency that has accumulated since some arbitrary point in the past (system reboot perhaps, or counter roll-over). This is quite typical of how most storage systems report latency. To see this in real time, run a command like this:
Code: Select all
watch -n 1 ./netapp-ontapsdk-perf.pl FILER USER PASSWORD volume get write_latency VOLUME
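In case it helps to see the math, here's a minimal Perl sketch of the same idea: take two samples of the raw counter and difference them to get microseconds of latency accrued per second. The get_counter() helper here is hypothetical (it just shells out to the script above); adjust it to however you actually read the value.

Code: Select all
# Minimal sketch, not part of netapp-ontapsdk-perf.pl: difference two samples
# of the cumulative counter to get "microseconds of latency accrued per second".
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper: shells out to the script and grabs the last number it
# prints.  Adjust the parsing to whatever the script's output really looks like.
sub get_counter {
    my ($counter) = @_;
    my $out = `./netapp-ontapsdk-perf.pl FILER USER PASSWORD volume get $counter VOLUME`;
    my ($val) = $out =~ /(\d+)\s*$/;
    return $val;
}

my ($t1, $c1) = (time(), get_counter('write_latency'));
sleep(10);
my ($t2, $c2) = (time(), get_counter('write_latency'));

# The counter only ever increases, so the delta is the microseconds of write
# latency accumulated during the sample window.
my $usec_per_sec = ($c2 - $c1) / ($t2 - $t1);
printf "write latency accrued: %.0f usec per second\n", $usec_per_sec;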
What gheppner did was simply modify this to produce a "microseconds of latency per second per operation" value and graph it. This may be more indicative of what the filer reports through CLI commands ... but I wouldn't know ... I don't have access to the filer's CLI.
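For what it's worth, here's my guess at the kind of delta-over-delta calculation that produces a per-operation figure. This is not gheppner's actual code: I'm assuming write_ops is the ops counter that pairs with write_latency, and it reuses the same hypothetical get_counter() helper as the previous snippet.

Code: Select all
# Sketch only -- my reading of the per-operation derivation, not gheppner's code.
use strict;
use warnings;
use Time::HiRes qw(sleep);

# Same hypothetical helper as in the previous snippet, repeated so this one
# stands alone; adjust the parsing to the script's real output.
sub get_counter {
    my ($counter) = @_;
    my $out = `./netapp-ontapsdk-perf.pl FILER USER PASSWORD volume get $counter VOLUME`;
    my ($val) = $out =~ /(\d+)\s*$/;
    return $val;
}

my ($lat1, $ops1) = (get_counter('write_latency'), get_counter('write_ops'));
sleep(10);
my ($lat2, $ops2) = (get_counter('write_latency'), get_counter('write_ops'));

my $d_ops = $ops2 - $ops1;
# Latency accumulated divided by operations completed over the same window
# gives an average number of microseconds per write operation.
my $avg = $d_ops > 0 ? ($lat2 - $lat1) / $d_ops : 0;
printf "avg write latency: %.1f usec/op\n", $avg;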
In your particular case, what's really screwing with the graph is the other_latency. It's certainly larger than the read/write latency, and I haven't investigated what other_latency really is. I'm working with a brand-new storage system that has no production traffic on it, so it's also difficult to compare your numbers to what I'm capturing now. Looking back into ancient history at our older 3040C data, while those systems were in production as storage back-ends for a large mail system, I would typically see other_latency: 300, write_latency: 170, read_latency: 50. It's entirely possible that you really do have an application / storage system with a large amount of latency. But overall, you're still going to see trends in latency: it will go up and down, and when it goes up you should take notice.