Best way to deal with NAN values

gadams666
Posts: 2
Joined: Thu Apr 25, 2002 11:43 am

Best way to deal with NAN values

Post by gadams666 »

Hi there,

First off, Cacti is an *excellent* tool! Easy setup, use, and after figuring out a stupid error in my crontab, I'm graphing status on my management system.

I did a quick search and didn't see any responses to a problem I'm having.

I'm measuring the response time on a service (monitored by NetSaint/Nagios) that *normally* returns a parsable time value, unless there was a valid response in a SQL database (every click costs me money). So from a Cacti perspective, my data input script parses the following from the NetSaint log file (one line is returned per 5-minute query):

Trans - RTA 1.004
Trans - RTA 1.2343
Transaction in last 15 minutes
Trans - RTA 2.932

What I would like to do is obviously enter in the data for 1st, 2nd, and 4th queries. However, for the 3rd one, I'd like to either interpolate the 2nd value (1.2343) or just NAN / U the value in the .rrd file. For graphing I'm not worried if there are gaps for the non-response time values, but if I do any overall averages, I want to make sure they are not included in the calculations.

Any ideas on how to tackle this problem?
gwynnebaer
Posts: 35
Joined: Sat Apr 13, 2002 5:16 pm
Location: Santa Barbara, CA

Re: Best way to deal with NAN values

Post by gwynnebaer »

gadams666 wrote: I'm measuring the response time on a service (monitored by NetSaint/Nagios) that *normally* returns a parsable time value, unless there was a valid response in a SQL database (every click costs me money). So from a Cacti perspective, my data input script parses the following from the NetSaint log file (one line is returned per 5-minute query):

Trans - RTA 1.004
Trans - RTA 1.2343
Transaction in last 15 minutes
Trans - RTA 2.932

What I would like to do is obviously enter in the data for 1st, 2nd, and 4th queries. However, for the 3rd one, I'd like to either interpolate the 2nd value (1.2343) or just NAN / U the value in the .rrd file. For graphing I'm not worried if there are gaps for the non-response time values, but if I do any overall averages, I want to make sure they are not included in the calculations.
I'm not familiar with the data you are importing, but can I assume that what you enter at an update is: 1.004, or is it the whole string "Trans - RTA 1.004"? If you have to parse that down to a number, then alter your parser to return either the valid number, or a number that is outside the range of your input. For example, if the largest valid RTA you might ever get is "10.000", then set the return for "Transaction in last 15 minutes" to 20.000 and set the maximum valid data field to 19 or something less than 20. Does that make sense?
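
As a rough sketch of that idea: the function name `parse_rta` and the 20.000 sentinel below are placeholders picked for illustration, and the code assumes the log lines look exactly like the samples quoted above. Adjust the sentinel so it sits above whatever maximum you configure for the data source.

```shell
#!/bin/sh
# Hypothetical helper: print the RTA from one log line, or a sentinel
# above the data source maximum when the line carries no timing.
SENTINEL=20.000   # assumed value; must exceed the configured maximum

parse_rta() {
    case "$1" in
        # Strip the fixed "Trans - RTA " prefix and keep the number.
        "Trans - RTA "*) printf '%s\n' "${1#Trans - RTA }" ;;
        # Anything else (e.g. "Transaction in last 15 minutes") gets the sentinel.
        *)               printf '%s\n' "$SENTINEL" ;;
    esac
}

parse_rta "Trans - RTA 1.2343"
parse_rta "Transaction in last 15 minutes"
```

The first call prints the parsed number, and the second prints the sentinel, which should then be rejected as out of range instead of being averaged in.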

If all you get from NetSaint is a number, then you might wrap this in a script that does this.

If you are really entering "Trans - RTA 1.004" as your data input, I'm not entirely sure how rrdtool interprets that.

There may certainly be a better way to do this. Any thoughts from the gallery?

-gwynnebaer
gadams666
Posts: 2
Joined: Thu Apr 25, 2002 11:43 am

Re: Best way to deal with NAN values

Post by gadams666 »

gwynnebaer wrote: I'm not familiar with the data you are importing, but can I assume that what you enter at an update is: 1.004, or is it the whole string "Trans - RTA 1.004"? If you have to parse that down to a number, then alter your parser to return either the valid number, or a number that is outside the range of your input. For example, if the largest valid RTA you might ever get is "10.000", then set the return for "Transaction in last 15 minutes" to 20.000 and set the maximum valid data field to 19 or something less than 20. Does that make sense?

If all you get from NetSaint is a number, then you might wrap this in a script that does this.

If you are really entering "Trans - RTA 1.004" as your data input, I'm not entirely sure how rrdtool interprets that.

There may certainly be a better way to do this. Any thoughts from the gallery?

-gwynnebaer
Ah, should have been more clear on that. My script is parsing the values (i.e., I strip out the 1.004) on those entries that return values. Most of the time I'll get back some value that I can parse out. However, in some cases I'll see that a transaction exists in a database, in which case I don't have to make a call myself ($$$). These are the times I want to either enter an UNK value or use rrdtool to grab the last value inserted into the RRD.

In reading the Cacti docs, it appears I *must* provide a value within the min/max defined on the collection page. This is what I'm trying to resolve.

I can do an 'rrdtool fetch' on the AVERAGE cf, but is there a way to read the last non-UNK value? That would be a way for me to deal with a missing value when parsing the NetSaint entries.

Thanks for the response, hopefully this clears things up a little!
gwynnebaer
Posts: 35
Joined: Sat Apr 13, 2002 5:16 pm
Location: Santa Barbara, CA

Post by gwynnebaer »

I see what you mean. Here's what you can do:

Code:

rrdtool fetch your_data_file.rrd AVERAGE -s -300
This will return something like this:

Code:

1019778300:  1.0019047619e+00
1019778600:  9.5695555556e-01
1019778900:  NaN
You could then parse this by taking the last line that doesn't say NaN:

Code:

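#!/bin/sh

# Keep the value from the newest data row that is not NaN; header and blank lines are skipped.
GOODVAL=$(rrdtool fetch your_data_file.rrd AVERAGE -s -300 |
    awk '$1 ~ /^[0-9]+:$/ && tolower($2) !~ /nan/ { v = $2 } END { print v }')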

Then feed that value back into rrdtool update, and the missing sample is filled with a duplicate of the last good reading.

You may need to widen the window beyond "-300" (that is, look further back in time) if you are worried that every value in the last 300 seconds will be NaN.
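
If you want to cover the all-NaN case explicitly, here is a rough sketch. The function name `last_good_or_unknown` is made up for illustration, and the sample rows are copied from the fetch output above. `U` is what rrdtool update itself uses for an unknown value; whether Cacti passes a `U` from a data input script through untouched is an assumption you should check against your version.

```shell
#!/bin/sh
# Hypothetical helper: read `rrdtool fetch` output on stdin and print the
# value from the newest row that is not NaN, or "U" if no such row exists.
last_good_or_unknown() {
    awk '$1 ~ /^[0-9]+:$/ && tolower($2) !~ /nan/ { v = $2 }
         END { print (v == "" ? "U" : v) }'
}

# Window with usable data: prints the newest non-NaN value.
printf '%s\n' \
    '1019778300:  1.0019047619e+00' \
    '1019778600:  9.5695555556e-01' \
    '1019778900:  NaN' | last_good_or_unknown

# Window with nothing usable: prints U.
printf '%s\n' '1019778900:  NaN' | last_good_or_unknown
```

In real use you would pipe `rrdtool fetch your_data_file.rrd AVERAGE -s -300` into the function instead of the sample rows.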

See the manpage for rrdtool fetch for more information.

-gwynnebaer