Hello
I am collecting statistical information using SNMP queries, and I have noticed that when a device reboots or is unreachable from Cacti, the aggregated graph stops updating. I assume this is because no data is present while the polling is failing.
Is there any way I can set a "failure" value in this particular case? Or will I have to create a script that does this sort of health check?
NaN values from poller, values from manual operations
Last edited by rasekm on Fri Aug 31, 2018 4:51 am, edited 1 time in total.
Re: Gap in aggregated graphs when 1 device timesout
I created a PHP script in accordance with the Script Server layout. But when I use it, one of the items in the graph shows "NaN". I have tried the same arguments outside of Cacti and I do in fact get a non-NaN value back.
I even get a NaN value when I hardcode the PHP script to return a fixed value (e.g. 5) no matter what I ask of the script.
If I run a "realtime" poll of the specific device, I get back the hardcoded and correct values. But for some reason, in the aggregated graph I get nothing but NaN for that device.
Any ideas?
Re: Gap in aggregated graphs when 1 device timesout
My interpretation of the debug output below is that CMDPHP inserts values for dev2 before PHPSVR has polled dev2. How can I solve this?
08/31/2018 11:07:35 AM - PHPSVR: Poller[0] PHP Script Server has Started - Parent is cmd
08/31/2018 11:07:35 AM - PHPSVR: Poller[0] DEBUG: PID[19057] CTR[0] INC: 'ss_ip_sla.php' FUNC: 'ss_ip_sla' PARMS: 'dev1 294 2:161:500:1:10:public:::::: get pkt_loss 1'
08/31/2018 11:07:35 AM - CMDPHP: Poller[0] DEVEL: SQL Exec: "insert into poller_output_rt (local_data_id, rrd_name, time, poller_id, output) values (32386, 'dev1_pkt_loss', '2018-08-31 11:07:35', '37809', '0')"
08/31/2018 11:07:35 AM - CMDPHP: Poller[0] DEVEL: SQL Exec: "insert into poller_output_rt (local_data_id, rrd_name, time, poller_id, output) values (32385, 'dev2_pkt_loss', '2018-08-31 11:07:35', '37809', 'U')"
08/31/2018 11:07:35 AM - PHPSVR: Poller[0] DEBUG: PID[19057] CTR[0] RESPONSE:''
08/31/2018 11:07:35 AM - PHPSVR: Poller[0] DEBUG: PID[19057] CTR[1] INC: 'ss_ip_sla.php' FUNC: 'ss_ip_sla' PARMS: 'dev2 2560 2:1612:30:public:::::: get pkt_loss 1'
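A side note on the PARMS fields in the log: ss_ip_sla() splits its third argument on colons and reads the community from position 5. The sketch below applies that same split to the two strings exactly as they were logged. describe_snmp_auth() is a hypothetical helper used only for illustration, not a Cacti function. If the dev2 string really is what the poller passes (and not a copy or redaction slip in the post), the fields shift and the community comes out empty, which would be worth ruling out.

```php
<?php
// Hypothetical helper (not part of Cacti): split the colon-separated SNMP
// argument the same way ss_ip_sla() does and label the first six fields.
function describe_snmp_auth($snmp_auth)
{
    $snmp = explode(":", $snmp_auth);
    return array(
        "version"   => $snmp[0] ?? "",
        "port"      => $snmp[1] ?? "",
        "timeout"   => $snmp[2] ?? "",
        "retries"   => $snmp[3] ?? "",
        "max_oids"  => $snmp[4] ?? "",
        "community" => $snmp[5] ?? "",
    );
}

// The two argument strings exactly as they appear in the PARMS field of the log.
$dev1 = describe_snmp_auth("2:161:500:1:10:public::::::");
$dev2 = describe_snmp_auth("2:1612:30:public::::::");

foreach (array("dev1" => $dev1, "dev2" => $dev2) as $name => $fields) {
    print $name . ": " . json_encode($fields) . "\n";
}
```

Comparing the two printed lines side by side shows which position each field lands in for each device.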
Re: NaN values from poller, values from manual operations
If you have changed the script, and in particular the way the variables are output from the script into your graph template, then try ripping out the graph and rebuilding it.
Re: NaN values from poller, values from manual operations
RustedKnight wrote: If you have changed the script and in particular the way the variables are output from the script into your graph template then try ripping out the graph and rebuilding it.
I have done this numerous times. I can get the "NaN" values to clear very temporarily by running a realtime poll of the affected item in the graph. However, this doesn't hold for longer than 30 seconds at a time.
This is my Script Server script:
Code:
function ss_ip_sla($hostname, $host_id, $snmp_auth, $cmd, $arg1 = "", $arg2 = "") {
$snmp = explode(":", $snmp_auth);
$snmp_version = $snmp[0];
$snmp_port = $snmp[1];
$snmp_timeout = $snmp[2];
$ping_retries = $snmp[3];
$max_oids = $snmp[4];
$snmp_auth_username = "";
$snmp_auth_password = "";
$snmp_auth_protocol = "";
$snmp_priv_passphrase = "";
$snmp_priv_protocol = "";
$snmp_context = "";
$snmp_community = "";
if ($snmp_version == 3) {
$snmp_auth_username = $snmp[6];
$snmp_auth_password = $snmp[7];
$snmp_auth_protocol = $snmp[8];
$snmp_priv_passphrase = $snmp[9];
$snmp_priv_protocol = $snmp[10];
$snmp_context = $snmp[11];
}else{
$snmp_community = $snmp[5];
}
$oids = array(
"pkt_loss",
"index" => ".1.3.6.1.4.1.9.9.42.1.2.1.1.3",
"tag" => ".1.3.6.1.4.1.9.9.42.1.2.1.1.3",
"vrf" => ".1.3.6.1.4.1.9.9.42.1.2.2.1.26",
"admintimeout" => ".1.3.6.1.4.1.9.9.42.1.2.1.1.7"
);
if ($cmd == "index") {
$return_arr = ss_ip_sla_index(cacti_snmp_walk($hostname, $snmp_community, $oids["index"], $snmp_version, $snmp_auth_username, $snmp_auth_password, $snmp_auth_protocol, $snmp_priv_passphrase, $snmp_priv_protocol, $snmp_context, $snmp_port, $snmp_timeout, $ping_retries, $max_oids, SNMP_POLLER));
foreach($return_arr as $index => $value){
print_r($index."\n");
}
}elseif($cmd == "admin_tag"){
$indexes = ss_ip_sla_index(cacti_snmp_walk($hostname, $snmp_community, $oids["tag"], $snmp_version, $snmp_auth_username, $snmp_auth_password, $snmp_auth_protocol, $snmp_priv_passphrase, $snmp_priv_protocol, $snmp_context, $snmp_port, $snmp_timeout, $ping_retries, $max_oids, SNMP_POLLER));
foreach($indexes as $oid => $value){
print $oid . "!" . $value . "\n";
}
}elseif($cmd == "vrf"){
$indexes = ss_ip_sla_index(cacti_snmp_walk($hostname, $snmp_community, $oids["vrf"], $snmp_version, $snmp_auth_username, $snmp_auth_password, $snmp_auth_protocol, $snmp_priv_passphrase, $snmp_priv_protocol, $snmp_context, $snmp_port, $snmp_timeout, $ping_retries, $max_oids, SNMP_POLLER));
foreach($indexes as $oid => $value){
print $oid . "!" . $value . "\n";
}
}elseif($cmd == "admintimeout"){
$indexes = ss_ip_sla_index(cacti_snmp_walk($hostname, $snmp_community, $oids["admintimeout"], $snmp_version, $snmp_auth_username, $snmp_auth_password, $snmp_auth_protocol, $snmp_priv_passphrase, $snmp_priv_protocol, $snmp_context, $snmp_port, $snmp_timeout, $ping_retries, $max_oids, SNMP_POLLER));
foreach($indexes as $oid => $value){
print $oid . "!" . $value . "\n";
}
}elseif ($cmd == "num_indexes") {
$return_arr = ss_ip_sla_index(cacti_snmp_walk($hostname, $snmp_community, $oids["index"], $snmp_version, $snmp_auth_username, $snmp_auth_password, $snmp_auth_protocol, $snmp_priv_passphrase, $snmp_priv_protocol, $snmp_context, $snmp_port, $snmp_timeout, $ping_retries, $max_oids, SNMP_POLLER));
print sizeof($return_arr);
}elseif ($cmd == "query") {
$arg = $arg1;
$indexes = ss_ip_sla_index(cacti_snmp_walk($hostname, $snmp_community, $oids[$arg], $snmp_version, $snmp_auth_username, $snmp_auth_password, $snmp_auth_protocol, $snmp_priv_passphrase, $snmp_priv_protocol, $snmp_context, $snmp_port, $snmp_timeout, $ping_retries, $max_oids, SNMP_POLLER));
if($arg == "index"){
foreach($indexes as $index => $value){
print_r($index."!".$index."\n");
}
}else{
foreach($indexes as $oid => $value){
print $oid . "!" . $value . "\n";
}
}
}elseif ($cmd == "get") {
$arg = $arg1;
$index = $arg2;
if($arg == "pkt_loss"){
$pkt_loss_sd = cacti_snmp_get($hostname, $snmp_community, ".1.3.6.1.4.1.9.9.42.1.5.2.1.26" . ".$index", $snmp_version, $snmp_auth_username, $snmp_auth_password, $snmp_auth_protocol,$snmp_priv_passphrase,$snmp_priv_protocol, $snmp_context, $snmp_port, $snmp_timeout, $ping_retries, SNMP_POLLER);
$pkt_loss_ds = cacti_snmp_get($hostname, $snmp_community, ".1.3.6.1.4.1.9.9.42.1.5.2.1.27" . ".$index", $snmp_version, $snmp_auth_username, $snmp_auth_password, $snmp_auth_protocol,$snmp_priv_passphrase,$snmp_priv_protocol, $snmp_context, $snmp_port, $snmp_timeout, $ping_retries, SNMP_POLLER);
$pkt_loss_mia = cacti_snmp_get($hostname, $snmp_community, ".1.3.6.1.4.1.9.9.42.1.5.2.1.29" . ".$index", $snmp_version, $snmp_auth_username, $snmp_auth_password, $snmp_auth_protocol,$snmp_priv_passphrase,$snmp_priv_protocol, $snmp_context, $snmp_port, $snmp_timeout, $ping_retries, SNMP_POLLER);
$pkt_oos = cacti_snmp_get($hostname, $snmp_community, ".1.3.6.1.4.1.9.9.42.1.5.2.1.28" . ".$index", $snmp_version, $snmp_auth_username, $snmp_auth_password, $snmp_auth_protocol,$snmp_priv_passphrase,$snmp_priv_protocol, $snmp_context, $snmp_port, $snmp_timeout, $ping_retries, SNMP_POLLER);
$pkt_la = cacti_snmp_get($hostname, $snmp_community, ".1.3.6.1.4.1.9.9.42.1.5.2.1.30" . ".$index", $snmp_version, $snmp_auth_username, $snmp_auth_password, $snmp_auth_protocol,$snmp_priv_passphrase,$snmp_priv_protocol, $snmp_context, $snmp_port, $snmp_timeout, $ping_retries, SNMP_POLLER);
$num_rtt = cacti_snmp_get($hostname, $snmp_community, ".1.3.6.1.4.1.9.9.42.1.5.2.1.1" . ".$index", $snmp_version, $snmp_auth_username, $snmp_auth_password, $snmp_auth_protocol,$snmp_priv_passphrase,$snmp_priv_protocol, $snmp_context, $snmp_port, $snmp_timeout, $ping_retries, SNMP_POLLER);
#Sanity Check
if(
($pkt_loss_sd == "") ||
($pkt_loss_ds == "") ||
($pkt_loss_mia == "") ||
($pkt_oos == "") ||
($pkt_la == "") ||
($num_rtt == "")
){
print("1\n");
}else{
$cisco_ipsla_pkt_loss = ($pkt_loss_sd+$pkt_loss_ds+$pkt_loss_mia)/($pkt_loss_sd+$pkt_loss_ds+$pkt_loss_mia+$pkt_oos+$pkt_la+$num_rtt);
$ipsla_responder_down = (1-($num_rtt/10));
$downtime = $cisco_ipsla_pkt_loss + $ipsla_responder_down;
print($downtime."\n");
}
}
}
}
function ss_ip_sla_index($arr){
$return_arr = array();
foreach($arr as $a){
preg_match("/\d*$/",$a["oid"],$matches);
$return_arr[$matches[0]] = $a["value"];
}
return $return_arr;
}
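One difference from the scripts bundled with Cacti may be worth checking: in those scripts (ss_host_cpu.php, for example) the "get" branch returns its value rather than printing it, and the Script Server relays the function's return value. A function that only prints can look fine from the command line and still yield an empty RESPONSE under the Script Server. The following is a minimal sketch of the downtime calculation written to return its result. ip_sla_downtime(), its array keys and the $fallback parameter are hypothetical names, and the sketch assumes the six counters have already been fetched with cacti_snmp_get().

```php
<?php
// Hypothetical helper, not part of Cacti: computes the same "downtime" figure
// as the script above from already-fetched counters and RETURNS it.
function ip_sla_downtime(array $c, $fallback = "1")
{
    $keys = array("loss_sd", "loss_ds", "mia", "oos", "late", "num_rtt");
    foreach ($keys as $k) {
        // Assumption: a failed SNMP get comes back empty or non-numeric.
        if (!isset($c[$k]) || !is_numeric($c[$k])) {
            return $fallback;
        }
    }
    $lost  = $c["loss_sd"] + $c["loss_ds"] + $c["mia"];
    $total = $lost + $c["oos"] + $c["late"] + $c["num_rtt"];
    // Guard against division by zero when every counter is 0.
    $pkt_loss = ($total > 0) ? $lost / $total : 0;
    // The divisor 10 is copied from the script above.
    $responder_down = 1 - ($c["num_rtt"] / 10);
    return (string) ($pkt_loss + $responder_down);
}

// Usage: all ten probes answered, nothing lost.
$healthy = array("loss_sd" => "0", "loss_ds" => "0", "mia" => "0",
                 "oos" => "0", "late" => "0", "num_rtt" => "10");
print ip_sla_downtime($healthy) . "\n";

// Usage: one counter missing, so the fallback value is returned.
$broken = $healthy;
$broken["mia"] = "";
print ip_sla_downtime($broken) . "\n";
```

Inside ss_ip_sla(), the "get" branch would then end with a return of this value instead of a print. The fallback of "1" mirrors the sanity check in the original script.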
Re: NaN values from poller, values from manual operations
1535709900: -nan
1535710200: -nan
1535710500: -nan
1535710800: -nan
1535711100: -nan
1535711400: -nan
1535711700: -nan
1535712000: 0.0000000000e+00
1535712300: -nan
1535712600: 0.0000000000e+00
1535712900: 0.0000000000e+00
1535713200: -nan
This is a snippet from my RRA file. I can see in the debug lines that the value U is inserted, but I cannot reproduce this behaviour: I always get a number back when I run the same commands from the command prompt.
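For context on why a U in the debug log ends up as -nan in the RRA: RRDtool accepts U as the "unknown" value on update, and unknown samples are reported as NaN. The sketch below only illustrates that mapping. to_rrd_value() is a hypothetical function, not Cacti's actual validation code, which is assumed here to behave roughly like this for non-numeric output.

```php
<?php
// Hypothetical illustration, not Cacti's actual code: map raw script output to
// the value handed to RRDtool, using "U" (unknown) for anything non-numeric.
function to_rrd_value($output)
{
    $output = trim((string) $output);
    return is_numeric($output) ? $output : "U";
}

// An empty response, such as the RESPONSE:'' seen in the log, is not numeric.
foreach (array("0", "0.35", "", "NaN") as $raw) {
    print var_export($raw, true) . " => " . to_rrd_value($raw) . "\n";
}
```

Under that assumption, any poll where the script hands back an empty string would be stored as unknown, regardless of what the same script prints at the prompt.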
Re: NaN values from poller, values from manual operations
Check the poller output cache, see what is calling this script for that device and run those exact commands as the user the poller is being run as.
Cacti Developer & Release Manager
The Cacti Group
Director
BV IT Solutions Ltd
+--------------------------------------------------------------------------+
Cacti Resources:
Cacti Website (including releases)
Cacti Issues
Cacti Development Releases
Cacti Development Documentation