I have discovered an annoying issue with one of my Cacti hosts.
It is a Cisco Wireless LAN Controller (WLC), and some of its SNMP queries are not behaving: the graphs show NaN. The same query is used for another Cisco WLC, and the graphs for that host work fine. Both WLCs are the same model and run the same software version.
I have tried removing the host and re-adding it, but the same thing happens.
Using rrdtool info I get the following output for the BAD HOST:
filename = "bad_host_apassoc_1413.rrd"
rrd_version = "0003"
step = 300
last_update = 1422266462
header_size = 2040
ds[apassoc].index = 0
ds[apassoc].type = "GAUGE"
ds[apassoc].minimal_heartbeat = 600
ds[apassoc].min = 0.0000000000e+00
ds[apassoc].max = 1.0000000000e+12
ds[apassoc].last_ds = "U"
ds[apassoc].value = NaN
ds[apassoc].unknown_sec = 62
rra[0].cf = "AVERAGE"
rra[0].rows = 600
rra[0].cur_row = 577
rra[0].pdp_per_row = 1
rra[0].xff = 5.0000000000e-01
rra[0].cdp_prep[0].value = NaN
rra[0].cdp_prep[0].unknown_datapoints = 0
rra[1].cf = "AVERAGE"
rra[1].rows = 700
rra[1].cur_row = 210
rra[1].pdp_per_row = 6
rra[1].xff = 5.0000000000e-01
rra[1].cdp_prep[0].value = 0.0000000000e+00
rra[1].cdp_prep[0].unknown_datapoints = 0
rra[2].cf = "AVERAGE"
rra[2].rows = 775
rra[2].cur_row = 667
rra[2].pdp_per_row = 24
rra[2].xff = 5.0000000000e-01
rra[2].cdp_prep[0].value = 0.0000000000e+00
rra[2].cdp_prep[0].unknown_datapoints = 0
rra[3].cf = "AVERAGE"
rra[3].rows = 797
rra[3].cur_row = 704
rra[3].pdp_per_row = 288
rra[3].xff = 5.0000000000e-01
rra[3].cdp_prep[0].value = 0.0000000000e+00
rra[3].cdp_prep[0].unknown_datapoints = 119
rra[4].cf = "MAX"
rra[4].rows = 600
rra[4].cur_row = 430
rra[4].pdp_per_row = 1
rra[4].xff = 5.0000000000e-01
rra[4].cdp_prep[0].value = NaN
rra[4].cdp_prep[0].unknown_datapoints = 0
rra[5].cf = "MAX"
rra[5].rows = 700
rra[5].cur_row = 293
rra[5].pdp_per_row = 6
rra[5].xff = 5.0000000000e-01
rra[5].cdp_prep[0].value = -inf
rra[5].cdp_prep[0].unknown_datapoints = 0
rra[6].cf = "MAX"
rra[6].rows = 775
rra[6].cur_row = 59
rra[6].pdp_per_row = 24
rra[6].xff = 5.0000000000e-01
rra[6].cdp_prep[0].value = -inf
rra[6].cdp_prep[0].unknown_datapoints = 0
rra[7].cf = "MAX"
rra[7].rows = 797
rra[7].cur_row = 218
rra[7].pdp_per_row = 288
rra[7].xff = 5.0000000000e-01
rra[7].cdp_prep[0].value = 0.0000000000e+00
rra[7].cdp_prep[0].unknown_datapoints = 119
And for the GOOD HOST:
filename = "good_host_apassoc_964.rrd"
rrd_version = "0003"
step = 300
last_update = 1422266755
header_size = 2040
ds[apassoc].index = 0
ds[apassoc].type = "GAUGE"
ds[apassoc].minimal_heartbeat = 600
ds[apassoc].min = 0.0000000000e+00
ds[apassoc].max = 1.0000000000e+12
ds[apassoc].last_ds = "32"
ds[apassoc].value = 1.7600000000e+03
ds[apassoc].unknown_sec = 0
rra[0].cf = "AVERAGE"
rra[0].rows = 600
rra[0].cur_row = 402
rra[0].pdp_per_row = 1
rra[0].xff = 5.0000000000e-01
rra[0].cdp_prep[0].value = NaN
rra[0].cdp_prep[0].unknown_datapoints = 0
rra[1].cf = "AVERAGE"
rra[1].rows = 700
rra[1].cur_row = 491
rra[1].pdp_per_row = 6
rra[1].xff = 5.0000000000e-01
rra[1].cdp_prep[0].value = 3.1640000000e+01
rra[1].cdp_prep[0].unknown_datapoints = 0
rra[2].cf = "AVERAGE"
rra[2].rows = 775
rra[2].cur_row = 71
rra[2].pdp_per_row = 24
rra[2].xff = 5.0000000000e-01
rra[2].cdp_prep[0].value = 3.1640000000e+01
rra[2].cdp_prep[0].unknown_datapoints = 0
rra[3].cf = "AVERAGE"
rra[3].rows = 797
rra[3].cur_row = 220
rra[3].pdp_per_row = 288
rra[3].xff = 5.0000000000e-01
rra[3].cdp_prep[0].value = 4.8022666667e+02
rra[3].cdp_prep[0].unknown_datapoints = 0
rra[4].cf = "MAX"
rra[4].rows = 600
rra[4].cur_row = 493
rra[4].pdp_per_row = 1
rra[4].xff = 5.0000000000e-01
rra[4].cdp_prep[0].value = NaN
rra[4].cdp_prep[0].unknown_datapoints = 0
rra[5].cf = "MAX"
rra[5].rows = 700
rra[5].cur_row = 393
rra[5].pdp_per_row = 6
rra[5].xff = 5.0000000000e-01
rra[5].cdp_prep[0].value = 3.1640000000e+01
rra[5].cdp_prep[0].unknown_datapoints = 0
rra[6].cf = "MAX"
rra[6].rows = 775
rra[6].cur_row = 292
rra[6].pdp_per_row = 24
rra[6].xff = 5.0000000000e-01
rra[6].cdp_prep[0].value = 3.1640000000e+01
rra[6].cdp_prep[0].unknown_datapoints = 0
rra[7].cf = "MAX"
rra[7].rows = 797
rra[7].cur_row = 444
rra[7].pdp_per_row = 288
rra[7].xff = 5.0000000000e-01
rra[7].cdp_prep[0].value = 3.2000000000e+01
rra[7].cdp_prep[0].unknown_datapoints = 0
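To compare the two dumps without reading them line by line, here is a minimal Python sketch that parses rrdtool info style text into a dictionary. The function names (parse_rrd_info, rra_span_days) and the shortened sample strings are my own for illustration; the values are copied from the dumps above. It only pulls out the fields that differ between the hosts and works out how much time rra[0] covers.

```python
def parse_rrd_info(text):
    """Turn `rrdtool info` style output into a dict of key -> value."""
    info = {}
    for line in text.splitlines():
        if " = " not in line:
            continue
        key, raw = line.split(" = ", 1)
        key, raw = key.strip(), raw.strip()
        if raw.startswith('"') and raw.endswith('"'):
            info[key] = raw[1:-1]        # quoted string such as "U" or "GAUGE"
        else:
            try:
                info[key] = float(raw)   # float() also accepts NaN and -inf
            except ValueError:
                info[key] = raw
    return info


def rra_span_days(info, index):
    """Time covered by one RRA: step * pdp_per_row * rows, in days."""
    seconds = (info["step"]
               * info[f"rra[{index}].pdp_per_row"]
               * info[f"rra[{index}].rows"])
    return seconds / 86400


# Shortened excerpts of the two dumps above (illustrative sample data).
BAD_INFO = '''step = 300
ds[apassoc].last_ds = "U"
ds[apassoc].value = NaN
ds[apassoc].unknown_sec = 62
rra[0].rows = 600
rra[0].pdp_per_row = 1
'''
GOOD_INFO = '''step = 300
ds[apassoc].last_ds = "32"
ds[apassoc].value = 1.7600000000e+03
ds[apassoc].unknown_sec = 0
rra[0].rows = 600
rra[0].pdp_per_row = 1
'''

bad = parse_rrd_info(BAD_INFO)
good = parse_rrd_info(GOOD_INFO)

for key in ("ds[apassoc].last_ds", "ds[apassoc].unknown_sec"):
    print(key, "bad:", bad[key], "good:", good[key])
print("rra[0] covers about", round(rra_span_days(bad, 0), 2), "days")
```

The point of the comparison is that last_ds is "U" on the bad host, so the most recent update written to that file was an unknown value, while the good host recorded a real number.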
Stored values from the RRD file:
snip...
1422262500: -nan
1422262800: -nan
1422263100: -nan
1422263400: -nan
1422263700: -nan
1422264000: -nan
1422264300: -nan
1422264600: -nan
1422264900: -nan
1422265200: -nan
1422265500: -nan
1422265800: -nan
1422266100: -nan
1422266400: 0.0000000000e+00
1422266700: -nan
1422267000: -nan
1422267300: -nan
1422267600: -nan
1422267900: -nan
1422268200: -nan
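For a longer listing it is easier to let a script pick out the rows that are not NaN. This is a rough Python sketch, assuming the "timestamp: value" layout shown above; parse_fetch and the three sample lines are illustrative, with the lines copied from the listing.

```python
import math
from datetime import datetime, timezone


def parse_fetch(text):
    """Parse 'timestamp: value' lines into (int, float) tuples."""
    rows = []
    for line in text.splitlines():
        ts, sep, val = line.partition(":")
        if not sep or not ts.strip().isdigit():
            continue                      # skips "snip..." and any header line
        rows.append((int(ts), float(val)))  # float() accepts "-nan"
    return rows


# Three lines copied from the listing above (illustrative sample data).
SAMPLE = """snip...
1422266100: -nan
1422266400: 0.0000000000e+00
1422266700: -nan
"""

rows = parse_fetch(SAMPLE)
known = [(ts, v) for ts, v in rows if not math.isnan(v)]

print(len(rows) - len(known), "unknown rows,", len(known), "known rows")
for ts, v in known:
    when = datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()
    print(ts, when, v)
```

Converting the epoch timestamp to a readable time makes it easier to line up the one stored value with the Cacti log for the same polling cycle.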
Cacti log for the BAD HOST:
01/23/2015 04:11:03 PM - CMDPHP: Poller[0] Host[106] DS[1400] CMD: /usr/bin/php -q /var/www/html/cacti/scripts/Cisco_WLC_APs_Assoc.php 1.1.1.1 mycommunity 2 get apassoc 28:34:A2:0C:BB:90, output: 10
01/23/2015 04:11:03 PM - POLLER: Poller[0] CACTI2RRD: /usr/local/rrdtool/bin/rrdtool update /var/www/html/cacti/rra/bad_host_apassoc_1400.rrd --template apassoc 1422029462:U
01/23/2015 04:11:03 PM - CMDPHP: Poller[0] Host[106] DS[1399] CMD: /usr/bin/php -q /var/www/html/cacti/scripts/Cisco_WLC_APs_Assoc.php 1.1.1.1 mycommunity 2 get apassoc 28:34:A2:7E:ED:80, output: 3
01/23/2015 04:11:03 PM - POLLER: Poller[0] CACTI2RRD: /usr/local/rrdtool/bin/rrdtool update /var/www/html/cacti/rra/bad_host_apassoc_1399.rrd --template apassoc 1422029462:U
01/23/2015 04:11:03 PM - CMDPHP: Poller[0] Host[106] DS[1398] CMD: /usr/bin/php -q /var/www/html/cacti/scripts/Cisco_WLC_APs_Assoc.php 1.1.1.1 mycommunity 2 get apassoc 28:34:A2:4E:E3:80, output: 2
01/23/2015 04:11:03 PM - POLLER: Poller[0] CACTI2RRD: /usr/local/rrdtool/bin/rrdtool update /var/www/html/cacti/rra/bad_host_apassoc_1398.rrd --template apassoc 1422029462:U
01/23/2015 04:11:03 PM - CMDPHP: Poller[0] Host[106] DS[1397] CMD: /usr/bin/php -q /var/www/html/cacti/scripts/Cisco_WLC_APs_Assoc.php 1.1.1.1 mycommunity 2 get apassoc 28:34:A2:0C:56:60, output: 4
01/23/2015 04:11:03 PM - POLLER: Poller[0] CACTI2RRD: /usr/local/rrdtool/bin/rrdtool update /var/www/html/cacti/rra/bad_host_apassoc_1397.rrd --template apassoc 1422029462:U
01/23/2015 04:11:04 PM - CMDPHP: Poller[0] Host[106] DS[1396] CMD: /usr/bin/php -q /var/www/html/cacti/scripts/Cisco_WLC_APs_Assoc.php 1.1.1.1 mycommunity 2 get apassoc 28:34:A2:7E:EC:E0, output: 0
Cacti log for the GOOD HOST:
01/23/2015 04:10:54 PM - CMDPHP: Poller[0] Host[67] DS[892] CMD: /usr/bin/php -q /var/www/html/cacti/scripts/Cisco_WLC_APs_Assoc.php 2.2.2.2 mycommunity 2 get apassoc 34:BD:C8:AC:1D:40, output: 0
01/23/2015 04:10:54 PM - POLLER: Poller[0] CACTI2RRD: /usr/local/rrdtool/bin/rrdtool update /var/www/html/cacti/rra/good_host_apassoc_892.rrd --template apassoc 1422029454:0
01/23/2015 04:10:54 PM - CMDPHP: Poller[0] Host[67] DS[894] CMD: /usr/bin/php -q /var/www/html/cacti/scripts/Cisco_WLC_APs_Assoc.php 2.2.2.2 mycommunity 2 get apassoc 88:75:56:DA:8D:E0, output: 8
01/23/2015 04:10:54 PM - POLLER: Poller[0] CACTI2RRD: /usr/local/rrdtool/bin/rrdtool update /var/www/html/cacti/rra/good_host_apassoc_894.rrd --template apassoc 1422029454:8
01/23/2015 04:10:55 PM - CMDPHP: Poller[0] Host[67] DS[895] CMD: /usr/bin/php -q /var/www/html/cacti/scripts/Cisco_WLC_APs_Assoc.php 2.2.2.2 mycommunity 2 get apassoc 88:75:56:DB:3A:00, output: 1
01/23/2015 04:10:55 PM - POLLER: Poller[0] CACTI2RRD: /usr/local/rrdtool/bin/rrdtool update /var/www/html/cacti/rra/good_host_apassoc_895.rrd --template apassoc 1422029454:1
01/23/2015 04:10:55 PM - CMDPHP: Poller[0] Host[67] DS[896] CMD: /usr/bin/php -q /var/www/html/cacti/scripts/Cisco_WLC_APs_Assoc.php 2.2.2.2 mycommunity 2 get apassoc 88:75:56:DB:3B:80, output: 11
01/23/2015 04:10:55 PM - POLLER: Poller[0] CACTI2RRD: /usr/local/rrdtool/bin/rrdtool update /var/www/html/cacti/rra/good_host_apassoc_896.rrd --template apassoc 1422029454:11
01/23/2015 04:10:55 PM - CMDPHP: Poller[0] Host[67] DS[897] CMD: /usr/bin/php -q /var/www/html/cacti/scripts/Cisco_WLC_APs_Assoc.php 2.2.2.2 mycommunity 2 get apassoc 88:75:56:DB:40:90, output: 0
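One way to check the logs mechanically is to pair each script result with the value that the following rrdtool update actually writes. The Python sketch below does that for log text in the format shown above. It assumes the number at the end of the RRD filename (for example _1400.rrd) is the same ID as in DS[1400], which is how the lines above appear to line up; find_mismatches and the two regular expressions are my own names, not part of Cacti.

```python
import re

# "DS[1400] CMD: ... , output: 10" -> ("1400", "10")
CMD_RE = re.compile(r"DS\[(\d+)\] CMD: .*, output: (\S+)")
# "rrdtool update /path/x_1400.rrd --template apassoc 1422029462:U" -> ("1400", "U")
UPD_RE = re.compile(r"rrdtool update \S*_(\d+)\.rrd --template \S+ \d+:(\S+)")


def find_mismatches(log_text):
    """Return (ds_id, script_output, written_value) where the two differ."""
    script_out = {}
    mismatches = []
    for line in log_text.splitlines():
        m = CMD_RE.search(line)
        if m:
            script_out[m.group(1)] = m.group(2)
            continue
        m = UPD_RE.search(line)
        if m:
            ds_id, written = m.groups()
            expected = script_out.get(ds_id)
            if expected is not None and expected != written:
                mismatches.append((ds_id, expected, written))
    return mismatches


# First pair of lines from each log above (illustrative sample data).
BAD_LOG = """01/23/2015 04:11:03 PM - CMDPHP: Poller[0] Host[106] DS[1400] CMD: /usr/bin/php -q /var/www/html/cacti/scripts/Cisco_WLC_APs_Assoc.php 1.1.1.1 mycommunity 2 get apassoc 28:34:A2:0C:BB:90, output: 10
01/23/2015 04:11:03 PM - POLLER: Poller[0] CACTI2RRD: /usr/local/rrdtool/bin/rrdtool update /var/www/html/cacti/rra/bad_host_apassoc_1400.rrd --template apassoc 1422029462:U
"""
GOOD_LOG = """01/23/2015 04:10:54 PM - CMDPHP: Poller[0] Host[67] DS[894] CMD: /usr/bin/php -q /var/www/html/cacti/scripts/Cisco_WLC_APs_Assoc.php 2.2.2.2 mycommunity 2 get apassoc 88:75:56:DA:8D:E0, output: 8
01/23/2015 04:10:54 PM - POLLER: Poller[0] CACTI2RRD: /usr/local/rrdtool/bin/rrdtool update /var/www/html/cacti/rra/good_host_apassoc_894.rrd --template apassoc 1422029454:8
"""

print("bad host mismatches:", find_mismatches(BAD_LOG))
print("good host mismatches:", find_mismatches(GOOD_LOG))
```

On the sample lines this flags DS 1400 on the bad host, where the script returned 10 but the update wrote U, and flags nothing on the good host.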
Running the script manually as the cacti user yields proper results, and the OIDs etc. are correct for both hosts.
I have done the following steps in gandalf's troubleshooting guide:
- Check Cacti Log File - no errors regarding SNMP
- Check Basic Data Gathering - no problem with the script as it outputs as expected and is working for GOOD HOST
- Check Cacti's Poller - Poller is running script correctly and expected results are being returned (see logs above)
- Check MySQL Updating - No issue as far as I can see, again other hosts are fine
- Check RRD File Updating - Yes, see logs above
- Check RRD File Numbers - All values are between MIN and MAX, although it looks like one value made it into the rrd file (See above)
- Check rrdtool Graph Statement - The graph statement looks fine
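The MIN/MAX check in the list above can be written down as a small Python sketch. It uses the ds[apassoc].min and ds[apassoc].max values from the rrdtool info output and the script outputs from the BAD HOST log; in_ds_range is a hypothetical helper name. rrdtool stores a value outside the DS range as unknown, so this only confirms that the range is not what turns these particular numbers into NaN.

```python
def in_ds_range(raw, ds_min=0.0, ds_max=1.0e12):
    """Return True if raw parses as a number inside [ds_min, ds_max]."""
    try:
        value = float(raw)
    except ValueError:
        return False          # e.g. "U" or any non-numeric script output
    return ds_min <= value <= ds_max


# Script outputs seen in the BAD HOST log above.
outputs = ["10", "3", "2", "4", "0"]

for raw in outputs:
    print(raw, "in range" if in_ds_range(raw) else "out of range or not numeric")
```

All five outputs pass, while a literal "U" does not parse as a number and is reported as not numeric.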
Regards