Polling issue with snmp_query after upgrading to 1.2.22

emiliosic
Posts: 12
Joined: Thu Apr 20, 2006 9:25 am
Location: Massachusetts, USA

Polling issue with snmp_query after upgrading to 1.2.22

Post by emiliosic »

Since upgrading from 1.2.21 to 1.2.22, we have a device that is queried over SNMP but uses a custom snmp_query XML to build the data query.
The issue is that after adding a new graph from this device, the poller stops collecting data for the graphs based on this data query, while graphs based on built-in templates continue to work.
We use spine to collect data.

The logs don't show any specific error, but I noticed this after adding a new graph:

Code:

2022-11-03 12:41:03 - SYSTEM STATS: Time:1.2859 Method:spine Processes:1 Threads:4 Hosts:10 HostsPerProcess:10 DataSources:604 RRDsProcessed:67
2022-11-03 12:40:03 - SYSTEM STATS: Time:1.2873 Method:spine Processes:1 Threads:4 Hosts:10 HostsPerProcess:10 DataSources:612 RRDsProcessed:69
2022-11-03 12:39:03 - SYSTEM STATS: Time:1.3175 Method:spine Processes:1 Threads:4 Hosts:10 HostsPerProcess:10 DataSources:609 RRDsProcessed:63
2022-11-03 12:38:03 - SYSTEM STATS: Time:1.3247 Method:spine Processes:1 Threads:4 Hosts:10 HostsPerProcess:10 DataSources:604 RRDsProcessed:65
2022-11-03 12:37:02 - SYSTEM STATS: Time:1.2959 Method:spine Processes:1 Threads:4 Hosts:10 HostsPerProcess:10 DataSources:604 RRDsProcessed:178
2022-11-03 12:36:02 - SYSTEM STATS: Time:1.2959 Method:spine Processes:1 Threads:4 Hosts:10 HostsPerProcess:10 DataSources:604 RRDsProcessed:179
2022-11-03 12:35:02 - SYSTEM STATS: Time:1.3167 Method:spine Processes:1 Threads:4 Hosts:10 HostsPerProcess:10 DataSources:612 RRDsProcessed:182
This is after a new graph is added; RRDsProcessed is much lower.
After noticing this, I clear the poller cache, but that does not seem to fix it until the associated data query is reloaded.
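
For reference, the poller cache can also be rebuilt from the Cacti CLI; this is only a rough sketch, assuming an install under /usr/share/cacti (our path) and that this version supports the --host-id option:

Code:

# rebuild the poller cache for every device
php /usr/share/cacti/cli/rebuild_poller_cache.php
# or limit it to the affected device (option name may differ between versions)
php /usr/share/cacti/cli/rebuild_poller_cache.php --host-id=12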

We have used the same snmp_query for years, and it seems to have broken between 1.2.21 and 1.2.22. (1.2.21 has a different issue with authentication that 1.2.22 addressed.)

The snmp_query XML looks like this:

Code:

<query>
        <name>Get Session Agents</name>
        <description>List of Acme Packet Session Agents</description>
        <oid_index>.1.3.6.1.4.1.9148.3.2.1.2.1.1.2</oid_index>
        <oid_uptime>.1.3.6.1.2.1.1.3.0</oid_uptime>
        <index_order>apsaHName</index_order>
        <index_order_type>alphabetic</index_order_type>
        <index_title_format>|chosen_order_field|</index_title_format>

        <fields>
                <apsaIndex>
                        <name>ID</name>
                        <method>walk</method>
                        <source>value</source>
                        <direction>input</direction>
                        <oid>.1.3.6.1.4.1.9148.3.2.1.2.1.1.1</oid>
                </apsaIndex>
                <apsaHName>
                        <name>Host Name</name>
                        <method>walk</method>
                        <source>value</source>
                        <direction>input</direction>
                        <oid>.1.3.6.1.4.1.9148.3.2.1.2.1.1.2</oid>
                </apsaHName>
                <apsaStatus>
                        <name>Agent Status</name>
                        <method>walk</method>
                        <source>value</source>
                        <direction>output</direction>
                        <oid>.1.3.6.1.4.1.9148.3.2.1.2.1.1.22</oid>
                </apsaStatus>
 ...
Is this a known issue?

Thanks in advance
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Re: Polling issue with snmp_query after upgrading to 1.2.22

Post by TheWitness »

I'm not sure you've given enough data to properly diagnose this.
True understanding begins only when we realize how little we truly understand...

Life is an adventure, let yours begin with Cacti!

Author of dozens of Cacti plugins and customizations. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages


For those wondering, I'm still here, but lost in the shadows. Yearning for fewer bugs. Who wants a Cacti 1.3/2.0? Streams anyone?
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Re: Polling issue with snmp_query after upgrading to 1.2.22

Post by TheWitness »

Put the device into debug mode and see if spine is properly collecting data. Use rrdtool info blah.rrd to get information on the last update, the last value, and whether anything exceeded a limit.
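
For example, something along these lines; the path and file name are just placeholders for wherever your rra files live:

Code:

# show the data source definitions, last_ds values and limits
rrdtool info /usr/share/cacti/rra/device_ds_123.rrd | head -40
# show the timestamp and values of the most recent update
rrdtool lastupdate /usr/share/cacti/rra/device_ds_123.rrd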
emiliosic
Posts: 12
Joined: Thu Apr 20, 2006 9:25 am
Location: Massachusetts, USA

Re: Polling issue with snmp_query after upgrading to 1.2.22

Post by emiliosic »

Thanks.
The RRD files are being updated per rrdtool info, with a timestamp for the current period, but the values are recorded as NaN.

This is what the first lines of rrdtool info look like:

Code:

rrd_version = "0003"
step = 300
last_update = 1667972042
header_size = 5216
ds[traffic_in].index = 0
ds[traffic_in].type = "COUNTER"
ds[traffic_in].minimal_heartbeat = 600
ds[traffic_in].min = 0.0000000000e+00
ds[traffic_in].max = 1.0000000000e+09
ds[traffic_in].last_ds = "79001135739"
ds[traffic_in].value = 1.4474413953e+05
ds[traffic_in].unknown_sec = 0
ds[traffic_out].index = 1
ds[traffic_out].type = "COUNTER"
ds[traffic_out].minimal_heartbeat = 600
ds[traffic_out].min = 0.0000000000e+00
ds[traffic_out].max = 1.0000000000e+09
ds[traffic_out].last_ds = "1168241960825"
ds[traffic_out].value = 1.0714650498e+05
ds[traffic_out].unknown_sec = 0
rra[0].cf = "AVERAGE"
rra[0].rows = 600
rra[0].cur_row = 529
rra[0].pdp_per_row = 1
rra[0].xff = 5.0000000000e-01
rra[0].cdp_prep[0].value = NaN
rra[0].cdp_prep[0].unknown_datapoints = 0
rra[0].cdp_prep[1].value = NaN
rra[0].cdp_prep[1].unknown_datapoints = 0
Spine is working; graphs not based on these snmp queries are still being updated.

Running the graph for the same RRD in debug mode:

Code:

/usr/bin/rrdtool graph - \
--imgformat=PNG \
--start='-86400' \
--end='-60' \
--pango-markup  \
--title=' mia-sbc1 - Concurrent Sessions' \
--vertical-label='sessions' \
--slope-mode \
--base=1000 \
--height=120 \
--width=500 \
--tabwidth '30' \
--alt-autoscale-max \
--lower-limit='0' \
COMMENT:"From 2022-11-08 00\:48\:27 To 2022-11-09 00\:47\:27\c" \
COMMENT:"  \n" \
--border $rrdborder --slope-mode \
DEF:a='/usr/share/cacti/rra/XXXX_sessions_util_378.rrd':'sessions_util':AVERAGE \
LINE2:a#FF0000FF:'Concurrent Sessions'  \
GPRINT:a:LAST:'Current\:%8.0lf'  \
GPRINT:a:AVERAGE:'Average\:%8.0lf'  \
GPRINT:a:MAX:'Maximum\:%8.0lf' 
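
The stored values can also be read directly with rrdtool fetch; a quick sketch against the same RRD and time window as the graph command above:

Code:

rrdtool fetch /usr/share/cacti/rra/XXXX_sessions_util_378.rrd AVERAGE --start -86400 --end -60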
The data source troubleshooter does not indicate errors, but shows a spinning wheel on "Did the poller receive valid data?" and "Issues -> Waiting on analysis and RRDfile update".

Recent logs:

Code:

2022-11-09 00:52:03 - SYSTEM DSDEBUG STATS: Type:poller, ChecksPerformed:1, TotalIssues:0, Time:0.0160
2022-11-09 00:52:03 - SYSTEM STATS: Time:1.2993 Method:spine Processes:1 Threads:4 Hosts:10 HostsPerProcess:10 DataSources:608 RRDsProcessed:65
2022-11-09 00:51:03 - SYSTEM DSDEBUG STATS: Type:poller, ChecksPerformed:1, TotalIssues:0, Time:0.0157
2022-11-09 00:51:03 - SYSTEM STATS: Time:1.2750 Method:spine Processes:1 Threads:4 Hosts:10 HostsPerProcess:10 DataSources:604 RRDsProcessed:66
2022-11-09 00:50:03 - SYSTEM DSDEBUG STATS: Type:poller, ChecksPerformed:1, TotalIssues:0, Time:0.0155
2022-11-09 00:50:03 - SYSTEM STATS: Time:1.2853 Method:spine Processes:1 Threads:4 Hosts:10 HostsPerProcess:10 DataSources:600 RRDsProcessed:67
2022-11-09 00:49:03 - SYSTEM DSDEBUG STATS: Type:poller, ChecksPerformed:1, TotalIssues:0, Time:0.0160
2022-11-09 00:49:03 - SYSTEM STATS: Time:1.3028 Method:spine Processes:1 Threads:4 Hosts:10 HostsPerProcess:10 DataSources:610 RRDsProcessed:69
2022-11-09 00:48:03 - SYSTEM DSDEBUG STATS: Type:poller, ChecksPerformed:1, TotalIssues:0, Time:0.0187
2022-11-09 00:48:03 - SYSTEM STATS: Time:1.2940 Method:spine Processes:1 Threads:4 Hosts:10 HostsPerProcess:10 DataSources:611 RRDsProcessed:63
2022-11-09 00:47:03 - SYSTEM DSDEBUG STATS: Type:poller, ChecksPerformed:1, TotalIssues:0, Time:0.0154
2022-11-09 00:47:03 - SYSTEM STATS: Time:1.2836 Method:spine Processes:1 Threads:4 Hosts:10 HostsPerProcess:10 DataSources:608 RRDsProcessed:65
2022-11-09 00:46:03 - SYSTEM DSDEBUG STATS: Type:poller, ChecksPerformed:1, TotalIssues:0, Time:0.0150
2022-11-09 00:46:03 - SYSTEM STATS: Time:1.2786 Method:spine Processes:1 Threads:4 Hosts:10 HostsPerProcess:10 DataSources:604 RRDsProcessed:66
2022-11-09 00:45:03 - SYSTEM DSDEBUG STATS: Type:poller, ChecksPerformed:1, TotalIssues:0, Time:0.0154
2022-11-09 00:45:03 - SYSTEM STATS: Time:1.2781 Method:spine Processes:1 Threads:4 Hosts:10 HostsPerProcess:10 DataSources:600 RRDsProcessed:67
2022-11-09 00:44:03 - SYSTEM DSDEBUG STATS: Type:poller, ChecksPerformed:1, TotalIssues:0, Time:0.0147
2022-11-09 00:44:03 - SYSTEM STATS: Time:1.2771 Method:spine Processes:1 Threads:4 Hosts:10 HostsPerProcess:10 DataSources:610 RRDsProcessed:69
2022-11-09 00:43:03 - SYSTEM DSDEBUG STATS: Type:poller, ChecksPerformed:1, TotalIssues:0, Time:0.0157
2022-11-09 00:43:03 - SYSTEM STATS: Time:1.2797 Method:spine Processes:1 Threads:4 Hosts:10 HostsPerProcess:10 DataSources:611 RRDsProcessed:63
2022-11-09 00:42:04 - SYSTEM DSDEBUG STATS: Type:poller, ChecksPerformed:1, TotalIssues:0, Time:0.0314
2022-11-09 00:42:04 - SYSTEM STATS: Time:2.2802 Method:spine Processes:1 Threads:4 Hosts:10 HostsPerProcess:10 DataSources:608 RRDsProcessed:65
I don't know how else to troubleshoot this.
TheWitness
Developer
Posts: 17007
Joined: Tue May 14, 2002 5:08 pm
Location: MI, USA

Re: Polling issue with snmp_query after upgrading to 1.2.22

Post by TheWitness »

Put the device into debug mode: edit the device and you will see an option in the upper right corner to enable debug. Once debug is enabled, spine will log all the results, so you can see whether the SNMP value is actually being returned by spine. If it is, I'm a bit stumped too. The rrdtool info output looks right to me.
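
Once debug is on, the returned values show up in the Cacti log; something like this should pull out the relevant lines (only a sketch, assuming the default log location under the install path and that you know the device ID; the exact line format varies by version, but the data source name and returned value should appear):

Code:

# older versions may log Host[<id>] instead of Device[<id>]
grep 'Device\[12\]' /usr/share/cacti/log/cacti.log | grep -i 'snmp'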