Cacti graphs not populating on some poller cycles

mudmud
Posts: 21
Joined: Sun Oct 28, 2012 2:56 pm

Cacti graphs not populating on some poller cycles

Post by mudmud »

Hi all,

I am running a Cacti server with the specs listed below. Recently, after experiencing high CPU utilization, I started using SPINE. But now some of my graphs are not being populated on each polling cycle (refer to the attached image).

My configs are as shown below. Please let me know what I should change to get a smooth curve in my graphs.

Thanks in advance,

D.


Configs

General Information
Date Fri, 16 Nov 2012 08:51:06 +0400
Cacti Version 0.8.8a
Cacti OS win32
SNMP Version NET-SNMP version: 5.6.1.1
RRDTool Version RRDTool 1.4.x
Hosts 75
Graphs 1708
Data Sources Script/Command: 45
SNMP: 334
SNMP Query: 788
Script Query: 626
Script - Script Server (PHP): 15
Script Query - Script Server: 1
Total: 1809

Poller Information
Interval 300
Type SPINE 0.8.8a Copyright 2002-2012 by The Cacti Group
Items Action[0]: 1659
Action[1]: 911
Action[2]: 14
Total: 2584
Concurrent Processes 1
Max Threads 4
PHP Servers 1
Script Timeout 25
Max OID 10
Last Run Statistics Time:78.5150 Method:spine Processes:1 Threads:4 Hosts:53 HostsPerProcess:53 DataSources:2584 RRDsProcessed:1224

PHP Information
PHP Version 5.3.10
PHP OS WINNT
PHP uname Windows NT USER-PC 6.1 build 7601 (Windows 7 Business Edition Service Pack 1) i586
PHP SNMP Installed
max_execution_time 30
memory_limit 128M
Attachments
nogoodgraph.jpg (54.94 KiB)
BSOD2600
Cacti Moderator
Posts: 12171
Joined: Sat May 08, 2004 12:44 pm
Location: USA

Re: Cacti graphs not populating on some poller cycles

Post by BSOD2600 »

Did you follow the debugging guide to further troubleshoot this issue yet?
mudmud
Posts: 21
Joined: Sun Oct 28, 2012 2:56 pm

Re: Cacti graphs not populating on some poller cycles

Post by mudmud »

Hi BSOD,

Thanks for your reply.

Exactly which guide are you talking about? I've followed a few but still couldn't get it fixed. Looking at my Cacti logs yesterday, I found that I'm getting the following error for many devices:

11/17/2012 04:46:17 PM - SPINE: Poller[0] Host[49] ERROR: Empty result [aaa.bbb.ccc.ddd]: 'C:\php\php.exe -q C:\Apache2\htdocs\cacti\scripts\mikrotik_wireless_interfaces.php blahblah aaa.bbb.ccc.ddd get ifInSignal 00:00:00:00:00:00'

But when I run this script in the command line I get the result. I am pretty sure this is the root cause of my problem. Do you have any idea how I can resolve it?

Thanks.

D.
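
One way to check whether the script really returns a value every time, rather than only on the one manual run, is to call it repeatedly and count the runs that print nothing. The following is only a minimal sketch: `run_once` and `count_empty` are hypothetical helper names, the 25-second limit simply mirrors the Script Timeout shown in the configs above, and the demo command is a stand-in that should be replaced with the exact command line from the log entry.

```python
import subprocess
import sys


def run_once(cmd, timeout_s=25):
    """Run cmd and return its stripped stdout; '' if it times out or prints nothing."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return ""
    return proc.stdout.strip()


def count_empty(cmd, attempts=10, timeout_s=25):
    """Return how many of `attempts` runs produced no output."""
    return sum(1 for _ in range(attempts) if run_once(cmd, timeout_s) == "")


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; replace it with the exact
    # command line copied from the Cacti log entry.
    demo_cmd = [sys.executable, "-c", "print(-71)"]
    print(count_empty(demo_cmd, attempts=3))
```

A non-zero count would suggest the script itself fails intermittently, independent of Spine.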
BSOD2600
Cacti Moderator
Posts: 12171
Joined: Sat May 08, 2004 12:44 pm
Location: USA

Re: Cacti graphs not populating on some poller cycles

Post by BSOD2600 »

Ah, so it's only the mikrotik graphs which are breaking? Does that mikrotik_wireless_interfaces.php have an snmp timeout value which possibly should get increased?

Looks like you already found the debug guide - http://docs.cacti.net/manual:088:4_help.2_debugging
mudmud
Posts: 21
Joined: Sun Oct 28, 2012 2:56 pm

Re: Cacti graphs not populating on some poller cycles

Post by mudmud »

Hi BSOD,

Yes I did go through that debugging guide.

I went through mikrotik_wireless_interfaces.php looking for an SNMP timeout value but couldn't find anything related to it. I have attached the file for your reference; please take a look at it yourself.

I was using the spine command line with verbosity=8, and I got the following results for a random device I am trying to monitor:


11/20/2012 10:35:51 PM - SPINE: Poller[0] DEBUG: In Poller, About to Start Polling of Host
11/20/2012 10:35:51 PM - SPINE: Poller[0] DEBUG: The Value of Active Threads is1
11/20/2012 10:35:51 PM - SPINE: Poller[0] DEVDBG: SQL:'SELECT id, hostname, snmp_community, snmp_version, snmp_username, snmp_password, snmp_auth_protocol, snmp_priv_passphrase, snmp_priv_protocol, snmp_context, snmp_port, snmp_timeout, max_oids, availability_method, ping_method, ping_port, ping_timeout, ping_retries,status, status_event_count, status_fail_date, status_rec_date, status_last_error, min_time, max_time, cur_time, avg_time, total_polls, failed_polls, availability  FROM host WHERE id=116'
11/20/2012 10:35:51 PM - SPINE: Poller[0] Host[116] DEBUG: Entering TCP Ping 11/20/2012 10:35:51 PM - SPINE: Poller[0] Host[116] DEBUG: TCP Host Alive, Try Count:1, Time:15.9998 ms
11/20/2012 10:35:51 PM - SPINE: Poller[0] Host[116] DEBUG: Entering SNMP Ping 11/20/2012 10:35:51 PM - SPINE: Poller[0] Host[116] PING Result: TCP: Host is Alive
11/20/2012 10:35:51 PM - SPINE: Poller[0] Host[116] SNMP Result: Host responded to SNMP
11/20/2012 10:35:51 PM - SPINE: Poller[0] DEVDBG: SQL:'UPDATE host SET status='3', status_event_count='0', status_fail_date='0000-00-00 00:00:00', status_rec_date='0000-00-00 00:00:00', status_last_error='', min_time='0.000000', max_time='116.999990', cur_time='15.499945', avg_time='10.407015', total_polls='3174', failed_polls='0', availability='100.0000' WHERE id='116''
11/20/2012 10:35:51 PM - SPINE: Poller[0] DEVDBG: SQL:'SELECT data_query_id, action, op, assert_value, arg1 FROM poller_reindex WHERE host_id=116'
11/20/2012 10:35:51 PM - SPINE: Poller[0] Host[116] TH[1] Host has no information for recache.
11/20/2012 10:35:51 PM - SPINE: Poller[0] DEVDBG: SQL:'SELECT snmp_port, count(snmp_port) FROM poller_item WHERE host_id=116 AND rrd_next_step < 0 GROUP BY snmp_port '
11/20/2012 10:35:51 PM - SPINE: Poller[0] DEVDBG: SQL:'SELECT action, hostname, snmp_community, snmp_version, snmp_username, snmp_password, rrd_name, rrd_path, arg1, arg2, arg3, local_data_id, rrd_num, snmp_port, snmp_timeout, snmp_auth_protocol, snmp_priv_passphrase, snmp_priv_protocol, snmp_context  FROM poller_item WHERE host_id=116 and rrd_next_step <=0 ORDER by snmp_port '
11/20/2012 10:35:51 PM - SPINE: Poller[0] Host[116] TH[1] NOTE: There are '5' Polling Items for this Host
11/20/2012 10:35:51 PM - SPINE: Poller[0] Host[116] TH[1] DS[12092] WARNING: SNMP timeout detected [1000 ms], ignoring host '10.0.121.29'
11/20/2012 10:35:51 PM - SPINE: Poller[0] Host[116] TH[1] DS[12092] SNMP: v1: 10.0.121.29, dsname: traffic_out, oid: .1.3.6.1.2.1.2.2.1.16.7, value: U
11/20/2012 10:35:51 PM - SPINE: Poller[0] Host[116] TH[1] DS[12092] WARNING: SNMP timeout detected [1000 ms], ignoring host '10.0.121.29'
11/20/2012 10:35:52 PM - SPINE: Poller[0] Host[116] TH[1] DS[12092] SNMP: v1: 10.0.121.29, dsname: traffic_in, oid: .1.3.6.1.2.1.2.2.1.10.7, value: U
11/20/2012 10:35:52 PM - SPINE: Poller[0] Host[116] TH[1] DS[12085] WARNING: SNMP timeout detected [1000 ms], ignoring host '10.0.121.29'
11/20/2012 10:35:52 PM - SPINE: Poller[0] Host[116] TH[1] DS[12085] SNMP: v1: 10.0.121.29, dsname: signal_in, oid: .1.3.6.1.4.1.14988.1.1.1.1.1.4.9, value: U
11/20/2012 10:35:52 PM - SPINE: Poller[0] Host[116] TH[1] DS[12082] WARNING: SNMP timeout detected [1000 ms], ignoring host '10.0.121.29'
11/20/2012 10:35:52 PM - SPINE: Poller[0] Host[116] TH[1] DS[12082] SNMP: v1: 10.0.121.29, dsname: signal_in, oid: .1.3.6.1.4.1.14988.1.1.1.1.1.4.6, value: U
11/20/2012 10:35:52 PM - SPINE: Poller[0] Host[116] TH[1] DS[12077] WARNING: SNMP timeout detected [1000 ms], ignoring host '10.0.121.29'
11/20/2012 10:35:52 PM - SPINE: Poller[0] Host[116] TH[1] DS[12077] SNMP: v1: 10.0.121.29, dsname: signal_in, oid: .1.3.6.1.4.1.14988.1.1.1.1.1.4.1, value: U
11/20/2012 10:35:52 PM - SPINE: Poller[0] DEVDBG: SQL:'INSERT INTO poller_output(local_data_id, rrd_name, time, output) VALUES (12092,'traffic_out','2012-11-20 22:35:51','U'),(12092,'traffic_in','2012-11-20 22:35:51','U'),(12085,'signal_in','2012-11-20 22:35:51','U'),(12082,'signal_in','2012-11-20 22:35:51','U'),(12077,'signal_in','2012-11-20 22:35:51','U') ON DUPLICATE KEY UPDATE output=VALUES(output)'
11/20/2012 10:35:52 PM - SPINE: Poller[0] DEVDBG: SQL:'UPDATE poller_item SET rrd_next_step=IF((rrd_next_step-60)>=0, (rrd_next_step-60), (rrd_step-60)) WHERE host_id=116'
11/20/2012 10:35:52 PM - SPINE: Poller[0] Host[116] TH[1] Total Time:  0.12 Seconds

From the above output we can see that the poller_output table is filled with 'U' values. What I don't get is how it works at one time but not at another (thus giving gaps in the graph). Could you tell me which table in the Cacti database holds information about the data collected during a polling cycle?

Another thing: for the above device I've set the SNMP timeout value to 1000 ms. Please correct me if I'm wrong, but the above device was polled in 0.12 s, right? So that means the poller did not wait 1000 ms to time out that host, is that correct? If this is the case, what other methods are available for me to change the SNMP timeout value?
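
To see how often each host hits this warning across many polling cycles, the Spine log lines can be tallied. This is a rough sketch that assumes the log keeps the exact line format shown in the output above; `count_timeouts` is a hypothetical helper name, and the two sample lines are copied from that output.

```python
import re
from collections import Counter

# Matches the WARNING lines as they appear in the Spine output above.
TIMEOUT_RE = re.compile(
    r"Host\[(?P<host>\d+)\] TH\[\d+\] DS\[(?P<ds>\d+)\] "
    r"WARNING: SNMP timeout detected \[(?P<ms>\d+) ms\]"
)


def count_timeouts(lines):
    """Return a Counter mapping host id -> number of SNMP timeout warnings."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[int(match.group("host"))] += 1
    return counts


sample = [
    "11/20/2012 10:35:51 PM - SPINE: Poller[0] Host[116] TH[1] DS[12092] "
    "WARNING: SNMP timeout detected [1000 ms], ignoring host '10.0.121.29'",
    "11/20/2012 10:35:51 PM - SPINE: Poller[0] Host[116] TH[1] DS[12092] "
    "SNMP: v1: 10.0.121.29, dsname: traffic_out, "
    "oid: .1.3.6.1.2.1.2.2.1.16.7, value: U",
]

if __name__ == "__main__":
    for host_id, hits in count_timeouts(sample).most_common():
        print(f"Host[{host_id}]: {hits} timeout warning(s)")
```

Feeding it the lines of a saved log file instead of `sample` would show whether the timeouts cluster on particular hosts.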

Cacti is driving me crazy !! :-?

D
Attachments
mikrotik_wireless_interfaces.txt (5.14 KiB)
BSOD2600
Cacti Moderator
Posts: 12171
Joined: Sat May 08, 2004 12:44 pm
Location: USA

Re: Cacti graphs not populating on some poller cycles

Post by BSOD2600 »

mudmud wrote: I went through the mikrotik_wireless_interfaces.php for any snmp time out value but I couldn't find anything related to it.

Not being familiar with that script, my question was really whether the host's SNMP timeout value is passed into it. I'll assume yes.

mudmud wrote: Another thing is for the above device I've set the snmp time out value to 1000ms.

It appears the device is taking longer than 1000 ms to respond to the SNMP query from Spine, which times out and therefore returns no data. Have you tried increasing the SNMP timeout to 3-5 seconds? Are you sure there isn't any anti-DoS protection getting triggered?

mudmud wrote: Please correct me if I'm wrong but the above device was polled in 0.12s right? so that means the poller did not wait 1000ms to time out that host, is that correct?

No, it means thread 1 took 0.12 s to perform all of those polling operations against Host[116].
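
For reference, the arithmetic behind the question can be written out as follows. It is only an illustration using the numbers from the log above (5 polling items, a 1000 ms timeout, 0.12 s total); `min_sequential_wait_s` is a hypothetical helper, and the comparison does not by itself explain why the reported time is so short.

```python
def min_sequential_wait_s(items, timeout_ms):
    """Lower bound on elapsed time if every item waits out the full timeout in turn."""
    return items * timeout_ms / 1000.0


items = 5            # "There are '5' Polling Items for this Host"
timeout_ms = 1000    # "SNMP timeout detected [1000 ms]"
reported_s = 0.12    # "Total Time:  0.12 Seconds"

lower_bound_s = min_sequential_wait_s(items, timeout_ms)
print(f"sequential lower bound: {lower_bound_s:.2f} s, reported: {reported_s:.2f} s")
```

Since the reported total is far below that bound, the five items evidently did not each wait a full timeout one after another within the measured interval.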
mudmud
Posts: 21
Joined: Sun Oct 28, 2012 2:56 pm

Re: Cacti graphs not populating on some poller cycles

Post by mudmud »

mudmud wrote: Please correct me if I'm wrong but the above device was polled in 0.12s right? so that means the poller did not wait 1000ms to time out that host, is that correct?

BSOD2600 wrote: No, it means thread 1 took 0.12s to perform all of those polling operations against Host[116].

So the time it 'waits' for the SNMP request to time out is not counted as part of the polling against the same host?

BSOD, do you have any explanation for this:

From the above output we can see that the poller_output table is filled with 'U' values. What I dont get is, how it is working at one time but not the other. (thus giving gaps in the graph). Could you tell me which table in the cacti database holds information about the data collected during a polling cycle?

Another thing: I increased the SNMP timeout value to 5 seconds, but it is still timing out. I am going through a firewall, but I have already added exceptions to allow SNMP traffic. Do you think it may somehow be dropping the packets? I am sure the device is within reach, as the SNMP community is configured there and an ICMP ping takes less than 3 ms.

Thanks in advance for your help.
BSOD2600
Cacti Moderator
Posts: 12171
Joined: Sat May 08, 2004 12:44 pm
Location: USA

Re: Cacti graphs not populating on some poller cycles

Post by BSOD2600 »

mudmud wrote: So the time it 'waits' for the snmp to time out is not counted as part of the polling against the same host?

I'm not really sure. Spine could be parallelizing the requests, so overall it only took 120 ms.

mudmud wrote: BSOD, do you have any explaination for :
From the above output we can see that the poller_output table is filled with 'U' values. What I dont get is, how it is working at one time but not the other. (thus giving gaps in the graph). Could you tell me which table in the cacti database holds information about the data collected during a polling cycle?

See http://docs.cacti.net/manual:088:99_reference.db_design and look in the poller_cache.

mudmud wrote: Another thing, I increased the snmp time out value to 5 seconds, but still it is getting timed out. There is a firewall that I am going through but I have already added exceptions to allow SNMP traffic. Do you think that it may be somehow dropping the packets? Because I am sure the device is within reach as SNMP community is configured there and ICMP ping takes less than 3ms.

You could run Wireshark on the Cacti server and/or a packet capture on the firewall to really check what is going on with the SNMP traffic. If you're only having this problem with a specific class of device, I'd be inclined to believe it's the device and not Cacti.
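
While a capture is running, it can help to generate SNMP requests outside of Cacti and count how many fail. Below is a minimal sketch, assuming the Net-SNMP `snmpget` command is on the PATH (Net-SNMP 5.6.1.1 is listed in the configs above); `build_snmpget_cmd` and `probe` are hypothetical helper names, and the host, community and OID in the usage line are placeholders taken from the log output in this thread.

```python
import subprocess
import time


def build_snmpget_cmd(host, community, oid, timeout_s=5, retries=0):
    """Build an SNMP v1 snmpget command with an explicit timeout and retry count."""
    return ["snmpget", "-v", "1", "-c", community,
            "-t", str(timeout_s), "-r", str(retries), host, oid]


def probe(host, community, oid, attempts=20, pause_s=1.0, runner=subprocess.run):
    """Run snmpget `attempts` times; return how many runs exited non-zero."""
    failures = 0
    for _ in range(attempts):
        result = runner(build_snmpget_cmd(host, community, oid),
                        capture_output=True, text=True)
        if result.returncode != 0:
            failures += 1
        time.sleep(pause_s)
    return failures


if __name__ == "__main__":
    # Placeholder values; substitute the real community string and device.
    print(probe("10.0.121.29", "blahblah", ".1.3.6.1.2.1.2.2.1.10.7", attempts=5))
```

If failures show up here as well, and the capture shows requests leaving without replies coming back, that would point at the device or the path to it rather than at Cacti.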
mudmud
Posts: 21
Joined: Sun Oct 28, 2012 2:56 pm

Re: Cacti graphs not populating on some poller cycles

Post by mudmud »

Thanks, BSOD. I will monitor the traffic from the Cacti server and get back to the forum.