Some graphs disappeared
Hello all,
I have been using Cacti for a year or so; the version is 0.8.7b. On 29 September I found that all graphs for one host had disappeared.
To analyse this I did the following:
- snmpwalk -c public -v 1 host
All the information scrolled past on my screen.
- rrdtool dump /var/www/cacti/rra/host_traffic_in.rrd
There are no new entries since 29 September.
- Compared the entries under Data Sources -> Host - Traffic; the correct traffic_in.rrd file is entered there.
So my question is: where should I investigate further? The data seems to come in, but it does not seem to be inserted into the rrd file.
TIA
Muffinman
- TheWitness
- Developer
- Posts: 17007
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
- Contact:
Are there poller cache entries for the graphs? What happens when you repopulate the poller cache?
TheWitness
True understanding begins only when we realize how little we truly understand...
Life is an adventure, let yours begin with Cacti!
Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages
For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?
TheWitness wrote: Are there poller cache entries for the graphs?
Yes, here they are:
server011 - Advanced Ping Script Server: /var/www/cacti/scripts/ss_fping.php ss_fping 20 UDP 80
RRD: /var/www/cacti/rra/server011_loss_6523.rrd
server011 - CPU Utilization - CPU0 Script Server: /var/www/cacti/scripts/ss_host_cpu.php ss_host_cpu server011.mydomain.com 256 2:161:5000:public:::MD5::DES: get usage 0
RRD: /var/www/cacti/rra/server011_cpu_6527.rrd
server011 - Free Space - /dev/cciss/c0d0 Script: perl /var/www/cacti/scripts/query_unix_partitions.pl get available /dev/cciss/c0d0p1
RRD: /var/www/cacti/rra/server011_hdd_free_6531.rrd
server011 - Free Space - /dev/cciss/c0d0 Script: perl /var/www/cacti/scripts/query_unix_partitions.pl get used /dev/cciss/c0d0p1
RRD: /var/www/cacti/rra/server011_hdd_free_6531.rrd
server011 - Free Space - /dev/cciss/c0d0 Script: perl /var/www/cacti/scripts/query_unix_partitions.pl get used /dev/cciss/c0d0p2
RRD: /var/www/cacti/rra/server011_hdd_free_7073.rrd
server011 - Free Space - /dev/cciss/c0d0 Script: perl /var/www/cacti/scripts/query_unix_partitions.pl get available /dev/cciss/c0d0p2
RRD: /var/www/cacti/rra/server011_hdd_free_7073.rrd
server011 - Free Space - |query_dskDevice| Script: perl /var/www/cacti/scripts/query_unix_partitions.pl get available /dev/cciss/c0d0p3
RRD: /var/www/cacti/rra/server011_hdd_free_6532.rrd
server011 - Free Space - |query_dskDevice| Script: perl /var/www/cacti/scripts/query_unix_partitions.pl get used /dev/cciss/c0d0p3
RRD: /var/www/cacti/rra/server011_hdd_free_6532.rrd
server011 - Load Average Script: perl /var/www/cacti/scripts/loadavg_multi.pl
RRD: /var/www/cacti/rra/server011_load_1min_6524.rrd
server011 - Logged in Users Script: perl /var/www/cacti/scripts/unix_users.pl
RRD: /var/www/cacti/rra/server011_users_6525.rrd
server011 - Processes Script: perl /var/www/cacti/scripts/unix_processes.pl
RRD: /var/www/cacti/rra/server011_proc_6526.rrd
server011 - Traffic - 10.133.253.71 - eth1 SNMP Version: 2, Community: public, OID: .1.3.6.1.2.1.2.2.1.10.3
RRD: /var/www/cacti/rra/server011_traffic_in_6529.rrd
server011 - Traffic - 10.133.253.71 - eth1 SNMP Version: 2, Community: public, OID: .1.3.6.1.2.1.2.2.1.16.3
RRD: /var/www/cacti/rra/server011_traffic_in_6529.rrd
server011 - Traffic - 10.133.253.97 - tap0 SNMP Version: 2, Community: public, OID: .1.3.6.1.2.1.2.2.1.10.5
RRD: /var/www/cacti/rra/server011_traffic_in_6530.rrd
server011 - Traffic - 10.133.253.97 - tap0 SNMP Version: 2, Community: public, OID: .1.3.6.1.2.1.2.2.1.16.5
RRD: /var/www/cacti/rra/server011_traffic_in_6530.rrd
server011 - Traffic - 192.168.1.145 - eth0 SNMP Version: 2, Community: public, OID: .1.3.6.1.2.1.2.2.1.10.2
RRD: /var/www/cacti/rra/server011_traffic_in_6528.rrd
server011 - Traffic - 192.168.1.145 - eth0 SNMP Version: 2, Community: public, OID: .1.3.6.1.2.1.2.2.1.16.2
RRD: /var/www/cacti/rra/server011_traffic_in_6528.rrd
TheWitness wrote: What happens when you repopulate the poller cache?
If you mean "Rebuild Poller Cache", I have already done that, because it helped someone else in the forum before. Unfortunately, it did not help me.
Thanks for replying.
- TheWitness
- Developer
It could have broken if the time on your server moved ahead. The way to check whether this is the case is to run "php -q poller.php --force" from the command line and look for the "OK" result messages.
TheWitness
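The clock theory can also be checked directly against an RRD file. RRDtool only accepts an update whose timestamp is later than the file's last update, so a file whose last update lies in the future (for example after the clock jumped ahead and was corrected) stays frozen until real time catches up. The following is a minimal sketch, assuming `rrdtool last <file>` is available on the Cacti server. The function names and the two-step staleness threshold are illustrative choices, not part of Cacti.

```python
import subprocess


def classify_last_update(last_update, now, step=300):
    """Classify an RRD's last-update timestamp relative to the current time.

    "future": last update is ahead of the clock, so new updates are rejected
    "stale":  no update for more than two polling steps
    "ok":     updated within the last two steps
    """
    if last_update > now:
        return "future"
    if now - last_update > 2 * step:
        return "stale"
    return "ok"


def rrd_last_update(path):
    """Return the last-update UNIX timestamp printed by `rrdtool last`."""
    result = subprocess.run(
        ["rrdtool", "last", path], capture_output=True, text=True, check=True
    )
    return int(result.stdout.strip())


# Usage on the Cacti server (path taken from the poller cache listing above):
#   import time
#   ts = rrd_last_update("/var/www/cacti/rra/server011_traffic_in_6529.rrd")
#   print(classify_last_update(ts, int(time.time())))
```

A result of "future" would support the clock explanation; "stale" only confirms that updates stopped and says nothing about why.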
I ran "php -q poller.php --force".
The output complained that there was not enough memory, so I edited /etc/php.ini and raised the memory limit for scripts from 8M to 32M.
Then I ran the command again (the graphs did not come back). I am a bit worried about the partial results in the output, but that is not my main point. The real problem seems to be the time Spine needs to process the SNMP results. The output looks like this:
Code:
OK u:0.03 s:0.06 r:13.25
10/08/2008 09:05:14 AM - SPINE: Poller[0] Host[42] DS[732] SS[1] WARNING: Result from SERVER not valid. Partial Result: ...
10/08/2008 09:05:14 AM - SPINE: Poller[0] Host[42] DS[733] SS[0] WARNING: Result from SERVER not valid. Partial Result: ...
10/08/2008 09:05:14 AM - SPINE: Poller[0] Host[42] DS[734] WARNING: Result from SNMP not valid. Partial Result: ...
10/08/2008 09:05:14 AM - SPINE: Poller[0] Host[42] DS[734] WARNING: Result from SNMP not valid. Partial Result: ...
OK u:0.03 s:0.06 r:14.28
OK u:0.03 s:0.06 r:14.28
(and so on and so forth)
OK u:0.08 s:0.14 r:45.17
OK u:0.08 s:0.14 r:45.17
10/08/2008 09:05:58 AM - SPINE: Poller[0] Host[78] DS[1051] WARNING: Result from SNMP not valid. Partial Result: ...
10/08/2008 09:05:58 AM - SPINE: Poller[0] Host[78] DS[1051] WARNING: Result from SNMP not valid. Partial Result: ...
10/08/2008 09:05:58 AM - SPINE: Poller[0] Host[57] DS[911] SS[5] WARNING: Result from SERVER not valid. Partial Result: ...
10/08/2008 09:05:58 AM - SPINE: Poller[0] Host[57] DS[912] WARNING: Result from SNMP not valid. Partial Result: ...
10/08/2008 09:05:58 AM - SPINE: Poller[0] Host[57] DS[912] WARNING: Result from SNMP not valid. Partial Result: ...
OK u:0.08 s:0.14 r:58.26
OK u:0.08 s:0.14 r:58.26
(and so on and so forth)
OK u:0.09 s:0.15 r:59.28
10/08/2008 09:09:54 AM - SPINE: Poller[0] ERROR: Spine Timed Out While Processing Hosts Internal
10/08/2008 09:09:54 AM - SYSTEM STATS: Time:294.6504 Method:spine Processes:2 Threads:25 Hosts:234 HostsPerProcess:117 DataSources:6210 RRDsProcessed:1957
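The SYSTEM STATS line at the end of this output can be read numerically. Here is a small sketch that parses it and compares the runtime with the polling interval. The 300-second interval is taken from the "Every 5 Minutes" setting in this post; the function names are hypothetical. Note that DataSources and RRDsProcessed are not strictly comparable, because several data source items can share one RRD file (as the traffic_in files in the poller cache listing do), so the gap is only a rough indicator.

```python
import re

# Final line of the poller run quoted above.
STATS_LINE = (
    "10/08/2008 09:09:54 AM - SYSTEM STATS: Time:294.6504 Method:spine "
    "Processes:2 Threads:25 Hosts:234 HostsPerProcess:117 "
    "DataSources:6210 RRDsProcessed:1957"
)


def parse_system_stats(line):
    """Return the key:value pairs after 'SYSTEM STATS:' as a dict of strings."""
    _, _, tail = line.partition("SYSTEM STATS:")
    return dict(re.findall(r"(\w+):(\S+)", tail))


def summarize(stats, interval=300):
    """Relate the runtime to the polling interval and compare the counters."""
    runtime = float(stats["Time"])
    processed = int(stats["RRDsProcessed"])
    total = int(stats["DataSources"])
    return {
        "interval_used": runtime / interval,
        "processed_ratio": processed / total,
        "gap": total - processed,
    }


stats = parse_system_stats(STATS_LINE)
summary = summarize(stats)
print(f"{summary['interval_used']:.0%} of the polling interval used, "
      f"gap between data sources and processed RRDs: {summary['gap']}")
```

For the line above, the run used about 98% of the five-minute interval, which fits the "Spine Timed Out" error logged one line earlier.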
Here are my settings from the Poller section:
GENERAL
Enabled: checked
Poller Type: Spine
Poller Interval: Every 5 Minutes
Cron Interval: Every 5 Minutes
Max Concurrent Poller Processes: 2
SPINE SPECIFIC EXECUTION PARAMETERS
Max Threads per Process: 25
Number of PHP Script Servers: 10
Script and Script Server Timeout Value: 500
Max SNMP OIDs per Get Request: 10
HOST AVAILABILITY SETTINGS
Downed Host Detection: Ping and SNMP
Ping Type: UDP Ping
Ping Port: 23
Ping Timeout Value: 5000
Ping Retry Count: 5
HOST UP/DOWN SETTINGS
Failure Count: 2
Recovery Count: 3
Hello,
Here I will document what I did, step by step:
1. Check Cacti Log File
Yes, there are SNMP timeouts detected. BUT: they do not always come from the same host, they never come from the host I am having trouble with, and they do not occur regularly. (I assume they occur when the host in question is too busy to answer.)
For example:
Code:
grep "WARNING: SNMP timeout detected" /var/www/cacti/log/cacti.log.old
10/05/2008 02:36:18 AM - SPINE: Poller[0] Host[6] DS[59] WARNING: SNMP timeout detected [500 ms], ignoring host 'host1.mydomain'
10/05/2008 03:55:17 AM - SPINE: Poller[0] Host[62] DS[945] WARNING: SNMP timeout detected [500 ms], ignoring host 'host2.mydomain'
10/05/2008 03:55:17 AM - SPINE: Poller[0] Host[62] DS[945] WARNING: SNMP timeout detected [500 ms], ignoring host 'host2.mydomain'
10/05/2008 03:55:46 AM - SPINE: Poller[0] Host[109] DS[1507] WARNING: SNMP timeout detected [5000 ms], ignoring host 'host3.mydomain'
2. Check Basic Data Gathering
I have no special or custom scripts, so I tested only with snmpwalk and snmpget and got correct answers.
For example:
Code:
snmpget -c community-string -v2c trouble-making-host.mydomain .1.3.6.1.2.1.2.2.1.16.5
IF-MIB::ifOutOctets.5 = Counter32: 945930899
snmpwalk -c community-string -v2c
SNMPv2-MIB::sysDescr.0 = STRING: Linux trouble-making-host 2.4.34-grsec #1 SMP Wed Feb 28 20:51:52 EST 2007 i686
SNMPv2-MIB::sysObjectID.0 = OID: NET-SNMP-MIB::netSnmpAgentOIDs.10
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (905373152) 104 days, 18:55:31.52
...
3. Check cacti's poller
I think this is the crucial point.
I ran
Code:
/usr/bin/spine --verbosity=5 256 256
(because 256 is the ID of the host I am looking at). The output is fine (see below)! And the best thing: I can create correct output for the graphs (that is, when I run spine by hand for my trouble host, I do get lines for that time in the graphs, and of course in the rrd).
I attached the debug output from spine to this posting.
4. Check MySQL updating
I skipped this part, as you wrote, and went on to the rrd file updating.
5. Check rrd file updating
Another interesting point is that there are "rrdtool update --template" lines in the debug log file, but none of those lines contains trouble-making-host.mydomain. ;-(
6. Check rrd file ownership
The ownership of all rrd files is the same: user cacti and group cacti everywhere. The cacti user is also the one I used for the tests with spine, snmpwalk and so on.
7. Check rrd file numbers
I checked the numbers in the rrd file and they look like this:
Code:
ds[cpu].type = "GAUGE"
ds[cpu].minimal_heartbeat = 600
ds[cpu].min = 0.0000000000e+00
ds[cpu].max = 1.0000000000e+02
ds[cpu].last_ds = "5"
ds[cpu].value = NaN
rra[0].cf = "AVERAGE"
rra[0].rows = 600
rra[0].pdp_per_row = 1
rra[0].xff = 5.0000000000e-01
rra[0].cdp_prep[0].value = NaN
rra[0].cdp_prep[0].unknown_datapoints = 0
rra[1].cf = "AVERAGE"
rra[1].rows = 700
rra[1].pdp_per_row = 6
rra[1].xff = 5.0000000000e-01
rra[1].cdp_prep[0].value = NaN
rra[1].cdp_prep[0].unknown_datapoints = 3
So no problem with minimum and maximum, eh?
8. Check rrdtool graph statement
There is no problem with creating graphs, because the graph displays just what is in the rrd file: NaN.
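The check in step 7 can be made explicit. RRDtool stores a sample as unknown (NaN) when it falls outside the data source's min/max range or when more than minimal_heartbeat seconds have passed since the previous update. The sketch below parses the `rrdtool info` lines quoted in step 7 and reports which of these conditions a sample would hit. The function names are hypothetical, and only the ds[cpu] lines from this post are used as input.

```python
# ds[cpu] lines copied from the rrdtool info output in step 7.
INFO_SAMPLE = """\
ds[cpu].type = "GAUGE"
ds[cpu].minimal_heartbeat = 600
ds[cpu].min = 0.0000000000e+00
ds[cpu].max = 1.0000000000e+02
ds[cpu].last_ds = "5"
ds[cpu].value = NaN
"""


def parse_rrd_info(text):
    """Parse `key = value` lines into a dict, dropping surrounding quotes."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:
            info[key.strip()] = value.strip().strip('"')
    return info


def check_sample(info, ds, value, seconds_since_last_update):
    """Return the reasons a sample for `ds` would be stored as unknown."""
    reasons = []
    if value < float(info[f"ds[{ds}].min"]):
        reasons.append("below min")
    if value > float(info[f"ds[{ds}].max"]):
        reasons.append("above max")
    if seconds_since_last_update > int(info[f"ds[{ds}].minimal_heartbeat"]):
        reasons.append("heartbeat exceeded")
    return reasons


info = parse_rrd_info(INFO_SAMPLE)
# The last raw value "5" is inside 0..100, so min/max is not the culprit;
# a gap longer than 600 seconds between updates would be.
print(check_sample(info, "cpu", float(info["ds[cpu].last_ds"]), 300))
print(check_sample(info, "cpu", float(info["ds[cpu].last_ds"]), 900))
```

With a 5-minute poller interval, one missed or late poll already brings the gap to 600 seconds, the heartbeat limit shown above, so timed-out poller runs are a plausible source of NaN even when the values themselves are in range.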
Thank you for your help and have a nice weekend!
Best greetings from Hamburg.
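For the log check in step 1, counting the timeout warnings per host makes it easier to see whether the troubled host (ID 256) ever appears. This is a rough sketch that assumes the log line format shown in the grep output above; the regular expression and function name are illustrative and have not been checked against other Cacti versions.

```python
import re
from collections import Counter

# Matches the SPINE timeout warnings quoted in step 1.
TIMEOUT_RE = re.compile(
    r"Host\[(\d+)\].*WARNING: SNMP timeout detected \[(\d+) ms\], "
    r"ignoring host '([^']+)'"
)

# The four lines from the grep output in step 1.
LOG_SAMPLE = """\
10/05/2008 02:36:18 AM - SPINE: Poller[0] Host[6] DS[59] WARNING: SNMP timeout detected [500 ms], ignoring host 'host1.mydomain'
10/05/2008 03:55:17 AM - SPINE: Poller[0] Host[62] DS[945] WARNING: SNMP timeout detected [500 ms], ignoring host 'host2.mydomain'
10/05/2008 03:55:17 AM - SPINE: Poller[0] Host[62] DS[945] WARNING: SNMP timeout detected [500 ms], ignoring host 'host2.mydomain'
10/05/2008 03:55:46 AM - SPINE: Poller[0] Host[109] DS[1507] WARNING: SNMP timeout detected [5000 ms], ignoring host 'host3.mydomain'
"""


def count_timeouts(lines):
    """Count SNMP timeout warnings per (host id, hostname)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            host_id, _timeout_ms, hostname = match.groups()
            counts[(int(host_id), hostname)] += 1
    return counts


counts = count_timeouts(LOG_SAMPLE.splitlines())
for (host_id, hostname), n in sorted(counts.items()):
    print(f"Host[{host_id}] {hostname}: {n}")
```

On a real system the same function could be fed the lines of /var/www/cacti/log/cacti.log.old instead of the sample.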
- Attachments
- cacti-debug-output.txt (19.72 KiB)
- The cacti debug output is attached here...
- gandalf
- Developer
- Posts: 22383
- Joined: Thu Dec 02, 2004 2:46 am
- Location: Muenster, Germany
- Contact:
Thanks for posting this. Indeed, polling seems to be the issue. From the spine output, I see the following:
- You seem to use scripts dedicated to localhost only for polling Host[256]. Is this the cacti localhost? If not, please use ucd/net templates instead.
- Much of the data is returned, but some is not. Please indicate again which one is troubling you (perhaps I skipped your statement concerning this).
Reinhard
Good morning!
gandalf wrote: You seem to use scripts dedicated to localhost only for polling Host[256]. Is this the cacti localhost?
No, Host[256] is not localhost. It is the troublemaking host. Localhost has ID 1.
gandalf wrote: If not, please use ucd/net templates instead.
I do not completely understand this: should I replace some of the templates I use with ucd/net templates? But for some topics there are no ucd/net equivalents. ;-(
gandalf wrote: Much of the data is returned, but some is not. Please indicate again which one is troubling you (perhaps I skipped your statement concerning this).
The poller log I showed (cacti-debug-output.txt in my last posting) was only from the troublemaking client, because I ran "/usr/bin/spine --verbosity=5 256 256". Why some data is not returned is my big question.
Thanks for listening!
Greetings from Hamburg
- gandalf
- Developer
Muffinman wrote: This I do not understand completely: I shall exchange some templates I use by ucd/net templates? But for some topics there are no ucd/net equivalents. ;-(
Those "Localhost" templates fetch the data from the local host, not from Host[256]. You will not want this. For many functions, there are ucd/net replacements. If not everything is covered, please search the Scripts and Templates forum for replacements.
Reinhard