linux bandwidth monitoring

allenelson
Posts: 18
Joined: Sun Jan 28, 2007 11:00 pm

Post by allenelson »

Hello. I am pretty new to SNMP and monitoring, but I've been reading everything I can find in my spare time. I'm having quite a bit of trouble getting the bandwidth to log correctly in Cacti. Everything else has been successful: memory, CPU, load, disks, users.

I've noticed something funny in one of the debugs.

Data Query Debug Information

+ Running data query [1].
+ Found type = '3' [snmp query].
+ Found data query XML file at '/var/www/vhosts/ipfonehome.com/subdomains/cacti/httpdocs/resource/snmp_queries/interface.xml'
+ XML file parsed ok.
+ Executing SNMP walk for list of indexes @ '.1.3.6.1.2.1.2.2.1.1'
+ No SNMP data returned
+ Found data query XML file at '/var/www/vhosts/ipfonehome.com/subdomains/cacti/httpdocs/resource/snmp_queries/interface.xml'
+ Found data query XML file at '/var/www/vhosts/ipfonehome.com/subdomains/cacti/httpdocs/resource/snmp_queries/interface.xml'
+ Found data query XML file at '/var/www/vhosts/ipfonehome.com/subdomains/cacti/httpdocs/resource/snmp_queries/interface.xml'
+ Found data query XML file at '/var/www/vhosts/ipfonehome.com/subdomains/cacti/httpdocs/resource/snmp_queries/interface.xml'
+ Found data query XML file at '/var/www/vhosts/ipfonehome.com/subdomains/cacti/httpdocs/resource/snmp_queries/interface.xml'


And this is the output of snmpwalk:

[root@ipfonehome www]# snmpwalk -v 1 -c public 66.109.25.108 .1.3.6.1.2.1.2.2.1.1
IF-MIB::ifIndex.1 = INTEGER: 1
IF-MIB::ifIndex.2 = INTEGER: 2
IF-MIB::ifIndex.3 = INTEGER: 3
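
To compare a manual walk like this against what Cacti's data query reports, a minimal sketch in Python that pulls the interface indexes out of the snmpwalk text may help. The helper name `parse_if_indexes` is made up for illustration and is not part of Cacti or net-snmp; it only understands the exact output format shown above.

```python
import re

# Sample text copied from the snmpwalk output above.
WALK_OUTPUT = """\
IF-MIB::ifIndex.1 = INTEGER: 1
IF-MIB::ifIndex.2 = INTEGER: 2
IF-MIB::ifIndex.3 = INTEGER: 3
"""

# Matches lines such as "IF-MIB::ifIndex.2 = INTEGER: 2".
LINE_RE = re.compile(r"^IF-MIB::ifIndex\.(\d+) = INTEGER: (\d+)$")


def parse_if_indexes(text):
    """Return the interface indexes found in snmpwalk output (hypothetical helper)."""
    indexes = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            indexes.append(int(match.group(2)))
    return indexes


if_indexes = parse_if_indexes(WALK_OUTPUT)
print(if_indexes)
```

If the list is non-empty when run by hand but Cacti still logs "No SNMP data returned", the agent itself is answering and the difference is more likely on the Cacti/PHP side.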

I've also noticed that the community is doing funny things. If I set the community name to public in the Cacti config for the device, nothing will poll, but it doesn't give me the red SNMP ERROR message; it displays the address, system info and uptime correctly. If I leave it blank, everything works except the bandwidth, but it gives me the red SNMP ERROR message. The snmpd.conf has a basic config:

rocommunity public
includeAllDisks /dev/md
proc asterisk
disk /
disk /boot

I've been trying to add the interface to the file manually but cannot figure out how. Everything I enter causes an error when I restart snmpd. Could someone help me out with the syntax?

interface eth0 ethernetCsmacd 10000000

is what I thought was correct. Any help would be greatly appreciated. Or if there is another howto I missed in my searching that someone could point me to, I'd also really appreciate that. :)

PS. Are there any snmpd.conf manuals in English? I'd rather manually enter MIBs into it but am having a rough start. Thanks in advance everyone!
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

Do you have php-snmp installed?
Reinhard
allenelson
Posts: 18
Joined: Sun Jan 28, 2007 11:00 pm

Post by allenelson »

yes, php-snmp and the latest versions of everything
allenelson
Posts: 18
Joined: Sun Jan 28, 2007 11:00 pm

Post by allenelson »

for whatever it's worth, I log the interface successfully with MRTG, but the setup is a bit different. so I do know the daemon is working correctly and answering polls.
allenelson
Posts: 18
Joined: Sun Jan 28, 2007 11:00 pm

Post by allenelson »

bump
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

What distro are you using? Was Cacti installed from the distro, an rpm or source? Your error is very strange, so we have to dig a bit deeper ...
Reinhard
allenelson
Posts: 18
Joined: Sun Jan 28, 2007 11:00 pm

Post by allenelson »

I've installed everything from source. And yeah, it's a weird error; I use MRTG for the bandwidth with no problems at all. These are the versions of everything:

mysql 4.1.20
net-snmp 5.1.2
php-snmp-4.3.9
rrdtool 1.2.18
cacti 0.8.6j

tried with a minimal snmpd.conf

rocommunity public

the MRTG bandwidth config is as follows

PageTop[dns1.ipfonehome.com]: <H1>Traffic Analysis for dns1.ipfonehome.com</H1>
PNGTitle[dns1.ipfonehome.com]: dns1.ipfonehome.com traffic
Target[dns1.ipfonehome.com]: 2:public@66.109.25.108
Title[dns1.ipfonehome.com]: dns1 statistics
Maxbytes[dns1.ipfonehome.com]: 1250000
WithPeak[dns1.ipfonehome.com]: wmyd
YLegend[dns1.ipfonehome.com]: Bits per Second
ShortLegend[dns1.ipfonehome.com]: b/s
Legend1[dns1.ipfonehome.com]: Incoming Traffic in Bits per Second
Legend2[dns1.ipfonehome.com]: Outgoing Traffic in Bits per Second
Legend3[dns1.ipfonehome.com]: Maximal 5 Minute Incoming Traffic
Legend4[dns1.ipfonehome.com]: Maximal 5 Minute Outgoing Traffic
LegendI[dns1.ipfonehome.com]: &nbsp;Inbound:
LegendO[dns1.ipfonehome.com]: &nbsp;Outbound:
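
As a quick sanity check on the units: MRTG's MaxBytes is given in bytes per second, while the speed in the attempted snmpd.conf `interface` line is in bits per second. A small sketch (the constant and function names are made up for illustration) shows the two figures describe the same 10 Mbit/s link:

```python
# Values copied from the thread; the names are illustrative only.
MAX_BYTES_PER_SEC = 1250000      # MaxBytes in the MRTG target above
INTERFACE_SPEED_BPS = 10000000   # speed in the attempted snmpd.conf interface line


def bytes_to_bits(byte_count):
    """Convert a byte count (or bytes/second rate) to bits."""
    return byte_count * 8


max_bits_per_sec = bytes_to_bits(MAX_BYTES_PER_SEC)
print(max_bits_per_sec, max_bits_per_sec == INTERFACE_SPEED_BPS)
```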

nothing in the cacti logs..
no errors in the snmp logs..
no errors in the web logs..

A few things are weird. I don't have any permission issues either, however, I have to manually create graphs by pasting the rrdtool create code into the terminal. I don't know why but it won't create them. Even after running the polling script, and checking again, they're still not there. I can live with this though.

And like I said before, with the public community set in the device config it won't poll, although it shows the uptime. If I leave it blank, it gives the SNMP error but it will poll correctly.
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

allenelson wrote: And like i said before, using the public community tag in the device config it wont poll, however it shows the uptime. If i leave it blank, it gives the SNMP error but it will poll correctly.
This is the bit we should tackle first. I have never heard of it before. Would it be possible to perform the following:
- start ethereal/wireshark or tcpdump if you like
- enter the valid SNMP community string for your host and save
- stop the trace
- post your findings (if you are not familiar with reading traces, please post the trace or PM it)
Reinhard
allenelson
Posts: 18
Joined: Sun Jan 28, 2007 11:00 pm

Post by allenelson »

not a problem. i'll trace it tomorrow though as i'm off to bed right now. i added a Generic OID graph to the device, and entered in the exact OID for the interface (.1.3.6.1.2.1.2.2.1.3.2).

[root@ipfonehome cfg]# snmpwalk -v 1 -c public localhost .1.3.6.1.2.1.2.2.1.3.2

IF-MIB::ifType.2 = INTEGER: ethernetCsmacd(6)

let's see if it works magically while I'm asleep.. you do have to enter the leading dot (.) before the OID in Cacti, correct?

i'll post the findings on the trace tomorrow. thanks for the help.
allenelson
Posts: 18
Joined: Sun Jan 28, 2007 11:00 pm

Post by allenelson »

actually strike that, I'm really tired and didn't realize what I was typing:

1.3.6.1.2.1.2.2.1.10.1

[root@ipfonehome log]# snmpget -c public -v1 localhost 1.3.6.1.2.1.2.2.1.10.1
IF-MIB::ifInOctets.1 = Counter32: 7737309

I think that should be the inbound (ingress) traffic
allenelson
Posts: 18
Joined: Sun Jan 28, 2007 11:00 pm

Post by allenelson »

ok i've verified with the community set to public, it does send/receive the data

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 96 bytes
20:56:39.810024 IP ipfonehome.com.32996 > ipfonehome.com.snmp: GetRequest(28) system.sysDescr.0
20:56:39.810614 IP ipfonehome.com.snmp > ipfonehome.com.32996: GetResponse(39) system.sysDescr.0="Linux ipfon"
20:56:39.812374 IP ipfonehome.com.32996 > ipfonehome.com.snmp: GetRequest(28) system.sysUpTime.0
20:56:39.812532 IP ipfonehome.com.snmp > ipfonehome.com.32996: GetResponse(31) system.sysUpTime.0=7962605
20:56:39.812981 IP ipfonehome.com.32996 > ipfonehome.com.snmp: GetRequest(28) system.sysName.0
20:56:39.813098 IP ipfonehome.com.snmp > ipfonehome.com.32996: GetResponse(39) system.sysName.0="ipfonehome."
20:56:39.813514 IP ipfonehome.com.32996 > ipfonehome.com.snmp: GetRequest(28) system.sysLocation.0
20:56:39.813623 IP ipfonehome.com.snmp > ipfonehome.com.32996: GetResponse(39) system.sysLocation.0="Englewood C"
20:56:39.814069 IP ipfonehome.com.32996 > ipfonehome.com.snmp: GetRequest(28) system.sysContact.0
20:56:39.814178 IP ipfonehome.com.snmp > ipfonehome.com.32996: GetResponse(39) system.sysContact.0="admin@ipfon"

it sends/receives the snmp data for the systeminfo

however when i run the poller, it just hangs

when i remove public from the community, i see the traffic and it is polling / graphing properly.

[root@ipfonehome ~]# date
Fri Feb 2 21:00:09 EST 2007

02/02/2007 09:00:03 PM - SYSTEM STATS: Time:2.0933 Method:cmd.php Processes:1 Threads:N/A Hosts:2 HostsPerProcess:2 DataSources:9 RRDsProcessed:7
02/02/2007 09:00:12 PM - SYSTEM STATS: Time:1.0771 Method:cmd.php Processes:1 Threads:N/A Hosts:2 HostsPerProcess:2 DataSources:9 RRDsProcessed:7
OK u:0.01 s:0.00 r:0.00
OK u:0.01 s:0.00 r:0.00
OK u:0.01 s:0.00 r:0.00
OK u:0.01 s:0.00 r:0.00
OK u:0.01 s:0.00 r:0.00
OK u:0.01 s:0.00 r:0.00
OK u:0.01 s:0.00 r:0.00

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 96 bytes
21:00:02.633359 IP one.32996 > one.snmp: GetRequest(39) interfaces.ifTable.ifEntry.ifInOctets.2 interfaces[|snmp]
21:00:02.635284 IP one.snmp > one.32996: GetResponse(39) interfaces.ifTable.ifEntry.ifInOctets.2=3016263505 .iso.org=[|snmp]
21:00:02.646659 IP ipfonehome.com.32996 > ipfonehome.com.snmp: GetRequest(39) E:2021.11.50.0 E:[|snmp]
21:00:02.647345 IP ipfonehome.com.snmp > ipfonehome.com.32996: GetResponse(39) E:2021.11.50.0=4081578 .iso.org.dod=[|snmp]
21:00:02.657574 IP ipfonehome.com.32996 > ipfonehome.com.snmp: GetRequest(39) E:2021.4.6.0 E:[|snmp]
21:00:02.657996 IP ipfonehome.com.snmp > ipfonehome.com.32996: GetResponse(39) E:2021.4.6.0=17792 .iso.org.dod.internet=[|snmp]
21:00:02.783829 IP ipfonehome.com.32996 > ipfonehome.com.snmp: GetRequest(39) system.sysUpTime.0 system.sysName[|snmp]
21:00:02.784084 IP ipfonehome.com.snmp > ipfonehome.com.32996: GetResponse(39) system.sysUpTime.0=7982902 .iso.org.dod.internet.mgmt=[|snmp]

now I'm not sure why the hostname is 'one'. maybe a stale MySQL record. but it looks like it did poll properly, response-wise.

after seeing the packets now i'm really at a loss because the graph is still empty.
allenelson
Posts: 18
Joined: Sun Jan 28, 2007 11:00 pm

Post by allenelson »

now I've noticed that when I execute the poller repeatedly, I don't see any more packets flying in..
allenelson
Posts: 18
Joined: Sun Jan 28, 2007 11:00 pm

Post by allenelson »

ok, I didn't remember that I had set that name in the hosts file..

so I set up a new graph on a different host, and saw the responses from it

21:20:37.687011 IP one.33004 > one.snmp: GetNextRequest(29) ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex
21:20:37.687588 IP one.snmp > one.33004: GetResponse(34) ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.66.109.25.108=2
21:20:37.687730 IP one.33004 > one.snmp: GetNextRequest(33) ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.66.109.25.108
21:20:37.688166 IP one.snmp > one.33004: GetResponse(34) ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.127.0.0.1=1
21:20:37.688290 IP one.33004 > one.snmp: GetNextRequest(33) ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.127.0.0.1
21:20:37.688925 IP one.snmp > one.33004: GetResponse(37) ip.ipAddrTable.ipAddrEntry.ipAdEntNetMask.66.109.25.108=255.255.255.248

etc etc, and I have the community public set. it displays the uptime instead of an error, but in the devices section the status is unknown. I'm going to see if this one graphs tonight, since everything looks like it's working correctly. however, I don't see any packets fly when I run the poller script with the community on or off. it just hangs when it's turned off but we already know that.. I even opened up a TCP port in the snmpd.conf file to see if that has any effect. OH! speak of the devil. the crontab just ran.

21:25:02.137616 IP one.33004 > one.snmp: GetRequest(39) interfaces.ifTable.ifEntry.ifInOctets.2 [|snmp]
21:25:02.138726 IP one.snmp > one.33004: GetResponse(39) interfaces.ifTable.ifEntry.ifInOctets.2=3060826298 =[|snmp]
21:25:02.142326 IP ipfonehome.com.33004 > ipfonehome.com.snmp: GetRequest(39) E:2021.11.50.0 E:[|snmp]
21:25:02.142859 IP ipfonehome.com.snmp > ipfonehome.com.33004: GetResponse(39) E:2021.11.50.0=4126573 .iso.org.dod=[|snmp]
21:25:02.146478 IP ipfonehome.com.33004 > ipfonehome.com.snmp: GetRequest(39) E:2021.4.6.0 E:[|snmp]
21:25:02.147016 IP ipfonehome.com.snmp > ipfonehome.com.33004: GetResponse(39) E:2021.4.6.0=30696 .iso.org.dod.internet=[|snmp]
21:25:02.188637 IP ipfonehome.com.33004 > ipfonehome.com.snmp: GetRequest(39) system.sysUpTime.0 system.sysName[|snmp]
21:25:02.188916 IP ipfonehome.com.snmp > ipfonehome.com.33004: GetResponse(39) system.sysUpTime.0=106413 .iso.org.dod.internet.mgmt=[|snmp]

but now the process is stuck and hasn't turned off:

26616 ? Ss 0:00 /bin/sh -c /usr/bin/php -q /var/www/vhosts/ipfonehome.com/subdomains/cacti/httpdocs/poller.php >> /var/lo
26617 ? S 0:00 /usr/bin/php -q /var/www/vhosts/ipfonehome.com/subdomains/cacti/httpdocs/poller.php
26626 ? S 0:00 /usr/local/bin/.libs/lt-rrdtool -
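
For reference, the two ifInOctets.2 readings in the traces above (21:00:02 and 21:25:02) can be turned into an average inbound rate by hand. This is a minimal sketch, not Cacti's or rrdtool's actual code: the function names are made up, the elapsed time is rounded to 25 minutes, and at most one Counter32 wrap between readings is assumed.

```python
COUNTER32_MODULUS = 2 ** 32  # Counter32 values wrap back to 0 after 2**32 - 1


def counter32_delta(earlier, later):
    """Difference between two Counter32 readings, allowing for a single wrap."""
    return (later - earlier) % COUNTER32_MODULUS


def average_bits_per_second(earlier, later, seconds):
    """Average rate in bits/second between two octet-counter readings."""
    return counter32_delta(earlier, later) * 8 / seconds


first_reading = 3016263505    # ifInOctets.2 in the 21:00:02 response
second_reading = 3060826298   # ifInOctets.2 in the 21:25:02 response
ELAPSED_SECONDS = 25 * 60     # assumption: rounded from the tcpdump timestamps

octets = counter32_delta(first_reading, second_reading)
rate = average_bits_per_second(first_reading, second_reading, ELAPSED_SECONDS)
print(octets, round(rate))
```

The counter is clearly advancing on the agent side, so an empty graph points at the values not reaching the RRD file rather than at snmpd.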
allenelson
Posts: 18
Joined: Sun Jan 28, 2007 11:00 pm

Post by allenelson »

sorry, I'm making a mess of this thread. this was in the log file..

02/02/2007 09:29:54 PM - POLLER: Poller[0] Maximum runtime of 292 seconds exceeded. Exiting.

02/02/2007 09:29:54 PM - SYSTEM STATS: Time:292.9329 Method:cmd.php Processes:1 Threads:N/A Hosts:3 HostsPerProcess:3 DataSources:12 RRDsProcessed:0

so.... even though the poller hangs, it receives the query response after it forces itself to exit. that's why I see the SNMP traffic. however, it is probably not logging it because by that time it has already given up. does that sound correct? sounds like the scenario to me. now that I have the community set again, my graphs have stopped graphing..
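
To spot this pattern in a longer cacti.log, a rough Python sketch can split the SYSTEM STATS lines into fields and flag runs that hit the runtime limit without updating any RRD. The helper names are hypothetical, the 292-second default is simply the limit quoted in the log line above, and the parsing assumes the space-separated key:value layout seen in this thread.

```python
# Lines copied from the log excerpts in this thread.
STALLED_LINE = (
    "02/02/2007 09:29:54 PM - SYSTEM STATS: Time:292.9329 Method:cmd.php "
    "Processes:1 Threads:N/A Hosts:3 HostsPerProcess:3 DataSources:12 RRDsProcessed:0"
)
HEALTHY_LINE = (
    "02/02/2007 09:00:03 PM - SYSTEM STATS: Time:2.0933 Method:cmd.php "
    "Processes:1 Threads:N/A Hosts:2 HostsPerProcess:2 DataSources:9 RRDsProcessed:7"
)


def parse_system_stats(line):
    """Return the key:value pairs after 'SYSTEM STATS:' as a dict of strings."""
    _, _, payload = line.partition("SYSTEM STATS:")
    return dict(pair.split(":", 1) for pair in payload.split())


def poll_looks_stalled(stats, max_runtime=292.0):
    """True when the run reached the runtime limit and processed no RRDs."""
    return float(stats["Time"]) >= max_runtime and int(stats["RRDsProcessed"]) == 0


stalled = parse_system_stats(STALLED_LINE)
healthy = parse_system_stats(HEALTHY_LINE)
print(poll_looks_stalled(stalled), poll_looks_stalled(healthy))
```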
allenelson
Posts: 18
Joined: Sun Jan 28, 2007 11:00 pm

Post by allenelson »

and one last one before bed. I now see some MySQL issues.

02/02/2007 09:34:25 PM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed "select poller_id,end_time from poller_time where poller_id = 0"
02/02/2007 09:34:25 PM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed "select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
02/02/2007 09:34:26 PM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed "select poller_id,end_time from poller_time where poller_id = 0"
02/02/2007 09:34:26 PM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed "select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
02/02/2007 09:34:27 PM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed "select poller_id,end_time from poller_time where poller_id = 0"
02/02/2007 09:34:27 PM - CMDPHP: Poller[0] ERROR: SQL Assoc Failed "select poller_output.output, poller_output.time, poller_output.local_data_id, poller_item.rrd_path, poller_item.rrd_name, poller_item.rrd_num from (poller_output,poller_item) where (poller_output.local_data_id=poller_item.local_data_id and poller_output.rrd_name=poller_item.rrd_name) LIMIT 10000"
02/02/2007 09:34:54 PM - POLLER: Poller[0] Maximum runtime of 292 seconds exceeded. Exiting.
02/02/2007 09:34:54 PM - SYSTEM STATS: Time:292.3237 Method:cmd.php Processes:1 Threads:N/A Hosts:3 HostsPerProcess:3 DataSources:12 RRDsProcessed:0