Cacti not graphing properly but Zenoss is?
Hey,
I'm really at a loss here. I've got Zenoss and Cacti running on the same box, and they both poll the same APC UPS via SNMP. However, my Cacti graphs come out broken, full of gaps, while my Zenoss graphs are complete.
This is happening with all of the devices in Cacti; every graph is broken like that. All of the Zenoss graphs are complete, not broken at all.
Any assistance would be great.
Thanks.
Well, going through the first step, this error actually went away for a bit but has come back up:
04/14/2009 12:40:01 PM - POLLER: Poller[0] WARNING: Poller Output Table not Empty. Issues Found: 3155, Data Sources: traffic_in(DS[912]), traffic_out(DS[912]), traffic_in(DS[913]), traffic_out(DS[913]), traffic_in(DS[914]), traffic_out(DS[914]), traffic_in(DS[915]), traffic_out(DS[915]), traffic_in(DS[916]), traffic_out(DS[916]), traffic_in(DS[917]), traffic_out(DS[917]), traffic_in(DS[918]), traffic_out(DS[918]), traffic_in(DS[919]), traffic_out(DS[919]), traffic_in(DS[920]), traffic_out(DS[920]), traffic_in(DS[921]), traffic_out(DS[921]), traffic_in(DS[922]), Additional Issues Remain. Only showing first 20
04/14/2009 12:38:17 PM - CMDPHP: Poller[0] Host[155] DS[2634] WARNING: Result from CMD not valid. Partial Result: U
The last part of that log, Partial Result: U, is happening with a lot of hosts, and I don't know why.
I will continue debugging to see if I can find the root of this problem, but I thought I would update this post with my findings in case someone can tell me the fix for the above.
Thanks.
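As a generic sanity check (the IP address and community string below are placeholders, not values from this thread), it can help to query the device by hand with net-snmp, using the same community string Cacti is configured with, and see whether anything comes back:

# Basic reachability: ask the device for sysDescr (standard MIB-2 OID)
snmpget -v 2c -c public 192.168.1.50 .1.3.6.1.2.1.1.1.0

# Walk the interface octet counters that feed traffic_in/traffic_out
snmpwalk -v 2c -c public 192.168.1.50 .1.3.6.1.2.1.2.2.1.10

If these time out or return nothing for the OIDs Cacti is polling, the Partial Result: U makes sense; if they answer fine, the problem is more likely on the poller side.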
Found this while doing the rrdtool tests:
[root@monitor rra]# rrdtool fetch device_ping_2583.rrd AVERAGE
1239730500: nan
1239730800: nan
1239731100: nan
1239731400: 1.1300000000e+02
1239731700: 1.1300000000e+02
1239732000: nan
1239732300: nan
1239732600: nan
1239732900: nan
1239733200: nan
1239733500: nan
1239733800: nan
1239734100: nan
1239734400: nan
1239734700: nan
1239735000: nan
1239735300: nan
1239735600: nan
1239735900: nan
1239736200: nan
1239736500: nan
1239736800: nan
1239737100: nan
1239737400: nan
(There is a lot more but I don't think you want to see it)
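If the full dump is too long to post, rrdtool fetch can be limited to a time window with --start and --end, for example:

# Fetch only the last two hours of AVERAGE data from the ping RRD
rrdtool fetch device_ping_2583.rrd AVERAGE --start -2h --end now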
So I ran this:
[root@monitor rra]# rrdtool info device_ping_2583.rrd
filename = "device_ping_2583.rrd"
rrd_version = "0003"
step = 300
last_update = 1239731835
ds[ping].type = "GAUGE"
ds[ping].minimal_heartbeat = 600
ds[ping].min = 0.0000000000e+00
ds[ping].max = 5.0000000000e+03
ds[ping].last_ds = "UNKN"
ds[ping].value = 1.5255000000e+04
ds[ping].unknown_sec = 0
rra[0].cf = "AVERAGE"
rra[0].rows = 600
rra[0].pdp_per_row = 1
rra[0].xff = 5.0000000000e-01
rra[0].cdp_prep[0].value = NaN
rra[0].cdp_prep[0].unknown_datapoints = 0
rra[1].cf = "AVERAGE"
rra[1].rows = 700
rra[1].pdp_per_row = 6
rra[1].xff = 5.0000000000e-01
rra[1].cdp_prep[0].value = 2.2600000000e+02
rra[1].cdp_prep[0].unknown_datapoints = 3
rra[2].cf = "AVERAGE"
rra[2].rows = 775
rra[2].pdp_per_row = 24
rra[2].xff = 5.0000000000e-01
rra[2].cdp_prep[0].value = 2.2600000000e+02
rra[2].cdp_prep[0].unknown_datapoints = 21
rra[3].cf = "AVERAGE"
rra[3].rows = 797
rra[3].pdp_per_row = 288
rra[3].xff = 5.0000000000e-01
rra[3].cdp_prep[0].value = 1.6606226667e+03
rra[3].cdp_prep[0].unknown_datapoints = 194
rra[4].cf = "MIN"
rra[4].rows = 600
rra[4].pdp_per_row = 1
rra[4].xff = 5.0000000000e-01
rra[4].cdp_prep[0].value = NaN
rra[4].cdp_prep[0].unknown_datapoints = 0
rra[5].cf = "MIN"
rra[5].rows = 700
rra[5].pdp_per_row = 6
rra[5].xff = 5.0000000000e-01
rra[5].cdp_prep[0].value = 1.1300000000e+02
rra[5].cdp_prep[0].unknown_datapoints = 3
rra[6].cf = "MIN"
rra[6].rows = 775
rra[6].pdp_per_row = 24
rra[6].xff = 5.0000000000e-01
rra[6].cdp_prep[0].value = 1.1300000000e+02
rra[6].cdp_prep[0].unknown_datapoints = 21
rra[7].cf = "MIN"
rra[7].rows = 797
rra[7].pdp_per_row = 288
rra[7].xff = 5.0000000000e-01
rra[7].cdp_prep[0].value = 7.2077000000e+01
rra[7].cdp_prep[0].unknown_datapoints = 194
rra[8].cf = "MAX"
rra[8].rows = 600
rra[8].pdp_per_row = 1
rra[8].xff = 5.0000000000e-01
rra[8].cdp_prep[0].value = NaN
rra[8].cdp_prep[0].unknown_datapoints = 0
rra[9].cf = "MAX"
rra[9].rows = 700
rra[9].pdp_per_row = 6
rra[9].xff = 5.0000000000e-01
rra[9].cdp_prep[0].value = 1.1300000000e+02
rra[9].cdp_prep[0].unknown_datapoints = 3
rra[10].cf = "MAX"
rra[10].rows = 775
rra[10].pdp_per_row = 24
rra[10].xff = 5.0000000000e-01
rra[10].cdp_prep[0].value = 1.1300000000e+02
rra[10].cdp_prep[0].unknown_datapoints = 21
rra[11].cf = "MAX"
rra[11].rows = 797
rra[11].pdp_per_row = 288
rra[11].xff = 5.0000000000e-01
rra[11].cdp_prep[0].value = 1.1300000000e+02
rra[11].cdp_prep[0].unknown_datapoints = 194
rra[12].cf = "LAST"
rra[12].rows = 600
rra[12].pdp_per_row = 1
rra[12].xff = 5.0000000000e-01
rra[12].cdp_prep[0].value = NaN
rra[12].cdp_prep[0].unknown_datapoints = 0
rra[13].cf = "LAST"
rra[13].rows = 700
rra[13].pdp_per_row = 6
rra[13].xff = 5.0000000000e-01
rra[13].cdp_prep[0].value = 1.1300000000e+02
rra[13].cdp_prep[0].unknown_datapoints = 3
rra[14].cf = "LAST"
rra[14].rows = 775
rra[14].pdp_per_row = 24
rra[14].xff = 5.0000000000e-01
rra[14].cdp_prep[0].value = 1.1300000000e+02
rra[14].cdp_prep[0].unknown_datapoints = 21
rra[15].cf = "LAST"
rra[15].rows = 797
rra[15].pdp_per_row = 288
rra[15].xff = 5.0000000000e-01
rra[15].cdp_prep[0].value = 1.1300000000e+02
rra[15].cdp_prep[0].unknown_datapoints = 194
There seem to be a lot of unknown datapoints. Is that normal?
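One more generic rrdtool check that can help here (not something asked for in this thread): rrdtool lastupdate prints the timestamp and raw value of the most recent update, so you can see directly whether the poller is writing UNKN into the file:

# Show the last update time and the raw value written for each data source
rrdtool lastupdate device_ping_2583.rrd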
Same problem with broken graphs here.
I got the same problem....
04/14/2009 01:30:02 PM - CMDPHP: Poller[0] Host[1] DS[906] WARNING: Result from SNMP not valid. Partial Result: U
I get this message with almost all hosts. I upgraded from version 0.8.7b to 0.8.7d, but the problem continues.
Thanks if anyone can help!
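One generic suggestion (the install path and user name below are assumptions, adjust them to your setup): run a single polling cycle by hand as the same user your cron job uses and watch the console for PHP or SNMP errors:

# Run one polling cycle manually as the cacti user
su - cactiuser -c "php -q /var/www/html/cacti/poller.php"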
Are you using poller.php or cmd.php in your cron?

MoreDakka wrote:
Well, going through the first step, this error actually went away for a bit but has come back up:
04/14/2009 12:40:01 PM - POLLER: Poller[0] WARNING: Poller Output Table not Empty. Issues Found: 3155, Data Sources: traffic_in(DS[912]), traffic_out(DS[912]), traffic_in(DS[913]), traffic_out(DS[913]), traffic_in(DS[914]), traffic_out(DS[914]), traffic_in(DS[915]), traffic_out(DS[915]), traffic_in(DS[916]), traffic_out(DS[916]), traffic_in(DS[917]), traffic_out(DS[917]), traffic_in(DS[918]), traffic_out(DS[918]), traffic_in(DS[919]), traffic_out(DS[919]), traffic_in(DS[920]), traffic_out(DS[920]), traffic_in(DS[921]), traffic_out(DS[921]), traffic_in(DS[922]), Additional Issues Remain. Only showing first 20
04/14/2009 12:38:17 PM - CMDPHP: Poller[0] Host[155] DS[2634] WARNING: Result from CMD not valid. Partial Result: U
The last part of that log, Partial Result: U, is happening with a lot of hosts, and I don't know why.
I will continue debugging to see if I can find the root of this problem, but I thought I would update this post with my findings in case someone can tell me the fix for the above.
Thanks.
Amadeus Zull wrote:
Replace poller.php with cmd.php. I had a similar issue and that resolved it.

You do NOT want to do that -- always use poller.php. It will decide whether to launch cmd.php or Spine for the data collection.
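For reference, a typical system crontab entry for this (the install path and user are examples only, adjust them to your setup) looks something like:

# /etc/cron.d style entry: run the Cacti poller every 5 minutes as the cacti user
*/5 * * * * cactiuser php /var/www/html/cacti/poller.php > /dev/null 2>&1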
MoreDakka: Look through your log file for where Cacti polls that broken Battery Time Remaining data source. Paste the data collection and rrdtool update chunks for it.
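If the log is large, grepping for the data source ID is usually the quickest way to pull those chunks out (the log path below is the default, yours may differ; substitute whatever DS number belongs to the battery runtime data source):

# Pull every cacti.log line that mentions a given data source, e.g. the DS[2634] flagged above
grep "DS\[2634\]" /var/www/html/cacti/log/cacti.log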
OK, I've changed back to poller.php.
Also, I'm not exactly sure what you are looking for. I found the log entry for the data source:
/usr/bin/rrdtool create \
/var/www/html/cacti/rra/device_ups_battruntimeremain_2007.rrd \
--step 300 \
DS:battRunTimeRemain:GAUGE:600:0:U \
RRA:AVERAGE:0.5:1:600 \
RRA:AVERAGE:0.5:6:700 \
RRA:AVERAGE:0.5:24:775 \
RRA:AVERAGE:0.5:288:797 \
RRA:MIN:0.5:1:600 \
RRA:MIN:0.5:6:700 \
RRA:MIN:0.5:24:775 \
RRA:MIN:0.5:288:797 \
RRA:MAX:0.5:1:600 \
RRA:MAX:0.5:6:700 \
RRA:MAX:0.5:24:775 \
RRA:MAX:0.5:288:797 \
RRA:LAST:0.5:1:600 \
RRA:LAST:0.5:6:700 \
RRA:LAST:0.5:24:775 \
RRA:LAST:0.5:288:797 \
Is that what you were looking for?
I'm using the UPS as an example. All of the devices on my Cacti box are doing the same thing: blank graphs, or extremely choppy ones (like they only update once per day).
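One classic cause of exactly this symptom, where every Cacti graph is blank while another tool on the same box graphs fine, is the .rrd files being owned by a different user than the one the poller runs as. A quick permissions check is cheap (the path below is the default rra directory, adjust if yours differs):

# Check who owns the RRD files versus who runs the poller cron job
ls -l /var/www/html/cacti/rra/ | head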
You want the Cacti log with debug logging enabled -- per the guide I linked.