New graphs showing spotty data

sheriff-jms
Posts: 13
Joined: Wed Oct 07, 2009 12:44 pm

Post by sheriff-jms »

All:

I hacked up two quick PHP scripts to get the load percentage and output current draw per phase from our MIB-II compliant Liebert 3-phase UPSes (script works on single-phase units too). The scripts work fine when run from the command line - I get data, and the variable names output by the script appear to match the variable names that I've defined in the templates for these data points. Some of the data points show up in the graphs pretty reliably, but some will be missing from the graphs in a seemingly random fashion (see attached image).

The load on the box occasionally gets a little bit high - it also runs a fairly large MRTG installation that will be migrated into Cacti over time - but the spine process finishes in well under 5 minutes from what I can see, so I don't think I have an issue with concurrent spine processes stomping on each other.

I looked in the Cacti log file and I see an occasional entry like this:

Code:

10/13/2009 10:45:08 AM - POLLER: Poller[0] WARNING: Poller Output Table not Empty. Issues Found: 1, Data Sources: (DS[253])

There is an entry in the poller cache for this graph, but I'm not sure if the data from the debug is complete.

Code:

/usr/local/rrdtool/bin/rrdtool create \
/usr/local/apache2/htdocs/cacti/rra/33/253.rrd \
--step 300  \
DS:upsoutputloadphase1:GAUGE:600:0:1000 \
DS:upsoutputloadphase2:GAUGE:600:0:1000 \
DS:upsoutputloadphase3:GAUGE:600:0:1000 \
RRA:AVERAGE:0.5:1:500 \
RRA:AVERAGE:0.5:1:600 \
RRA:AVERAGE:0.5:6:700 \
RRA:AVERAGE:0.5:24:775 \
RRA:AVERAGE:0.5:288:797 \
RRA:MAX:0.5:1:500 \
RRA:MAX:0.5:1:600 \
RRA:MAX:0.5:6:700 \
RRA:MAX:0.5:24:775 \
RRA:MAX:0.5:288:797 \

I looked at the number of processes and connections into the mysql DB and I never saw the number of concurrent processes jump much higher than 5 or 6.

Any thoughts on what else I should look at to troubleshoot why the graphs are not filling in reliably would be appreciated.

Thanks in advance. Once I get this squared away, I'll be happy to make the PHP scripts I wrote and the XML templates available.
Attachments
UPS output load graph showing some data loss
cacti-upsoutput-20091013.png (22.96 KiB) Viewed 3207 times

Post by sheriff-jms »

Additional info:

The problem seems to be related to the PHP scripts, or Cacti's interaction with them. When I disable graphs that use these scripts, the "Poller Output Table not Empty" warnings stop in the Cacti log file. This is odd, because the output from the script doesn't seem to do anything wrong. The variable names they spit out match the variable names in the corresponding data input methods and the output appears to be formatted properly, compared to other PHP scripts I use (the Cisco WLC 4400 scripts definitely rock by the way :) ) when I run them from the command line.

I've also increased the memory available to PHP to 64 MB.

I'm not running boost at this point. This is a pretty vanilla 0.8.7e installation, with rrdtool 1.3.8.

I'm attaching the (slightly redacted) tech support output from cacti, just in case...
Attachments
foo.txt
Tech support output from cacti
(13.21 KiB) Downloaded 202 times
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

Does that script spit out the data in ONE SINGLE print statement?
R.

Post by sheriff-jms »

If the UPS in question is a single-phase unit, it returns the data in one print statement. If it's a 3-phase unit, there is one print statement per phase.

The PHP to output the data looks like this:

Code:

if (array_key_exists('0', $upsoutput)) {
        $upsoutputphase1=$upsoutput[0];
        print "upsoutputphase1:" . $upsoutputphase1 ;
        }
if (array_key_exists('1', $upsoutput)) {
        $upsoutputphase2=$upsoutput[1];
        print " upsoutputphase2:" . $upsoutputphase2 ;
        }
if (array_key_exists('2', $upsoutput)) {
        $upsoutputphase3=$upsoutput[2];
        print " upsoutputphase3:" . $upsoutputphase3 ;
        }

print "\n";

$upsoutput is an array that is populated by the results of polling the appropriate OID. If that OID returns a single line, the UPS is assumed to be a single-phase unit, and if it returns 3 rows, it is a 3-phase unit. The output section looks at the number of rows in the array and returns the data accordingly.

A sample run from the command-line against a single-phase unit looks like this:

Code:

bash-2.05# php ./mib-ii_ups_output_current.php xx.xx.xx.138 SNMPCOMMUNITY 2
upsoutputphase1:32

and a 3-phase unit looks like this:

Code:

bash-2.05# php ./mib-ii_ups_output_current.php xx.xx.xx.15 SNMPCOMMUNITY 2
upsoutputphase1:28 upsoutputphase2:47 upsoutputphase3:73

Post by gandalf »

sheriff-jms wrote:If it's a 3-phase unit, there is one print statement per phase.
Then please fill an intermediate variable instead and print that variable at the end.
R.
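A minimal sketch of that suggestion, assuming the polled values are already in an array like $upsoutput. The function name format_ups_output is hypothetical and not part of the original script; the idea is only to collect every name:value pair first and emit the whole line with one print.

```php
<?php
// Hypothetical helper (not from the original script): builds the whole
// "name:value name:value ..." line in memory so it can be printed once.
function format_ups_output(array $values, $prefix = 'upsoutputphase')
{
    $fields = array();
    // array_values() renumbers from 0, so the phases are always 1, 2, 3.
    foreach (array_values($values) as $i => $value) {
        $fields[] = $prefix . ($i + 1) . ':' . $value;
    }
    // Single space between fields, no leading or trailing whitespace.
    return implode(' ', $fields);
}

// Example values standing in for the SNMP results in $upsoutput.
$upsoutput = array(28, 47, 73);

// One print statement for the complete line.
print format_ups_output($upsoutput) . "\n";
```

With a one-element array the same call produces a single field, so single-phase and 3-phase units go through the same code path.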

Post by sheriff-jms »

I re-wrote the output section of the PHP script to print all of the variables in one print statement. It's an ugly kludge, but it works more reliably than before. The remaining issue is that the output from the first phase does not appear to be getting recorded in the RRD file. Phases 2 and 3 seem to be OK. I don't see anything wrong with the output from the PHP script when I run it from the commandline:

Code:

bash-2.05# php ./mib-ii_ups_output_current.php 10.245.248.15 watchdog 2
upsoutputampsphase1:28 upsoutputampsphase2:47 upsoutputampsphase3:73

There is still an issue with the PHP code where it bombs out if the UPS is a single-phase unit, but I'm not testing single-phase units in Cacti at the moment.
Attachments
3-phase UPS output per phase (amps). Output from phase 1 is not being recorded in the RRD
cacti-upsoutput-20091103.png (24.77 KiB) Viewed 2910 times

Post by sheriff-jms »

I fixed the problem where the PHP code was misbehaving when the UPS being polled was a single-phase unit. However, Cacti is still not graphing single-phase units, nor is it graphing the first phase of a 3-phase unit (data points all come back as NaN). I'm assuming the issues are related or identical, but for the life of me I can't see the cause, other than Cacti being picky about how the PHP script presents the output fields.

The output piece of the PHP script (it's an ugly hack, but it works ;) ):

Code:

if (array_key_exists('0', $upsoutput)) {
        $upsoutputtemp1="\n".'upsoutputampsphase1:' . $upsoutput[0];
        $upsoutputall = $upsoutputtemp1."\n"; 
}
if (array_key_exists('1', $upsoutput)) {
        $upsoutputtemp2=' upsoutputampsphase2:' . $upsoutput[1];
        $upsoutputall = $upsoutputtemp1.$upsoutputtemp2."\n";   
}
if (array_key_exists('2', $upsoutput)) {
        $upsoutputtemp3=' upsoutputampsphase3:' . $upsoutput[2];
        $upsoutputall = $upsoutputtemp1.$upsoutputtemp2.$upsoutputtemp3."\n";
}
 
print $upsoutputall;

Post by sheriff-jms »

I'm still fighting with getting Cacti to accept the first value that is printed by the script. This has the effect of either preventing Cacti from displaying any data for a single-phase UPS, or preventing Cacti from displaying data for the first phase of a 3-phase UPS. I've gone as far as trying to insert a dummy variable and value before the first 'real' data point that's generated by the PHP script. No luck. It's almost like Cacti is insisting on doing a priming read of some sort and in the process, is throwing away the first value it sees.
From the commandline, the PHP script output appears to be properly formatted, and the output is done by a single print statement.

The names in the PHP script match the variable names that are defined in the data input method and carried into the graph template.

Does anyone have any insight as to why Cacti is ignoring the first data point I print out?

Post by gandalf »

Please post the lines found at System Utilities -> View Poller Cache for that host/data query
R.

Post by sheriff-jms »

Attached is a screenshot of the poller cache view for that device. I've also included screenshots of the data source debug for the output current graph (first data point not being graphed) and one for the output load graph (all three data points on a 3-phase UPS work and are being graphed).

I've been digging into this a bit more, and it seems like something is wrong with how the data definition for the "upsoutputampsphase1" data point is telling Cacti how to handle that data. It makes sense that this affects the graphing for both single and 3-phase systems, because 1. the same PHP script is called in both instances and 2. the single-phase definitions and templates are pretty much a clone of the 3-phase ones, with the extra variables removed.

The thing I'm still wrestling with at this point is what could be wrong with how "upsoutputampsphase1" is defined (in either the single or 3-phase templates) in Cacti that is different from the other data points. An "rrdtool info" on the output current graph seems to show the correct data point names, though you can see the value for the first data point is unknown:

Code:

bash-2.05# rrdtool info 254.rrd 
filename = "254.rrd"
rrd_version = "0003"
step = 300
last_update = 1257948791
ds[upsoutputampsphase1].type = "GAUGE"
ds[upsoutputampsphase1].minimal_heartbeat = 600
ds[upsoutputampsphase1].min = 0.0000000000e+00
ds[upsoutputampsphase1].max = 1.0000000000e+03
ds[upsoutputampsphase1].last_ds = "U"
ds[upsoutputampsphase1].value = NaN
ds[upsoutputampsphase1].unknown_sec = 191
ds[upsoutputampsphase2].type = "GAUGE"
ds[upsoutputampsphase2].minimal_heartbeat = 600
ds[upsoutputampsphase2].min = 0.0000000000e+00
ds[upsoutputampsphase2].max = 1.0000000000e+03
ds[upsoutputampsphase2].last_ds = "82"
ds[upsoutputampsphase2].value = 1.5662000000e+04
ds[upsoutputampsphase2].unknown_sec = 0
ds[upsoutputampsphase3].type = "GAUGE"
ds[upsoutputampsphase3].minimal_heartbeat = 600
ds[upsoutputampsphase3].min = 0.0000000000e+00
ds[upsoutputampsphase3].max = 1.0000000000e+03
ds[upsoutputampsphase3].last_ds = "59"
ds[upsoutputampsphase3].value = 1.1269000000e+04
ds[upsoutputampsphase3].unknown_sec = 0

An "rrdtool info" against the RRD for the output load graph (all three data points work correctly) has different data point names, but all three are getting populated correctly.
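As a rough aid for this kind of check, the sketch below scans the text printed by "rrdtool info" and lists the data sources whose last_ds is "U" (unknown). The function name find_unknown_ds is hypothetical, and the sample text is just an excerpt of the output above; it only assumes the ds[name].last_ds = "value" line format shown there.

```php
<?php
// Hypothetical helper: returns the names of data sources whose last_ds
// is "U" (unknown) in the text printed by `rrdtool info`.
function find_unknown_ds($infoText)
{
    // One match per line of the form: ds[NAME].last_ds = "U"
    preg_match_all('/^ds\[([^\]]+)\]\.last_ds\s*=\s*"U"\s*$/m', $infoText, $matches);
    return $matches[1];
}

// Sample input: an excerpt of the rrdtool info output for 254.rrd.
$sample = <<<EOT
ds[upsoutputampsphase1].last_ds = "U"
ds[upsoutputampsphase2].last_ds = "82"
ds[upsoutputampsphase3].last_ds = "59"
EOT;

// Prints one data source name per line.
print implode("\n", find_unknown_ds($sample)) . "\n";
```

Running it against both RRD files side by side makes it easier to see which data source names are never being updated.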
Attachments
Poller cache info
cacti-poller-cache-1-20091111.png (28.83 KiB) Viewed 2720 times
Data source debug for 3-phase output current graph.
cacti-ds-debug-1-20091111.png (46.81 KiB) Viewed 2720 times
Data source debug for the 3-phase output load graph. Everything is working correctly here.
cacti-ds-debug-2-20091111.png (59.1 KiB) Viewed 2720 times

Post by gandalf »

Just as a try: Please change the name of those output parameters and avoid those numbers. Change e.g. to a letter and retry.
R.
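One way to try this, sketched under assumptions: the field names ending in a, b and c below are hypothetical (the thread does not say which names were finally used), and the output fields of the Cacti data input method would have to be renamed to match whatever the script prints.

```php
<?php
// Hypothetical variant of the output step: field names end in a letter
// (a, b, c) rather than a digit. The names are illustrative only.
function format_ups_output_letters(array $values, $prefix = 'upsoutputampsphase')
{
    $suffixes = array('a', 'b', 'c');
    $fields = array();
    foreach (array_values($values) as $i => $value) {
        if (!isset($suffixes[$i])) {
            break; // ignore anything beyond three phases
        }
        $fields[] = $prefix . $suffixes[$i] . ':' . $value;
    }
    // Whole line is built first so it can be printed in one statement.
    return implode(' ', $fields);
}

// Example values standing in for the SNMP results in $upsoutput.
print format_ups_output_letters(array(28, 47, 73)) . "\n";
```

The sample names are 19 characters long, the same length as the existing upsoutputampsphase1 data source name.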

Post by sheriff-jms »

The data seems to be filling in reliably now for a 3-phase UPS. There is an issue with single-phase units, but I'm pretty certain that's a definition issue that I need to track down somewhere in Cacti.

Thank you for your help on this. I'll post the scripts and templates once I get the single-phase issue worked out.