Hello,
We have two installations of Cacti: one an upgrade, the other a new installation. Like others, we're having issues with the host disk partition queries; in our case the problem is on the box running the upgraded Cacti. Queries from the upgraded box to a WinXP server fail, whereas queries from the freshly installed box to the same server work fine. Here are the specs for the two systems:
Upgraded Cacti:
Operating System: Red Hat 3.4.6-3.1
Webserver: Apache 2.2.4
Cacti: 0.8.6j
MySQL: 5.0.27
PHP: PHP 5.2.3 (cli) (built: Aug 30 2007 15:51:09)
RRDTool: 1.2.11
Net-SNMP: 5.1.2-11
Poller type: cmd.php
Freshly installed Cacti:
Operating System: OpenSuse 10.3
Webserver: Apache 2-2.2.4-70.2
Cacti: 0.8.6j-64.2
MySQL: 5.0.45-22
PHP: PHP 5.2.5 with Suhosin-Patch 0.9.6.2 (cli) (built: Dec 12 2007 03:47:43)
RRDTool: 1.2.23-47
Net-SNMP: 5.4.1-19
Poller type: cmd.php
The problem symptoms on the upgraded system are as follows:
1) Existing hosts report/graph disk used and disk total correctly.
2) Newly added hosts report either nan, 0 or only partial information. This varied as I deleted and re-created the Device. Currently reporting only Disk Total on 1 of 4 drives.
3) Increasing the SNMP timeout to 5 secs did nothing.
As suggested by a moderator, I set the log on the upgraded system to DEBUG and was able to capture the following:
01) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] Host[286] DS[5117] SERVER: /usr/local/apache2/htdocs/cacti/scripts/ss_host_disk.php ss_host_disk SERVERNAME 286 1:161:500:comm-string:::MD5::[None] get used 6, output: 26565885952
02) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] DEBUG: SQL Exec: "insert into poller_output (local_data_id, rrd_name, time, output) values (5117, 'hdd_used', '2008-01-14 21:40:55', '26565885952')"
03) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] Host[286] DS[5117] WARNING: Result from SERVER not valid. Partial Result: 01/14/2008 09:40:56
04) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] Host[286] DS[5117] SERVER: /usr/local/apache2/htdocs/cacti/scripts/ss_host_disk.php ss_host_disk SERVERNAME 286 1:161:500:comm-string:::MD5::[None] get total 6, output: U
05) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] DEBUG: SQL Exec: "insert into poller_output (local_data_id, rrd_name, time, output) values (5117, 'hdd_total', '2008-01-14 21:40:55', 'U')"
06) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] Host[286] DS[5116] WARNING: Result from SERVER not valid. Partial Result:
07) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] Host[286] DS[5116] SERVER: /usr/local/apache2/htdocs/cacti/scripts/ss_host_disk.php ss_host_disk SERVERNAME 286 1:161:500:comm-string:::MD5::[None] get total 5, output: U
08) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] DEBUG: SQL Exec: "insert into poller_output (local_data_id, rrd_name, time, output) values (5116, 'hdd_total', '2008-01-14 21:40:55', 'U')"
09) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] Host[286] DS[5116] WARNING: Result from SERVER not valid. Partial Result:
10) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] Host[286] DS[5116] SERVER: /usr/local/apache2/htdocs/cacti/scripts/ss_host_disk.php ss_host_disk SERVERNAME 286 1:161:500:comm-string:::MD5::[None] get used 5, output: U
11) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] DEBUG: SQL Exec: "insert into poller_output (local_data_id, rrd_name, time, output) values (5116, 'hdd_used', '2008-01-14 21:40:55', 'U')"
12) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] Host[286] DS[5114] WARNING: Result from SERVER not valid. Partial Result:
13) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] Host[286] DS[5114] SERVER: /usr/local/apache2/htdocs/cacti/scripts/ss_host_disk.php ss_host_disk SERVERNAME 286 1:161:500:comm-string:::MD5::[None] get used 2, output: U
14) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] DEBUG: SQL Exec: "insert into poller_output (local_data_id, rrd_name, time, output) values (5114, 'hdd_used', '2008-01-14 21:40:55', 'U')"
15) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] Host[286] DS[5114] WARNING: Result from SERVER not valid. Partial Result:
16) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] Host[286] DS[5114] SERVER: /usr/local/apache2/htdocs/cacti/scripts/ss_host_disk.php ss_host_disk SERVERNAME 286 1:161:500:comm-string:::MD5::[None] get total 2, output: U
17) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] DEBUG: SQL Exec: "insert into poller_output (local_data_id, rrd_name, time, output) values (5114, 'hdd_total', '2008-01-14 21:40:55', 'U')"
18) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] Host[286] DS[5115] WARNING: Result from SERVER not valid. Partial Result:
19) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] Host[286] DS[5115] SERVER: /usr/local/apache2/htdocs/cacti/scripts/ss_host_disk.php ss_host_disk SERVERNAME 286 1:161:500:comm-string:::MD5::[None] get used 3, output: U
20) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] DEBUG: SQL Exec: "insert into poller_output (local_data_id, rrd_name, time, output) values (5115, 'hdd_used', '2008-01-14 21:40:55', 'U')"
21) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] Host[286] DS[5115] WARNING: Result from SERVER not valid. Partial Result:
22) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] Host[286] DS[5115] SERVER: /usr/local/apache2/htdocs/cacti/scripts/ss_host_disk.php ss_host_disk SERVERNAME 286 1:161:500:comm-string:::MD5::[None] get total 3, output: U
23) 01/14/2008 09:40:56 PM - CMDPHP: Poller[0] DEBUG: SQL Exec: "insert into poller_output (local_data_id, rrd_name, time, output) values (5115, 'hdd_total', '2008-01-14 21:40:55', 'U')"
The first run of ss_host_disk.php (line 01) returned a correct value (which is providing the Disk Total being graphed successfully), so I'm wondering why the debug on line 03 reports that the result wasn't valid. As can be seen, no results afterward are considered valid. I can see where cmd.php sets $output = "U" (as shown in the debug output above), but as a novice PHP programmer, this is as far as I've come in deciphering what's going on at that point.
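To make a debug capture like the one above easier to read, here is a minimal Python sketch that groups the SERVER result lines by data source and lists which ones came back as "U" (the value RRDTool treats as unknown). The helper names summarize and unknown_results are hypothetical, and the regular expression only assumes the line layout shown in this log; it is not part of Cacti.

```python
import re
from collections import defaultdict

# Matches poller lines of the form shown in the debug log, e.g.
# "... Host[286] DS[5117] SERVER: <script> ... get used 6, output: 26565885952"
RESULT_RE = re.compile(
    r"Host\[(?P<host>\d+)\] DS\[(?P<ds>\d+)\] SERVER: .* "
    r"get (?P<field>\w+) (?P<index>\d+), output: (?P<output>\S+)"
)

def summarize(log_lines):
    """Return {(host, ds): {field: output}} for every SERVER result line."""
    results = defaultdict(dict)
    for line in log_lines:
        m = RESULT_RE.search(line)
        if m:
            key = (int(m["host"]), int(m["ds"]))
            results[key][m["field"]] = m["output"]
    return dict(results)

def unknown_results(results):
    """List (host, ds, field) triples whose output was 'U'."""
    return sorted(
        (host, ds, field)
        for (host, ds), fields in results.items()
        for field, output in fields.items()
        if output == "U"
    )

# Two lines copied from the capture (log lines 01 and 04).
sample = [
    "01/14/2008 09:40:56 PM - CMDPHP: Poller[0] Host[286] DS[5117] SERVER: "
    "/usr/local/apache2/htdocs/cacti/scripts/ss_host_disk.php ss_host_disk "
    "SERVERNAME 286 1:161:500:comm-string:::MD5::[None] get used 6, output: 26565885952",
    "01/14/2008 09:40:56 PM - CMDPHP: Poller[0] Host[286] DS[5117] SERVER: "
    "/usr/local/apache2/htdocs/cacti/scripts/ss_host_disk.php ss_host_disk "
    "SERVERNAME 286 1:161:500:comm-string:::MD5::[None] get total 6, output: U",
]

summary = summarize(sample)
for host, ds, field in unknown_results(summary):
    print(f"Host[{host}] DS[{ds}] {field}: unknown")
```

Feeding the whole cacti.log through summarize would show at a glance which data sources of a host are affected.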
Running ss_host_disk.php within php script_server.php works fine:
$ php script_server.php
PHP Script Server has Started - Parent is cmd
/usr/local/apache2/htdocs/cacti/scripts/ss_host_disk.php ss_host_disk SERVERNAME 286 1:161:500:comm-string:::MD5::[None] get total 6
53678141440
quit
$
I can run manual queries like this on all the disks for both Total and Used without problem.
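For anyone wanting to repeat the manual check, here is a small Python sketch that generates the query lines to paste into script_server.php, one per disk index and field. The function name build_queries is hypothetical; the path, host ID, SNMP argument string and disk indexes (2, 3, 5, 6) are simply the ones that appear in the debug log above, so substitute your own.

```python
def build_queries(script_path, hostname, host_id, snmp_args, indexes,
                  fields=("total", "used")):
    """Build one script-server input line per (disk index, field) pair.

    The line layout mirrors the manual query shown in the session above:
    <script> ss_host_disk <hostname> <host_id> <snmp_args> get <field> <index>
    """
    return [
        f"{script_path} ss_host_disk {hostname} {host_id} {snmp_args} get {field} {index}"
        for index in indexes
        for field in fields
    ]

lines = build_queries(
    "/usr/local/apache2/htdocs/cacti/scripts/ss_host_disk.php",
    "SERVERNAME",
    286,
    "1:161:500:comm-string:::MD5::[None]",
    indexes=[2, 3, 5, 6],  # disk indexes seen in the debug log
)

for line in lines:
    print(line)
```

Each printed line can be typed at the script server prompt; a numeric reply means the query itself works outside the poller.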
I've seen the postings that there's a problem within 0.8.7 with this query but haven't seen confirmation that the problem exists in 0.8.6j. If anyone could suggest where else to look or what else to try, I'd appreciate it.
Thanks
Brian Read
:D
Host Disk query failure - 8.6j on Red Hat querying WinXP box
Does anyone have any suggestions, ideas, etc pertaining to this problem? I'd be happy with any input at this point....
TIA,
BR
Same
I have been having the exact same problem to a T. I am trying to research it right now, but I haven't found anything useful yet. I am running Cacti 0.8.7a and I am still having the problem.
I rebuilt my poller cache for all 3 hosts that were having problems as well, no change.
Host Disk query failure - 8.6j on Red Hat querying WinXP box
I've been digging through Cacti code to try and understand what functions do what, what calls what, etc - still haven't figured out the problem. Just now starting to get an idea of how the overall system works. I'll have to punt soon as other priorities will take over, but if I find anything I'll post it here.
Good to know I'm not alone in this.
BR
Host Disk query failure - 8.6j on Red Hat querying WinXP box
I think I've figured it out! I noticed that the poller wasn't completing within the 5 minute window. So I got to wondering if the system simply wasn't able to handle overlapping polling. I checked and Maximum Concurrent Poller Processes was set to a single process of cmd.php! I reset it to 3 and voila! Polling is completing in around 3:30. The queries are working and the graphs are populating! Yee haw!
Nothin' in the code, nothing with the cache, no problem anywhere except that Cacti was hamstringing itself with a single poller process. I've added 5 more systems to be monitored - all working OK.
Hopefully my experiences will serve to help others - sometimes you can drill deep and be way off base. Chalk it up as a Cacti noobie learning experience.
BR
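As a rough sanity check on the numbers above, here is a back-of-the-envelope Python sketch. It assumes the polling work divides evenly across processes, which is an idealization (real pollers will not split this cleanly), and the function names are hypothetical. With 3 processes finishing in about 3:30, the implied single-process runtime is well past the 5-minute window.

```python
import math

POLL_INTERVAL_S = 300  # the 5-minute polling window

def estimated_serial_runtime(observed_runtime_s, processes):
    """Estimate single-process runtime, assuming work splits evenly."""
    return observed_runtime_s * processes

def fits_window(runtime_s, interval_s=POLL_INTERVAL_S):
    """True if a polling run finishes within the polling interval."""
    return runtime_s <= interval_s

def min_processes(serial_runtime_s, interval_s=POLL_INTERVAL_S):
    """Fewest processes that fit the window under the even-split assumption."""
    return max(1, math.ceil(serial_runtime_s / interval_s))

observed = 3 * 60 + 30                           # about 3:30 with 3 processes
serial = estimated_serial_runtime(observed, 3)   # implied one-process runtime

print(f"estimated single-process runtime: {serial} s")
print(f"fits the window with 1 process: {fits_window(serial)}")
print(f"fits the window with 3 processes: {fits_window(observed)}")
print(f"minimum processes under this estimate: {min_processes(serial)}")
```

Under this estimate a single cmd.php process would need roughly 630 seconds per cycle, so runs would overlap, which matches the poller not completing within the window.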