Compiled Spine--Cacti Vanished

drogo
Posts: 10
Joined: Fri May 30, 2008 8:18 am

Post by drogo »

Ok, I was giving Cacti a try and found that I was getting gaps in my data. It worked fine out of the box on my test machine, but there are plenty of gaps on the production one. (Both are CentOS 5.1.)

I was also running mrtg once a minute on the same machine and thought it might be the cause, so I stopped it. (I intend to replace mrtg@1min with Cacti@5min.) Still got the graph gaps. I thought I might need to switch to Spine, so I downloaded all the header and dev packages and got it compiled. I then went to Cacti to change the poller, but got a blank screen at the main login page. I also get the following messages in my httpd error_log file when I try to access Cacti:

[Mon Jun 02 14:32:11 2008] [error] [client 172.16.5.137] PHP Warning:  include_once(/var/www/html/cacti/lib/functions.php) [<a href='function.include-once'>function.include-once</a>]: failed to open stream: Permission denied in /var/www/html/cacti/include/global.php on line 185
[Mon Jun 02 14:32:11 2008] [error] [client 172.16.5.137] PHP Warning:  include_once() [<a href='function.include'>function.include</a>]: Failed opening '/var/www/html/cacti/lib/functions.php' for inclusion (include_path='.:/usr/share/pear') in /var/www/html/cacti/include/global.php on line 185
[Mon Jun 02 14:32:11 2008] [error] [client 172.16.5.137] PHP Fatal error:  Call to undefined function read_config_option() in /var/www/html/cacti/include/global_form.php on line 695
Any idea what I've borked up when compiling spine?

Thanks!
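
The first warning says PHP could not open lib/functions.php because of permissions, and the fatal error about read_config_option() most likely follows from that failed include. A minimal Python sketch for spotting files that lack the world-readable bit is below. It assumes the web server reads the files as an "other" user, which is a simplification: owner or group access and SELinux policy (not examined in this thread) can also decide the outcome. The helper names are made up for illustration.

```python
import os
import stat


def readable_by_others(path):
    """Return True if the file mode grants read permission to 'other' users."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IROTH)


def find_unreadable(paths):
    """Return the existing paths that lack the world-readable bit."""
    return [p for p in paths if os.path.exists(p) and not readable_by_others(p)]


if __name__ == "__main__":
    # Path taken from the error_log lines above; adjust for your install.
    for path in find_unreadable(["/var/www/html/cacti/lib/functions.php"]):
        print("not world-readable:", path)
```

If a file shows up here, comparing its mode and owner with a neighbouring file that loads fine is a reasonable next step.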
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

I suppose different issues got mixed up here.
First, I doubt that spine will cause those http errors. Verify by switching back to cmd.php, please.
Next, let's tackle the gaps issue. Please be more verbose on it (and let's stick to cmd.php until resolved).
Reinhard
drogo
Posts: 10
Joined: Fri May 30, 2008 8:18 am

Post by drogo »

Sorry about that. I was trying to search the forum since there's a ton of great info here and fix the gap issue on my own. I thought I had it, but apparently not. :-)

Ok, well I think I'm still on cmd.php. In my /etc/crontab, I have the following line:

*/5 * * * * root php /var/www/html/cacti/poller.php > /dev/null 2>&1
I thought I would have to change the poller in the web-based interface first, so I hadn't changed it in the crontab yet. Then when I went to the web site, I got the blank page.

I am able to run the "php /var/www/html/cacti/poller.php" command without issue.

Edit: The login page is still not showing.

Thanks!
drogo
Posts: 10
Joined: Fri May 30, 2008 8:18 am

Post by drogo »

Also, here are some excerpts from my cacti.log:

05/30/2008 04:55:04 AM - SYSTEM STATS: Time:2.4829 Method:cmd.php Processes:1 Threads:N/A Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:10
05/30/2008 05:00:03 AM - SYSTEM STATS: Time:1.2277 Method:cmd.php Processes:1 Threads:N/A Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:10
05/30/2008 05:05:03 AM - CMDPHP: Poller[0] ERROR: A DB Exec Failed!, Error:'1062', SQL:"insert into poller_output (local_data_id, rrd_name, time, output) values (3, 'mem_buffers', '2008-05-30 05:05:03', '91104')'
05/30/2008 05:05:03 AM - CMDPHP: Poller[0] ERROR: A DB Exec Failed!, Error:'1062', SQL:"insert into poller_output (local_data_id, rrd_name, time, output) values (4, 'mem_swap', '2008-05-30 05:05:03', '2031608')'
05/30/2008 05:05:03 AM - CMDPHP: Poller[0] ERROR: A DB Exec Failed!, Error:'1062', SQL:"insert into poller_output (local_data_id, rrd_name, time, output) values (5, '', '2008-05-30 05:05:03', '1min:0.47 5min:0.21 10min:0.11')'
05/30/2008 05:05:04 AM - CMDPHP: Poller[0] ERROR: A DB Exec Failed!, Error:'1062', SQL:"insert into poller_output (local_data_id, rrd_name, time, output) values (6, 'users', '2008-05-30 05:05:03', '1')'
05/30/2008 05:05:05 AM - SYSTEM STATS: Time:2.1627 Method:cmd.php Processes:1 Threads:N/A Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:7
05/30/2008 05:10:03 AM - SYSTEM STATS: Time:1.4439 Method:cmd.php Processes:1 Threads:N/A Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:10
I wonder if these mysql errors are related to the graph gaps.

I also see this one a few times:

05/30/2008 03:25:02 PM - CMDPHP: Poller[0] Host[4] ERROR: HOST EVENT: Host is DOWN Message: UDP ping Timed out
What other data could I provide that might help?
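
Error 1062 is MySQL's duplicate-key error (ER_DUP_ENTRY), so these inserts into poller_output collided with rows that already existed. In the excerpt, the 05:05 cycle that logged the errors also reports RRDsProcessed:7 rather than 10. To see which polling cycles were affected, the log can be tallied by minute and error code. This is a minimal sketch that assumes the cacti.log line format shown above; `count_db_errors` is an illustrative helper, not part of Cacti.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the timestamp and the numeric MySQL error code of a failed insert.
DB_ERROR_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2} [AP]M) - .*"
    r"A DB Exec Failed!, Error:'(\d+)'"
)


def count_db_errors(lines):
    """Count 'DB Exec Failed' log lines per (minute, error code)."""
    counts = Counter()
    for line in lines:
        m = DB_ERROR_RE.match(line)
        if m:
            ts = datetime.strptime(m.group(1), "%m/%d/%Y %I:%M:%S %p")
            counts[(ts.replace(second=0), m.group(2))] += 1
    return counts


# Three lines copied from the excerpt above.
SAMPLE = r"""
05/30/2008 05:05:03 AM - CMDPHP: Poller[0] ERROR: A DB Exec Failed!, Error:'1062', SQL:"insert into poller_output (local_data_id, rrd_name, time, output) values (3, 'mem_buffers', '2008-05-30 05:05:03', '91104')'
05/30/2008 05:05:04 AM - CMDPHP: Poller[0] ERROR: A DB Exec Failed!, Error:'1062', SQL:"insert into poller_output (local_data_id, rrd_name, time, output) values (6, 'users', '2008-05-30 05:05:03', '1')'
05/30/2008 05:05:05 AM - SYSTEM STATS: Time:2.1627 Method:cmd.php Processes:1 Threads:N/A Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:7
""".strip().splitlines()

if __name__ == "__main__":
    for (minute, code), n in sorted(count_db_errors(SAMPLE).items()):
        print(minute, "error", code, "x", n)
```

Lining up the affected minutes with the gaps in the graphs would show whether the two are related, which the log alone does not prove.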
drogo
Posts: 10
Joined: Fri May 30, 2008 8:18 am

Post by drogo »

Ok, I got it back.

Turned out to be the patches. I went through my history to check my steps and ended up replacing the snmp.php and functions.php files with the originals from the cacti-0.8.7b.tar.gz file. The main login page now comes up again.

The gaps are still there, however. Should I go ahead and change over to spine now? Or keep working on the gap issue?
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

drogo wrote:Sorry about that. I was trying to search the forum since there's a ton of great info here and fix the gap issue on my own. I thought I had it, but apparently not. :-)

Ok, well I think I'm still on cmd.php. In my /etc/crontab, I have the following line:

*/5 * * * * root php /var/www/html/cacti/poller.php > /dev/null 2>&1
I thought I would have to change the poller in the web-based interface first, so I hadn't changed it in the crontab yet. Then when I went to the web site, I got the blank page.

I am able to run the "php /var/www/html/cacti/poller.php" command without issue.

Edit: The login page is still not showing.

Thanks!
The crontab entry is correct. Changing the poller is done only through the web interface, so we have to get that up and running first. Please check httpd's error log for hints.
Reinhard
drogo
Posts: 10
Joined: Fri May 30, 2008 8:18 am

Post by drogo »

Looks like I was posting at the same time you were. :-)

I've got it back up and was able to log in and change the poller to spine. I'm going to let it run overnight and see how the gaps are in the morning.

Thanks!
drogo
Posts: 10
Joined: Fri May 30, 2008 8:18 am

Post by drogo »

Hmm, gaps are still present with Spine as the poller. I enabled spine and put in the path at about 1730 local time.

Output of cacti.log from overnight:

06/03/2008 05:05:01 PM - POLLER: Poller[0] ERROR: The path:  is invalid.  Can not continue
06/03/2008 05:10:01 PM - POLLER: Poller[0] ERROR: The path:  is invalid.  Can not continue
06/03/2008 05:15:02 PM - POLLER: Poller[0] ERROR: The path:  is invalid.  Can not continue
06/03/2008 05:20:02 PM - POLLER: Poller[0] ERROR: The path:  is invalid.  Can not continue
06/03/2008 05:25:01 PM - POLLER: Poller[0] ERROR: The path:  is invalid.  Can not continue
06/03/2008 05:30:01 PM - POLLER: Poller[0] ERROR: The path:  is invalid.  Can not continue
06/03/2008 05:35:03 PM - SYSTEM STATS: Time:1.1545 Method:spine Processes:1 Threads:1 Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:10
06/03/2008 05:40:03 PM - SYSTEM STATS: Time:1.5285 Method:spine Processes:1 Threads:1 Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:10
06/03/2008 05:50:03 PM - SYSTEM STATS: Time:2.1402 Method:spine Processes:1 Threads:1 Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:8
06/03/2008 05:55:03 PM - SYSTEM STATS: Time:1.1472 Method:spine Processes:1 Threads:1 Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:10
06/03/2008 06:05:04 PM - SYSTEM STATS: Time:1.6058 Method:spine Processes:1 Threads:1 Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:10
06/03/2008 06:10:03 PM - SYSTEM STATS: Time:1.1412 Method:spine Processes:1 Threads:1 Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:10
06/03/2008 06:20:04 PM - SYSTEM STATS: Time:2.1493 Method:spine Processes:1 Threads:1 Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:10
06/03/2008 06:25:03 PM - SYSTEM STATS: Time:1.1507 Method:spine Processes:1 Threads:1 Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:10
06/03/2008 06:30:03 PM - SYSTEM STATS: Time:2.1517 Method:spine Processes:1 Threads:1 Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:10
06/03/2008 06:35:03 PM - SYSTEM STATS: Time:1.1439 Method:spine Processes:1 Threads:1 Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:10
06/03/2008 06:40:03 PM - SYSTEM STATS: Time:1.6270 Method:spine Processes:1 Threads:1 Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:10
06/03/2008 06:45:04 PM - SYSTEM STATS: Time:2.1434 Method:spine Processes:1 Threads:1 Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:8
Here's a sample graph. I still have spine running on my test server pinging this same host and there are no gaps. So I know it didn't go down overnight.

Image
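
One thing visible in the overnight log is that some five-minute cycles have no SYSTEM STATS line at all (nothing between 05:40 and 05:50 PM, for example), and a few cycles report RRDsProcessed:8 instead of 10. The sketch below lists both kinds of cycle. It assumes the log format shown above, a five-minute polling interval, and that 10 is the expected RRD count; the function names are illustrative, and the script does not explain why a cycle is missing.

```python
import re
from datetime import datetime, timedelta

STATS_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2} [AP]M) - SYSTEM STATS: "
    r".*RRDsProcessed:(\d+)"
)


def parse_stats(lines):
    """Return [(timestamp, rrds_processed)] for each SYSTEM STATS line."""
    out = []
    for line in lines:
        m = STATS_RE.match(line)
        if m:
            ts = datetime.strptime(m.group(1), "%m/%d/%Y %I:%M:%S %p")
            out.append((ts, int(m.group(2))))
    return out


def find_missed_polls(lines, interval=timedelta(minutes=5)):
    """Return (previous, next) timestamp pairs more than 1.5 intervals apart."""
    stamps = [ts for ts, _ in parse_stats(lines)]
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > interval * 1.5]


def find_short_cycles(lines, expected=10):
    """Return timestamps of cycles that processed fewer RRDs than expected."""
    return [ts for ts, n in parse_stats(lines) if n < expected]


# Four consecutive lines copied from the log above.
SAMPLE = """
06/03/2008 05:40:03 PM - SYSTEM STATS: Time:1.5285 Method:spine Processes:1 Threads:1 Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:10
06/03/2008 05:50:03 PM - SYSTEM STATS: Time:2.1402 Method:spine Processes:1 Threads:1 Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:8
06/03/2008 05:55:03 PM - SYSTEM STATS: Time:1.1472 Method:spine Processes:1 Threads:1 Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:10
06/03/2008 06:05:04 PM - SYSTEM STATS: Time:1.6058 Method:spine Processes:1 Threads:1 Hosts:7 HostsPerProcess:7 DataSources:13 RRDsProcessed:10
""".strip().splitlines()

if __name__ == "__main__":
    for before, after in find_missed_polls(SAMPLE):
        print("no SYSTEM STATS between", before, "and", after)
    for ts in find_short_cycles(SAMPLE):
        print("fewer RRDs than expected at", ts)
```

Comparing the reported times with the gaps on the graph would show whether the missing cycles account for them.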
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

Which ping script are you using? Are there gaps on other graphs as well? At the same point in time?
Reinhard
drogo
Posts: 10
Joined: Fri May 30, 2008 8:18 am

Post by drogo »

I'm using the built-in "Unix - Ping Host" method, if that makes sense. I have the ping method set to ICMP, and the poller.php was running as root when I was using cmd.php.

Is that what you're asking?


Yes, all the graphs have gaps at the same time, including the default localhost ones.
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

The same gaps on every graph mean the problem is not with that script but a central one with poller.php. Please switch the logging verbosity to DEBUG and run poller.php --force --debug from the command line.
Reinhard
drogo
Posts: 10
Joined: Fri May 30, 2008 8:18 am

Post by drogo »

Ok, I got the debug information and started following the page in your sig about debugging NaNs in your graph. (I didn't know that's what gaps were.) In checking the values from step 7 (check rrd file numbers), I found the following values:

ds[ping].type = "GAUGE"
ds[ping].minimal_heartbeat = 600
ds[ping].min = 0.0000000000e+00
ds[ping].max = 5.0000000000e+03
ds[ping].last_ds = "0.801"
ds[ping].value = 2.4030000000e+00
ds[ping].unknown_sec = 0

That seems ok, in that the value is between the min and max. Or am I not on the correct track? The last value in the rrd file was a NaN when I ran rrdtool fetch <filename> AVERAGE.

Edit: added the cacti.log file as an attachment. It seemed too big to include in the post.
Attachments
cacti.txt
Cacti.log file
(65.96 KiB) Downloaded 98 times
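
The range check described above can be scripted. RRDtool stores a sample as unknown when it falls outside the data source's min/max, and also when no update arrives within the heartbeat, so both fields are worth reading. The sketch below parses the `ds[...]` lines quoted in this post; it is a minimal example, the helper names are illustrative, and it only tests the most recent sample (last_ds), not earlier ones.

```python
import math
import re

LINE_RE = re.compile(r"^ds\[(?P<ds>[^\]]+)\]\.(?P<field>\w+)\s*=\s*(?P<value>.+)$")


def parse_ds_info(text):
    """Parse 'ds[name].field = value' lines into {name: {field: value}}."""
    info = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            value = m.group("value").strip().strip('"')
            info.setdefault(m.group("ds"), {})[m.group("field")] = value
    return info


def last_sample_in_range(ds):
    """Check last_ds against min/max; a NaN bound is treated as 'no limit'."""
    value = float(ds["last_ds"])
    low, high = float(ds["min"]), float(ds["max"])
    if not math.isnan(low) and value < low:
        return False
    if not math.isnan(high) and value > high:
        return False
    return True


# Lines copied from the output quoted in this post.
SAMPLE = """
ds[ping].minimal_heartbeat = 600
ds[ping].min = 0.0000000000e+00
ds[ping].max = 5.0000000000e+03
ds[ping].last_ds = "0.801"
ds[ping].value = 2.4030000000e+00
ds[ping].unknown_sec = 0
"""

if __name__ == "__main__":
    ping = parse_ds_info(SAMPLE)["ping"]
    print("heartbeat (s):", ping["minimal_heartbeat"])
    print("last sample within min/max:", last_sample_in_range(ping))
```

Because only the latest sample is checked, a passing result does not rule out an earlier reading that exceeded the maximum.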
gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

When fetching data, the last line most often shows NaN. That's no problem. Still, it is possible that a value "in between" exceeded your maximum, so you may want to look into that.
Reinhard