We have been running Cacti for about a year, but over the last few months it has begun to act strangely. It no longer draws graphs, and the system load seems to be quite high all the time (I'm talking about loads of 14 or so).
We love Cacti, and maybe we have populated our server with so many graphs that it can no longer keep up. The latest graphs we've added are not being plotted, but the ones we added in the beginning work just fine!
Right now we have about 1300 graphs, and we're running it on a server with a 4x3 GHz Intel Xeon processor and 2 GB of RAM. This server hosts Nagios as well.
Do you think we should move to something bigger, or maybe add more RAM to the server?
If you think our server is fine, could you please guide us in finding out why it no longer plots graphs?
Thanks a lot.
Hardware requirements
- nebj00la
- Cacti User
- Posts: 112
- Joined: Fri Feb 17, 2006 9:02 pm
- Location: Massachusetts, USA
- Contact:
Re: Hardware requirements
anicetog wrote:If you think our server is fine, could you please guide us in finding out why it no longer plots graphs?
Thanks a lot.
Post the contents of the cacti.log file regarding the poller statistics. Are you using cmd.php or spine?
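If you're not sure where to find them, something like this usually pulls the poller statistics out of the log (adjust the path to wherever your cacti.log actually lives):
Code: Select all
grep "SYSTEM STATS" /var/www/cacti/log/cacti.log | tail -n 10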
Thanks,
nebj00la
nebj00la
- gandalf
- Developer
- Posts: 22383
- Joined: Thu Dec 02, 2004 2:46 am
- Location: Muenster, Germany
- Contact:
Re: Hardware requirements
anicetog wrote:Right now we have about 1300 graphs, and we're running it on a server with a 4x3 GHz Intel Xeon processor and 2 GB of RAM. This server hosts Nagios as well.
Ignoring Nagios, I'd say that is a trivially light load for Cacti (when using spine). That said, the RAM should certainly be increased. We run Cacti on a dual Xeon that is about four years old with roughly 40k data sources (which is the relevant figure, not the number of graphs).
But from our Nagios installation (yes, we run that as well) I know that on a VERY decent server rrdcached was required as soon as we reached 20k RRD files. So Nagios _may_ interfere with Cacti, especially when it comes to disk I/O.
Make sure to use spine as a first step.
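For reference, a typical rrdcached invocation looks something like this (the socket, journal and rra paths here are only examples, and whether Cacti can use rrdcached depends on your Cacti/rrdtool versions):
Code: Select all
# example only - adjust paths and the write interval (-w) for your setup
rrdcached -l unix:/var/run/rrdcached.sock \
          -j /var/lib/rrdcached/journal \
          -b /var/www/cacti/rra \
          -w 300 -z 60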
Reinhard
We're using spine to poll, and we have 1666 data sources. I've been taking a look at the server, and it seems like the poller process doesn't have enough time to get through all the hosts.
I mean, the poller runs every 5 minutes, and if I 'top' the server I can see many polling processes, but they don't finish before the next poll... And during all that time the load goes high and the CPU sits near 100%.
I think it has something to do with the Spine Specific Execution Parameters, but I can't manage to make it work.
Also, we have a 100 Mbit/s full-duplex link; would it help if we upgraded it to 1 Gbit/s full-duplex?
Thanks, guys!
- nebj00la
- Cacti User
- Posts: 112
- Joined: Fri Feb 17, 2006 9:02 pm
- Location: Massachusetts, USA
- Contact:
anicetog wrote:We're using spine to poll, and we have 1666 data sources. I've been taking a look at the server, and it seems like the poller process doesn't have enough time to get through all the hosts.
I think it has something to do with the Spine Specific Execution Parameters, but I can't manage to make it work.
Also, we have a 100 Mbit/s full-duplex link; would it help if we upgraded it to 1 Gbit/s full-duplex?
~1500 data sources give you no reason, in my opinion, to adjust the default spine parameters. I've had up to ~8000 and barely had to adjust them. I doubt the link has anything to do with it, since SNMP/diagnostic traffic comes nowhere near 100 Mbit/s!
Post the entry from the cacti.log that shows spine statistics, if you can.
How many hosts/services are you polling with Nagios? How often?
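If the log entry is hard to dig out, you can also time a manual spine run against a couple of host IDs, something along these lines (the binary and config paths are just examples, and options can vary between spine versions):
Code: Select all
# example only - poll host IDs 1 through 5 and see how long it takes
time /usr/local/spine/bin/spine -C /etc/spine.conf 1 5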
Thanks,
nebj00la
nebj00la
This is the data from Nagios:
Service Check Execution Time: 0.01 / 15.17 / 0.771 sec
Service Check Latency: 0.00 / 1.79 / 0.315 sec
Host Check Execution Time: 0.02 / 10.02 / 0.232 sec
Host Check Latency: 0.00 / 0.00 / 0.000 sec
# Active Host / Service Checks: 191 / 1503
# Passive Host / Service Checks: 0 / 0
And we are checking every 5 minutes.
As for the cacti log, is this what you need?
Code: Select all
05/25/2009 08:24:56 AM - SYSTEM STATS: Time:294.7881 Method:spine Processes:1 Threads:10 Hosts:79 HostsPerProcess:79 DataSources:2503 RRDsProcessed:1084
05/25/2009 08:19:56 AM - SYSTEM STATS: Time:294.5974 Method:spine Processes:1 Threads:10 Hosts:79 HostsPerProcess:79 DataSources:2503 RRDsProcessed:1021
05/25/2009 08:17:15 AM - RECACHE STATS: RecacheTime:138.8882 HostsRecached:2
05/25/2009 08:14:56 AM - SYSTEM STATS: Time:294.3617 Method:spine Processes:1 Threads:10 Hosts:79 HostsPerProcess:79 DataSources:2503 RRDsProcessed:1108
05/25/2009 08:09:56 AM - SYSTEM STATS: Time:294.6788 Method:spine Processes:1 Threads:10 Hosts:79 HostsPerProcess:79 DataSources:2503 RRDsProcessed:1108
05/25/2009 08:04:55 AM - SYSTEM STATS: Time:294.4917 Method:spine Processes:1 Threads:10 Hosts:79 HostsPerProcess:79 DataSources:2503 RRDsProcessed:1108
05/25/2009 07:59:55 AM - SYSTEM STATS: Time:294.4942 Method:spine Processes:1 Threads:10 Hosts:79 HostsPerProcess:79 DataSources:2503 RRDsProcessed:1108
05/25/2009 07:54:56 AM - SYSTEM STATS: Time:294.6869 Method:spine Processes:1 Threads:10 Hosts:79 HostsPerProcess:79 DataSources:2503 RRDsProcessed:1108
05/25/2009 07:49:56 AM - SYSTEM STATS: Time:294.5400 Method:spine Processes:1 Threads:10 Hosts:79 HostsPerProcess:79 DataSources:2503 RRDsProcessed:1108
- nebj00la
- Cacti User
- Posts: 112
- Joined: Fri Feb 17, 2006 9:02 pm
- Location: Massachusetts, USA
- Contact:
anicetog wrote:# Active Host / Service Checks: 191 / 1503
Unless something is terribly wrong with spine, I think you should try running a poller cycle with Nagios stopped. Try it after hours, if you can. Neither instance (Nagios/Cacti) raises a red flag to me, so I would try this before making any further suggestions.
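Something along these lines should do for the test (the init script and Cacti paths are just examples for a typical install, and --force may not be needed on every version):
Code: Select all
# example only - stop Nagios, time one manual poller run, then bring Nagios back
/etc/init.d/nagios stop
time php /var/www/cacti/poller.php --force
/etc/init.d/nagios start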
Thanks,
nebj00la
nebj00la