New Cacti Architecture (0.8.8) - RFC Response Location
The distributed poller design will have to take into account time discrepancies on all of the machines. The RRD files are updated with single second accuracy so the poller will have to either be in complete sync with the main storage host or the timestamp used for the RRD data will have to be generated by the main host and used by the remote pollers for their inserts.
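The second option (the main host generates the timestamp and the remote pollers reuse it) could be sketched roughly as below. This is only an illustration under assumptions: a 300-second polling interval, and hypothetical helper names `cycle_timestamp` and `format_rrd_update` that are not part of Cacti or Spine. The only real convention relied on is the `timestamp:value[:value...]` argument form accepted by `rrdtool update`.

```python
from typing import Sequence

POLLING_INTERVAL = 300  # seconds; assumed 5-minute polling cycle


def cycle_timestamp(master_now: int, interval: int = POLLING_INTERVAL) -> int:
    """Return the start of the polling cycle that contains master_now.

    The main host computes this once per cycle from its own clock and hands
    it to every remote poller, so remote clock drift never reaches the RRD.
    """
    return master_now - (master_now % interval)


def format_rrd_update(cycle_ts: int, values: Sequence[float]) -> str:
    """Build the 'timestamp:value[:value...]' argument for rrdtool update."""
    return ":".join([str(cycle_ts)] + [str(v) for v in values])


# Hypothetical example: the main host's clock reads 1200000123.
master_clock = 1200000123
ts = cycle_timestamp(master_clock)
update_arg = format_rrd_update(ts, [42.0, 17.5])
print(update_arg)
```

A remote poller would then pass `update_arg` along with its results instead of stamping them with its own local time.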
Along with multiple pollers by location, why not try to distribute the poller load across the polling interval? Currently, every 5 minutes, the poller starts and collects the data for all the hosts in the system, causing a large spike in traffic and system utilization. Once the polling process is completed, the system sits idle until the next polling cycle starts and the process is repeated.
We could even out the load on a single system by dividing all of the hosts into multiple pollers and having each poller run on its own 5 minute interval. If the polling interval is 5 minutes and we have 500 hosts, the first poller would start at 0:00 and collect data from its assigned 100 hosts. The second poller would start at 0:01 and collect data from its assigned 100 hosts, etc.
This would allow the peak I/O load on the poller to be reduced and allow all of the updates, network access and RRD writes to be distributed across the entire polling cycle.
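The staggering described above can be sketched as follows. It is a minimal illustration, assuming hosts are split round-robin and that start offsets are spread evenly over the interval; `assign_hosts` and the dictionary keys are hypothetical names, not anything that exists in Cacti.

```python
def assign_hosts(host_ids, num_pollers, interval_seconds=300):
    """Split hosts round-robin into groups and give each group a start offset.

    Each poller still runs once per interval, but its start is shifted so the
    network traffic and RRD writes are spread over the whole cycle.
    """
    offset_step = interval_seconds // num_pollers
    schedule = []
    for poller_index in range(num_pollers):
        schedule.append({
            "poller": poller_index,
            "offset_seconds": poller_index * offset_step,
            "hosts": host_ids[poller_index::num_pollers],
        })
    return schedule


# 500 hosts and 5 pollers, as in the example in the post.
schedule = assign_hosts(list(range(1, 501)), num_pollers=5)
for entry in schedule:
    print(entry["poller"], entry["offset_seconds"], len(entry["hosts"]))
```

With these numbers each poller gets 100 hosts and the pollers start 60 seconds apart, which matches the 0:00, 0:01, ... pattern described above.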
I'm late to this discussion too... What about extending the Boost plugin's 'RRD Server' application to be the interface to access the remote RRDs?
tianye wrote:
Hello all,
distributed polling was one of the topics discussed during the 3rd European Cacti Community Conference this weekend.
I tried to put together the key points of our brainstorming. Most ideas have already been around in this thread, but I thought it best to give a complete wrap-up of our discussion today, even if that means repeating a lot of things.
The input below came from all participants of the conference, but of course I take all responsibility for errors or missing points.
Scenarios
The overall idea would be to have one central server and a number of agents that are just polling hosts.
The central server is used for the graphic interface, that means configuring Cacti and accessing graphs, and for keeping the rrd files.
The polling agents are just polling data (as the name already suggests) and report the results back to the central server.
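To make the "agents poll, central server stores" split concrete, here is a rough sketch of what a batch of results sent from an agent to the central server might look like. Everything here is an assumption for illustration: the `PollResult` fields, the JSON encoding and the function names are hypothetical and do not describe an actual Cacti wire format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class PollResult:
    """One polled value, as an agent might report it (hypothetical fields)."""
    poller_id: int       # which agent collected the value
    local_data_id: int   # which data source the value belongs to
    rrd_name: str        # data source name inside the RRD file
    timestamp: int       # cycle timestamp, ideally issued by the central server
    output: str          # raw polled value


def encode_batch(results):
    """Serialize a list of results for transport to the central server."""
    return json.dumps([asdict(r) for r in results])


def decode_batch(payload):
    """Rebuild the result objects on the central server."""
    return [PollResult(**item) for item in json.loads(payload)]


batch = [
    PollResult(2, 101, "traffic_in", 1200000000, "123456"),
    PollResult(2, 101, "traffic_out", 1200000000, "654321"),
]
payload = encode_batch(batch)
restored = decode_batch(payload)
print(len(restored), restored[0].rrd_name)
```

The central server would be the only component that turns such records into RRD updates, so the agents never need direct access to the RRD files.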
This mechanism can be useful in some scenarios, for example
- Remote probes
Imagine a data center with your application server and your cacti server. The application is accessed from remote branches and you have to monitor your application from the user's point of view and not just the central system.
A polling agent can be placed in the remote branch to achieve that. This can be helpful for troubleshooting, or may even be covered by your SLAs.
- Latency issues
Imagine a central cacti server monitoring devices that are far away, or, in network lingo, where the connection has a high latency.
For only a few devices this might be possible, but with a growing number of devices this can slow down the overall performance of your cacti.
- Remote networks
Sometimes there's the need to monitor devices in a network that, due to routing constraints, is not easily reachable, e.g. networks with overlapping IP address schemes when using private IP addresses.
Accessing a small number of devices can be achieved through NAT or a tunnel, but this is getting difficult with a large number of devices.
Right now cacti directly accesses the RRDs, but if the underlying architecture was changed to allow it to connect to a socket and interact with the RRDs for creating graphs, like boost does when it updates the RRDs, then we're halfway there... Other things to do would be to create the extension to add/delete graphs on remote collection 'agents', and configure them remotely. OR a 'graph only' host could be created that does nothing but query the remote host for its devices/settings and presents the graphs through interaction with the RRD server from boost.
- TheWitness
- Developer
- Posts: 17047
- Joined: Tue May 14, 2002 5:08 pm
- Location: MI, USA
All,
Just an update. There have been a few things from my first diagram that have come to light recently. Here are the summaries:
1) Spine 0.8.7d Supports the concept of a poller_id
2) Cacti 0.8.8 SVN Now supports the concept of a poller
However, there is still quite a bit to do to finish things off. Here are the finishers:
1) QA the Poller concept to confirm operation for plugin developers. Per Reinhard, some plugins may require access to either RRDfiles or poller_output data or have a poller bottom on the remote poller machine.
2) Toy with the concept of a local poller cache and poller output in cases where the master server is not available.
3) Create a remote poller package that includes a poller.php, database.php, config.php, global.php, etc. If deployed as a daemon, how to reconcile Windows vs. Linux. Daemons in PHP are dangerous. (version dependent)
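Item 2 (a local poller cache for when the master server is not available) could look something like the store-and-forward sketch below. It is an illustration only, assuming a small SQLite file on the remote poller; the class, table and callback names are hypothetical and this is not how Cacti or Spine implement it.

```python
import sqlite3


class LocalPollerCache:
    """Buffer poller output locally until the master server accepts it."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS pending ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "local_data_id INTEGER, timestamp INTEGER, output TEXT)"
        )

    def store(self, local_data_id, timestamp, output):
        """Record one polled value; called on every poll, master up or not."""
        self.db.execute(
            "INSERT INTO pending (local_data_id, timestamp, output) "
            "VALUES (?, ?, ?)",
            (local_data_id, timestamp, output),
        )
        self.db.commit()

    def flush(self, send):
        """Deliver cached rows oldest first; stop at the first failure.

        `send` is a caller-supplied function returning True on success.
        """
        delivered = 0
        rows = self.db.execute(
            "SELECT id, local_data_id, timestamp, output "
            "FROM pending ORDER BY id"
        ).fetchall()
        for row_id, local_data_id, timestamp, output in rows:
            if not send(local_data_id, timestamp, output):
                break
            self.db.execute("DELETE FROM pending WHERE id = ?", (row_id,))
            delivered += 1
        self.db.commit()
        return delivered

    def pending_count(self):
        return self.db.execute("SELECT COUNT(*) FROM pending").fetchone()[0]


cache = LocalPollerCache()
cache.store(101, 1200000000, "5")
cache.store(101, 1200000300, "7")

# Master unreachable: nothing is delivered and nothing is lost.
delivered_while_down = cache.flush(lambda *row: False)

# Master back: rows arrive in their original order.
received = []


def master_up(local_data_id, timestamp, output):
    received.append((local_data_id, timestamp, output))
    return True


delivered_after_recovery = cache.flush(master_up)
print(delivered_while_down, delivered_after_recovery, cache.pending_count())
```

Delivering oldest first matters here because RRD updates have to arrive in increasing timestamp order.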
Larry
True understanding begins only when we realize how little we truly understand...
Life is an adventure, let yours begin with Cacti!
Author of dozens of Cacti plugins and customization's. Advocate of LAMP, MariaDB, IBM Spectrum LSF and the world of batch. Creator of IBM Spectrum RTM, author of quite a bit of unpublished work and most of Cacti's bugs.
_________________
Official Cacti Documentation
GitHub Repository with Supported Plugins
Percona Device Packages (no support)
Interesting Device Packages
For those wondering, I'm still here, but lost in the shadows. Yearning for less bugs. Who want's a Cacti 1.3/2.0? Streams anyone?
Sorry if this is already in the RFC, but is there any chance of seeing InnoDB with a DB structure that uses foreign keys, and clearer table descriptions/names? Right now it looks too confusing (I was shocked when I found the SQL query for changing from 32-bit to 64-bit counters: about 6-8 JOINs, which looks crazy to me).
Would you like to use the new RRD caching daemon?
Caching daemon is in. I am taking a break in mid December (vacation actually) and will be doing some cacti things related to polling and trees.
More details to follow.
TheWitness
Good morning! (GMT+2)
I'm waiting for this great work because I have to beef up my cacti installation with front ends, a centralized DB and distributed RRD files, and I think it's better to wait for this kind of better-supported solution than to build something in-house.
Waiting for info or betas to try.
Is there any news about the scenario?
Simon
-
- Posts: 4
- Joined: Thu Feb 14, 2008 11:54 am
Cacti clustered on version 0.8.7e
Hi all,
I have adapted some older clustering code to run under the latest version of cacti. It load-balances and should fail over properly. Since it hasn't been tested for long or under different conditions, it may not be 100% reliable.
It has been running for a few months now on a production instance I deployed, and overall it is running properly.
Both servers are exchanging the RRD file data properly. We didn't test the failover completely, but it should work.
After some more testing, I would be happy to have this code included in the core code of Cacti. Since this work was done for the company I currently work for, it would be nice, if you decide to incorporate this code into the main Cacti code, to mention both me and the company as contributors to the Cacti project.
If anybody is interested, let me know.
Well, looking for testers!!!
Cheers,
Fausto
Hi All,
Yes, I agree, that's why I am happy to share it.
I will organize myself to send a trial version for testing in the coming week or the next.
Then we can start discussing how to integrate it into the core code of Cacti, if this is of interest to the main developers. Could any of the main Cacti developers comment on it?
Cheers,
Fausto
Hi,
Here is the clustering patch I created, ready for testing. The attachment contains the patch and a guide for setting up the environment.
Please have a go and let me know if this is something that you are interested in.
There is still room for improvement, as I know, but it is a good starting point for proper cluster support in Cacti.
Cheers,
Fausto
- Attachments
-
- cacti-0.8.7e-clustered.zip
- (119.67 KiB) Downloaded 3282 times
What is the progress on this topic since last year? It's really a feature that is lacking when you compare Cacti to other metrology solutions.
I'm currently thinking about a solution where I would have several Cacti instances. I would need to develop some kind of front end to make this distribution more or less transparent to the end user. But this solution is obviously not optimal.
Are you still looking for comments on this RFC, or have the technical choices already been made?
Cedric Girard
X-dark wrote:
What is the progress on this topic since last year? It's really a feature that is lacking when you compare Cacti to other metrology solutions.
I'm currently thinking about a solution where I would have several Cacti instances. I would need to develop some kind of front end to make this distribution more or less transparent to the end user. But this solution is obviously not optimal.
Are you still looking for comments on this RFC, or have the technical choices already been made?
+1, I too would like to know where this is at. I have a few large environments where I would like to have remote pollers that report back to a centralised instance.
Is this currently possible, and if so could someone point me in the right direction?
Any help is appreciated.
Cheers
Cameron.