Hi All,
I have a system under extreme load, with around 35,000 data sources. The polling cycle is 5 minutes. In each 5-minute cycle, the first 1-2 minutes are spent gathering data, and the remaining time is spent flushing the contents of the poller_output table to the individual RRD files.
If the system is not busy, this action takes around 3 minutes to complete. I have watched the table empty itself, and around 3 minutes into the cycle the row count is 0. However, if there are many users on the system, we run into I/O problems and it takes 6 or 7 minutes to move the data from the poller_output table to the RRD files. truss shows that rrdtool is just waiting for disk. When the next polling cycle starts, whatever is still in the poller_output table is discarded. This has led to holes in graphs during business hours.
Long term, a new system is in order. For the acute problem, however: what are the consequences of commenting out the line:
db_execute("TRUNCATE TABLE poller_output");
from the poller.php script?
I see the table contains the data source, date/time and value. I assume that when adding data to the RRD, the time can be specified explicitly (so if an update arrives 10 minutes late, it can still be recorded as the value from 10 minutes ago). Of course, if the system is under load the graphs will be missing data from the last 1 or 2 polling cycles, but when everyone goes for coffee and the system load decreases, the holes can be filled in.
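To illustrate the assumption about timestamps: `rrdtool update` accepts a value as `timestamp:value`, where the timestamp is seconds since the epoch, so a late sample can carry its original time rather than "now". The following is only a minimal sketch in Python, not Cacti's own code. The function name `rrd_update_args`, the example path, and the choice to treat the stored date/time as UTC are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def rrd_update_args(rrd_path, sample_time, value, tz=timezone.utc):
    """Build an `rrdtool update` argument list that carries the sample's own time.

    sample_time is a 'YYYY-MM-DD HH:MM:SS' string. Interpreting it as UTC
    is an assumption of this sketch; use the poller's real time zone.
    """
    when = datetime.strptime(sample_time, "%Y-%m-%d %H:%M:%S").replace(tzinfo=tz)
    epoch = int(when.timestamp())
    # rrdtool takes "timestamp:value"; "N" in place of the timestamp means "now".
    return ["rrdtool", "update", rrd_path, f"{epoch}:{value}"]


if __name__ == "__main__":
    # Hypothetical file and sample; prints the command instead of running it.
    print(" ".join(rrd_update_args("/tmp/example.rrd", "2010-01-01 00:10:00", "42")))
```

The sketch only builds the command, so it can be tried without touching any RRD file.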
Am I understanding this workaround correctly, or is this going to break something?
Cacti version 0.8.7e running on Solaris. I have found a number of posts in the forum that reference the PHP memory limit. I have doubled it, with no effect on the problem.
Thanks
Not flushing the poller output table on poller start
Re: Not flushing the poller output table on poller start
I just wanted to post a follow-up in case anyone else has the idea to try this.
It works, but there are some risks. After removing the line from the poller, old data now appears in the graphs with a delay, and most of the holes in the graphs have disappeared (it has been running for 36 hours since the change). However, we are getting around 600 data points per day which cannot be added to the graphs, and they just sit in the poller_output table until they are manually cleared.
I believe the problem is that an RRD can have both an "old" update and a "new" update waiting in the database to be inserted into the RRD file. If the new update is inserted first, the RRD records "NaN" for the old interval. When we then try to insert the old value, it cannot be done, so the row sits in the database until it is removed. If, however, the old update is processed first, it is inserted correctly, and the new update then also completes successfully.
Right now this is working for us, and I will continue to monitor it closely and hope for delivery of new hardware.
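This matches how rrdtool behaves: an update is only accepted if its timestamp is later than the file's last update, so the order in which pending rows are applied matters. A minimal sketch of the idea in Python follows. It is not Cacti's code. The function `plan_updates` and the `(rrd_path, epoch, value)` row shape are assumptions for illustration, and the last-update times are assumed to come from something like `rrdtool last`.

```python
from collections import defaultdict


def plan_updates(rows, last_update):
    """Split pending rows into those rrdtool should accept and those it would reject.

    rows:        iterable of (rrd_path, epoch, value) tuples (hypothetical shape)
    last_update: dict mapping rrd_path -> epoch of the last stored update
    Returns (to_apply, stale); to_apply is oldest-first within each RRD file.
    """
    by_rrd = defaultdict(list)
    for rrd_path, epoch, value in rows:
        by_rrd[rrd_path].append((epoch, value))

    to_apply, stale = [], []
    for rrd_path, samples in by_rrd.items():
        newest = last_update.get(rrd_path, 0)
        # Oldest first, so an old sample is never blocked by a newer one.
        for epoch, value in sorted(samples, key=lambda s: s[0]):
            if epoch > newest:
                to_apply.append((rrd_path, epoch, value))
                newest = epoch
            else:
                # Not newer than what is already stored; it can only be discarded.
                stale.append((rrd_path, epoch, value))
    return to_apply, stale


if __name__ == "__main__":
    pending = [("a.rrd", 600, "2"), ("a.rrd", 300, "1"), ("b.rrd", 300, "9")]
    to_apply, stale = plan_updates(pending, {"b.rrd": 300})
    print(to_apply)
    print(stale)
```

Rows that land in `stale` correspond to the data points described above that can never be inserted and would have to be purged from the table.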
Re: Not flushing the poller output table on poller start
What resource on the system is causing the bottleneck?
Is this something that the boost plugin could help with?