Keep info for longer without losing existing data?
Hi Everyone,
I've been using Cacti for around a year now, with quite a few customised pollers for a lot of our equipment here (gas pressures, temperatures, voltages, flow rates, etc.). Quite a lot of our pollers are on 30-second polls due to the type of data we need, and we've always used the standard RRAs for displaying the various time periods.
However, something recently came up where we needed to see what a sensor had been recording several months ago, and the data isn't there. Due to the increased polling frequency of some of our pollers, we only have 3 months of data for many of them (when viewing the 'Yearly' graphs).
So my question is twofold: firstly, how do we go about getting Cacti to store data for longer (say 2 years), and secondly, how do we 'update' the existing installation to store data for longer periods of time without losing the data we currently have?
Any advice is much appreciated!
Cheers,
Re: Keep info for longer without losing existing data?
This is due to rrdtool consolidation; see http://docs.cacti.net/manual:087 and scroll to the bottom for more info.
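For reference, the retention of each RRA works out as step × points-per-row × rows. A rough sketch of the arithmetic, assuming the stock Cacti RRAs and the 30-second step that shows up later in this thread:
Code:
# retention (seconds) = step * pdp_per_row * rows
# stock 'yearly' RRA at the default 300 s step: 288 pdp per row, 797 rows
echo $(( 300 * 288 * 797 ))   # 68860800 s, roughly 797 days, i.e. about 2 years
# the same RRA with a 30 s step:
echo $(( 30 * 288 * 797 ))    # 6886080 s, roughly 80 days - the ~3 months seen on the yearly graphs
In other words, at a 30 s step the stock RRAs hold one tenth of what they would hold at the default 300 s step.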
Re: Keep info for longer without losing existing data?
Hi there,
Thanks very much for the info - unfortunately my experience with Cacti so far has been entirely on the custom-poller side, and I know little (or nothing) about the RRD/RRA aspects of it. I've had a read through that section and I can see that I need to create an RRA with a larger row count and timespan, which I have now done. I attached it to the existing data template and I'm now getting a new graph generated which shows me 2 years. Which is good!...
However... the data in the new graph is still disappearing as time goes by.
i.e. currently the data goes back to 08/08/2010 @ 9pm, but 2 hours ago the graph was showing data back to 6pm on the same day, so it is still clearing out data at the same rate.
The new RRA is as follows:
Steps: 300
Rows: 1152000
Timespan: 66100000
And a datasource debug shows it, for example:
Code:
Data Source Debug
/usr/bin/rrdtool create \
/var/lib/cacti/rra/cdc-chiller01_snmp_oid_436.rrd \
--step 30 \
DS:snmp_oid:GAUGE:600:-1000:1000 \
RRA:AVERAGE:0.5:1:500 \
RRA:AVERAGE:0.5:1:600 \
RRA:AVERAGE:0.5:6:700 \
RRA:AVERAGE:0.5:24:775 \
RRA:AVERAGE:0.5:288:115200 \
RRA:AVERAGE:0.5:300:1152000 \
RRA:MAX:0.5:1:500 \
RRA:MAX:0.5:1:600 \
RRA:MAX:0.5:6:700 \
RRA:MAX:0.5:24:775 \
RRA:MAX:0.5:288:115200 \
RRA:MAX:0.5:300:1152000 \
The graph it produces is correctly showing me approximately 2 years (give or take - I've just estimated the values for now), but as I mentioned above it is still clearing out data at the same rate.
Can you point me at what I'm not understanding, please? Clearly I'm doing something wrong, but I'm not quite sure what.
Cheers!
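(A quick sanity check on those numbers, assuming the 30-second step shown in the debug output above. As far as I understand it, the Timespan field mainly controls how the graph is windowed, while Rows is what determines how much the .rrd actually stores:)
Code:
# each row of the new RRA covers 300 steps * 30 s = 9000 s (2.5 hours)
echo $(( 300 * 30 ))                      # 9000
# rows needed to cover two years at that resolution
echo $(( 2 * 365 * 86400 / 9000 ))        # 7008
# years that 1152000 rows would actually retain
echo $(( 1152000 * 9000 / 86400 / 365 ))  # ~328
So roughly 7,000 rows would already cover the requested two years; 1,152,000 rows is vastly more than the 66,100,000-second timespan implies.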
- gandalf (Developer)
Re: Keep info for longer without losing existing data?
In the very same documentation section you will find a script to automatically resize existing rrd files to your needs.
R.
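For a single file, rrdtool's own resize command does this kind of grow; my understanding is that the script referenced in that documentation section automates something along these lines across many files at once. A minimal per-file sketch, using the file name from the debug output earlier in the thread (rrdtool writes its result to ./resize.rrd in the current directory):
Code:
cd /var/lib/cacti/rra
cp cdc-chiller01_snmp_oid_436.rrd cdc-chiller01_snmp_oid_436.rrd.bak   # keep a backup first
# grow RRA number 3 (the 288-pdp AVERAGE RRA) by 1000 extra rows
rrdtool resize cdc-chiller01_snmp_oid_436.rrd 3 GROW 1000
mv resize.rrd cdc-chiller01_snmp_oid_436.rrd
rrdtool info cdc-chiller01_snmp_oid_436.rrd | grep 'rra\[3\].rows'     # should now report 1797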
Re: Keep info for longer without losing existing data?
Hi there,
Thanks very much for that, and sorry for missing it in the first place!
Right then - as a test I added some rows to the first [0] RRA of an rrd, as per the instructions, and it now shows as follows:
Code:
ds[traffic_in].type = "COUNTER"
ds[traffic_out].type = "COUNTER"
rra[0].cf = "AVERAGE"
rra[0].rows = 8600 <- Increased from 600
rra[1].cf = "AVERAGE"
rra[1].rows = 700
rra[2].cf = "AVERAGE"
rra[2].rows = 775
rra[3].cf = "AVERAGE"
rra[3].rows = 797
rra[4].cf = "MAX"
rra[4].rows = 8600 <- Increased from 600
rra[5].cf = "MAX"
rra[5].rows = 700
rra[6].cf = "MAX"
rra[6].rows = 775
rra[7].cf = "MAX"
rra[7].rows = 797
Which is exactly what I was expecting - I think this will now keep 8600 data points under the first RRA? (Not my intention in the long run, just a test to see if this works.)
Next, I updated the RRD target for the graph which the original RRD was attached to, and generated the graph.
Unfortunately, the graph no longer shows 'previous' data, but it is recording new data starting from now.
So, I guess I'm still missing something?
- gandalf (Developer)
Re: Keep info for longer without losing existing data?
squeak wrote: Which is exactly what I was expecting - I think this will now keep 8600 data points under the first RRA? (Not my intention in the long run, just a test to see if this works.)
Yes.
squeak wrote: Next, I updated the RRD target for the graph which the original RRD was attached to, and generated the graph. Unfortunately, the graph no longer shows 'previous' data, but it is recording new data starting from now.
What exactly did you do when you "updated the RRD target for the graph"?
R.
Re: Keep info for longer without losing existing data?
Hi,
The particular RRD records a single SNMP value (water temperature in this case), so for ease I just, in order:
1) Duplicated an existing data source (Console -> Data Sources -> Duplicate)
2) Edited the 'new' data source and replaced the RRD it generated with my newly 'modified' one (edited the Data Source Path and pointed it to the modified RRD)
3) Edited an existing graph to reference the new data source (Graph Management -> Data Source 1 -> selected the data source from step 1)
Since it didn't work, I'm obviously doing something wrong.
Thanks for your help!!
- gandalf (Developer)
Re: Keep info for longer without losing existing data?
squeak wrote: 2) Edited the 'new' data source and replaced the RRD it generated with my newly 'modified' one (edited the Data Source Path and pointed it to the modified RRD)
Why? As you said, the rrd was modified, so the "old" data source still points to the now-modified rrd. Why did you duplicate the data source, then? Do you assume that both (nearly identical) data sources are polled now?
In fact, the script was written with the intention of doing an in-place resize.
R.
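(The point about in-place resizing, sketched out: if the live file itself is grown, nothing in Cacti needs editing, because the existing data source still points at the same path and the poller keeps updating it. A quick way to confirm the poller is still writing to the resized file, using the path from the debug output earlier in the thread:)
Code:
rrdtool info /var/lib/cacti/rra/cdc-chiller01_snmp_oid_436.rrd | grep last_update
# wait a few 30-second poll cycles, then check that last_update has moved on
sleep 90
rrdtool info /var/lib/cacti/rra/cdc-chiller01_snmp_oid_436.rrd | grep last_update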
Re: Keep info for longer without losing existing data?
Hi,
The RRD I modified was a copy of the live RRD (in case I screwed it up and lost data), so I needed a copy of the data source to point at the new RRD as a test. The idea was that I touched nothing of the original setup, so if I destroyed the RRD or the data source then nothing was lost.
So, are you saying that by doing this with a copy of an existing data source (and a copy of an RRD) I am causing the historic data to go missing?
Cheers!
- gandalf (Developer)
Re: Keep info for longer without losing existing data?
No. In case you copied the rrd file and the data source/graph info, you're nearly there. What is still missing is making the poller aware of the second rrd file. Your method does not touch the poller commands; that's why the second rrd won't get updated.
R.
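(A rough way to convince yourself that a resize kept the history is to dump the original and the resized copy to XML and compare; the file names below are placeholders. The grow operation only adds rows filled with unknown data, so the resized dump should contain everything the original had plus NaN padding:)
Code:
rrdtool dump original.rrd > original.xml
rrdtool dump resized.rrd  > resized.xml
grep -c '<row>' original.xml resized.xml   # the resized file should simply have more rows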
Re: Keep info for longer without losing existing data?
Hi - OK, that's odd, because it is indeed being updated; it is just that the historic data goes missing and the graphs produced from the RRD simply start from the moment I make the change to them.
Can you confirm then that by running that resize script on an existing RRD I should not be losing any of the data previously contained within it?
Cheers!
- gandalf (Developer)
Re: Keep info for longer without losing existing data?
squeak wrote: Hi - OK, that's odd, because it is indeed being updated; it is just that the historic data goes missing and the graphs produced from the RRD simply start from the moment I make the change to them. Can you confirm then that by running that resize script on an existing RRD I should not be losing any of the data previously contained within it?
I can confirm, as I have already run it on several thousand rrd files. But you will of course have to make sure that enough disk space is available. Nevertheless, try it on a few rrd files first to get used to it.
R.
Re: Keep info for longer without losing existing data?
Hi,
Thanks for that - I've just given it a go with a couple of live files and will have to wait until tomorrow to see whether the oldest data goes missing when it does the daily RRA. On the plus side, the existing data going back 6 months has remained in place and it is still graphing and updating correctly, which is cool!
I have one final question which you might be able to assist with. I'm trying to understand the relationship between selecting "Associated RRAs" within a Data Template, and the effect (if any) on an existing RRD file if you change which RRAs are associated with a data template AFTER it has been created.
So for example, if I create a data template and select all 5 of the existing RRAs I currently have, then in the future I create a new RRA and select it as an additional "Associated RRA" within an existing data template - what happens? I can see that it now generates a new graph based on the timespan specified within the new RRA, but does it actually have any effect on the RRD file itself? Most importantly, will it cause any problems with the rolling-up of the data?
I've read the relevant manual sections several times over now and I'm just not quite 'getting' how this relationship works - sorry!
Thanks again,
- gandalf (Developer)
Re: Keep info for longer without losing existing data?
squeak wrote: I'm trying to understand the relationship between selecting "Associated RRAs" within a Data Template, and the effect (if any) on an existing RRD file if you change which RRAs are associated with a data template AFTER it has been created.
Nothing will happen to existing files. This is partly due to missing support from the rrdtool utilities and partly "not yet implemented".
squeak wrote: So for example, if I create a data template and select all 5 of the existing RRAs I currently have, then in the future I create a new RRA and select it as an additional "Associated RRA" within an existing data template - what happens? I can see that it now generates a new graph based on the timespan specified within the new RRA, but does it actually have any effect on the RRD file itself?
It will have no effect at all (with the exception of graphing a 5th timespan), as you already found out.
squeak wrote: Most importantly, will it cause any problems with the rolling-up of the data?
It will not cause "problems". The rrd file simply stays as it is today.
R.
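(If in doubt about what any particular file contains, rrdtool info lists the RRAs actually present; an RRA newly associated with the data template will only appear in .rrd files created after that change. The path below reuses the example file from earlier in the thread:)
Code:
# list the consolidation functions of every RRA actually stored in the file
rrdtool info /var/lib/cacti/rra/cdc-chiller01_snmp_oid_436.rrd | grep '\.cf'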
Re: Keep info for longer without losing existing data?
Hi,
OK, I've hit a bit of a problem with growing some of the files. I get some very odd output after growing them, so to try and assist you in assisting me, here is everything I have done.
rrdtool info output of the original file (working fine, before any changes):
Code:
rrd_version = "0003"
step = 30
last_update = 1292001483
ds[snmp_oid].type = "GAUGE"
ds[snmp_oid].minimal_heartbeat = 600
ds[snmp_oid].min = -1.0000000000e+03
ds[snmp_oid].max = 1.0000000000e+03
ds[snmp_oid].last_ds = "60"
ds[snmp_oid].value = 1.8000000000e+02
ds[snmp_oid].unknown_sec = 0
rra[0].cf = "AVERAGE"
rra[0].rows = 600
rra[0].cur_row = 382
rra[0].pdp_per_row = 1
rra[0].xff = 5.0000000000e-01
rra[0].cdp_prep[0].value = NaN
rra[0].cdp_prep[0].unknown_datapoints = 0
rra[1].cf = "AVERAGE"
rra[1].rows = 700
rra[1].cur_row = 272
rra[1].pdp_per_row = 6
rra[1].xff = 5.0000000000e-01
rra[1].cdp_prep[0].value = 0.0000000000e+00
rra[1].cdp_prep[0].unknown_datapoints = 0
rra[2].cf = "AVERAGE"
rra[2].rows = 775
rra[2].cur_row = 543
rra[2].pdp_per_row = 24
rra[2].xff = 5.0000000000e-01
rra[2].cdp_prep[0].value = 7.0500000000e+02
rra[2].cdp_prep[0].unknown_datapoints = 0
rra[3].cf = "AVERAGE"
rra[3].rows = 797
rra[3].cur_row = 399
rra[3].pdp_per_row = 288
rra[3].xff = 5.0000000000e-01
rra[3].cdp_prep[0].value = 3.5552000000e+03
rra[3].cdp_prep[0].unknown_datapoints = 0
rra[4].cf = "MAX"
rra[4].rows = 600
rra[4].cur_row = 574
rra[4].pdp_per_row = 1
rra[4].xff = 5.0000000000e-01
rra[4].cdp_prep[0].value = NaN
rra[4].cdp_prep[0].unknown_datapoints = 0
rra[5].cf = "MAX"
rra[5].rows = 700
rra[5].cur_row = 33
rra[5].pdp_per_row = 6
rra[5].xff = 5.0000000000e-01
rra[5].cdp_prep[0].value = 6.0000000000e+01
rra[5].cdp_prep[0].unknown_datapoints = 0
rra[6].cf = "MAX"
rra[6].rows = 775
rra[6].cur_row = 481
rra[6].pdp_per_row = 24
rra[6].xff = 5.0000000000e-01
rra[6].cdp_prep[0].value = 6.0000000000e+01
rra[6].cdp_prep[0].unknown_datapoints = 0
rra[7].cf = "MAX"
rra[7].rows = 797
rra[7].cur_row = 157
rra[7].pdp_per_row = 288
rra[7].xff = 5.0000000000e-01
rra[7].cdp_prep[0].value = 6.5000000000e+01
rra[7].cdp_prep[0].unknown_datapoints = 0
After running a 'grow' command as follows:
# perl /var/lib/resize.pl -f <sourcefile> -r 3 -o <outputfile> -g 1000
...the rrdtool info output of the resulting file is:
Code:
rrd_version = "0003"
step = 30
last_update = 1292001483
ds[snmp_oid].type = "GAUGE"
ds[snmp_oid].minimal_heartbeat = 600
ds[snmp_oid].min = -1.0000000000e+03
ds[snmp_oid].max = 1.0000000000e+03
ds[snmp_oid].last_ds = "60"
ds[snmp_oid].value = 1.8000000000e+02
ds[snmp_oid].unknown_sec = 0
rra[0].cf = "AVERAGE"
rra[0].rows = 600
rra[0].cur_row = 382
rra[0].pdp_per_row = 1
rra[0].xff = 5.0000000000e-01
rra[0].cdp_prep[0].value = NaN
rra[0].cdp_prep[0].unknown_datapoints = 0
rra[1].cf = "AVERAGE"
rra[1].rows = 700
rra[1].cur_row = 272
rra[1].pdp_per_row = 6
rra[1].xff = 5.0000000000e-01
rra[1].cdp_prep[0].value = 0.0000000000e+00
rra[1].cdp_prep[0].unknown_datapoints = 0
rra[2].cf = "AVERAGE"
rra[2].rows = 775
rra[2].cur_row = 543
rra[2].pdp_per_row = 24
rra[2].xff = 5.0000000000e-01
rra[2].cdp_prep[0].value = 7.0500000000e+02
rra[2].cdp_prep[0].unknown_datapoints = 0
rra[3].cf = "AVERAGE"
rra[3].rows = 1797 <--- has 'grown' correctly
rra[3].cur_row = 399
rra[3].pdp_per_row = 288
rra[3].xff = 5.0000000000e-01
rra[3].cdp_prep[0].value = 3.5552000000e+03
rra[3].cdp_prep[0].unknown_datapoints = 0
rra[4].cf = "MAX"
rra[4].rows = 600
rra[4].cur_row = 574
rra[4].pdp_per_row = 1
rra[4].xff = 5.0000000000e-01
rra[4].cdp_prep[0].value = NaN
rra[4].cdp_prep[0].unknown_datapoints = 0
rra[5].cf = "MAX"
rra[5].rows = 700
rra[5].cur_row = 33
rra[5].pdp_per_row = 6
rra[5].xff = 5.0000000000e-01
rra[5].cdp_prep[0].value = 6.0000000000e+01
rra[5].cdp_prep[0].unknown_datapoints = 0
rra[6].cf = "MAX"
rra[6].rows = 775
rra[6].cur_row = 481
rra[6].pdp_per_row = 24
rra[6].xff = 5.0000000000e-01
rra[6].cdp_prep[0].value = 6.0000000000e+01
rra[6].cdp_prep[0].unknown_datapoints = 0
rra[7].cf = "MAX"
rra[7].rows = 797
rra[7].cur_row = 157
rra[7].pdp_per_row = 288
rra[7].xff = 5.0000000000e-01
rra[7].cdp_prep[0].value = 6.5000000000e+01
rra[7].cdp_prep[0].unknown_datapoints = 0
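(A sanity check on what that grow buys, using the step and pdp_per_row values shown above - note that at a 30 s step this lowest-resolution RRA covers 2.4 hours per row rather than a full day:)
Code:
echo $(( 288 * 30 ))               # 8640 s covered by each rra[3] row
echo $(( 8640 * 797 / 86400 ))     # ~79 days retained before the grow
echo $(( 8640 * 1797 / 86400 ))    # ~179 days retained after the grow
echo $(( 2 * 365 * 86400 / 8640 )) # 7300 rows would be needed for two years at this resolution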
Graphs from before and after are attached - hopefully you can see the odd behaviour!
I have scaled the before graph to match the after graph so that you can see a direct comparison. The 'after' graph is showing ALL the data.
Oh, and just to clarify: the only data contained within this file is the orange line; the other 4 are sourced from other RRD files which I have not modified - they just happen to be rendered on the same graph.
Finally, just to clarify what I'm actually trying to do (in case I'm going about it completely the wrong way): all I want to do is take the 1-day average (the lowest resolution) and extend the period for which it is kept, so that the rrd stops dropping data off the 'back' of the timeline. Currently it is clearing out data from mid-September, as you can see, and I just want that data not to be erased as time goes on.
Hopefully you can tell me what I'm doing wrong!
Thanks SO MUCH for your help, it is genuinely appreciated!!
- Attachments
- 02-after.jpg (124.35 KiB)
- 01-before.jpg (117.69 KiB)