I have been experimenting with InfluxDB3 recently to see whether there are good reasons to move to it, and I keep finding what I think are some pros.
I have a fully functional InfluxDB3 instance up and running side by side with my v2 instance. It is being populated by the MQTT Exporter from Hubitat. One interesting thing was seeing how much raw data I have in it. I have a few tables with over 20 million records and many with between 1 million and 20 million.
I am starting to look for ways to reduce that number, and I am curious whether anyone else has tried to tackle this. Right now with InfluxDB3 I am testing one of the provided plugins, a downsampler. Simply put, it takes a measurement, analyzes the values over given intervals, and then outputs some basic numbers that give a representative summary of that measurement. Think calculated/derived values like average, min, and max for a given interval. This can consolidate the values considerably. That change, along with a short retention period, can keep the retained data small. This could be good for data I want to keep long term where I don't need to worry about the fine details of what happened on a given day.
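To make the idea concrete, here is a rough Python sketch of what that kind of downsampling does. This is not the plugin's code or its configuration; the function name `downsample`, the record layout, and the sample readings are all made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, interval_s):
    """Summarize (epoch_seconds, value) points per fixed-size time window.

    Hypothetical helper for illustration only; not part of InfluxDB.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of the window it falls in.
        window_start = ts - (ts % interval_s)
        buckets[window_start].append(value)

    # One summary row per window, ordered by time.
    return [
        {
            "window_start": start,
            "avg": mean(vals),
            "min": min(vals),
            "max": max(vals),
            "count": len(vals),
        }
        for start, vals in sorted(buckets.items())
    ]


# Made-up temperature readings as (seconds, value) pairs.
readings = [(0, 70.0), (60, 71.0), (120, 72.0), (3600, 68.0), (3660, 66.0)]

# Collapse them into one row per hour.
summary = downsample(readings, 3600)
for row in summary:
    print(row)
```

Here five raw points become two hourly rows. Once the summary rows are stored, the raw points can age out under a short retention period.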
The second option I am looking at is using SQL to remove duplicate records where the value hasn't changed. Think of something like a contact sensor that has been closed for months but has a record every x minutes. That could potentially eliminate a significant number of unneeded records. I know InfluxDB Logger has some logic to prevent posting records that haven't changed, but I am sure the version I used before @dennypage took it over didn't do that, and now that I am experimenting with MQTT Exporter I don't think it does that either. It seems to post updated values for every attribute of a device if anything changes on that device.
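I haven't worked out the SQL for this yet, so here is the logic sketched in Python instead: keep a record only when its value differs from the previous record for the same device and attribute. The field names (`time`, `device`, `attribute`, `value`) and the sample data are assumptions for the example, not the actual schema MQTT Exporter writes.

```python
def drop_unchanged(records):
    """Keep only records whose value changed since the previous record
    of the same (device, attribute) series.

    Hypothetical helper for illustration only; field names are assumed.
    """
    last_seen = {}
    kept = []
    # Process in time order so "previous" means the previous reading.
    for rec in sorted(records, key=lambda r: r["time"]):
        key = (rec["device"], rec["attribute"])
        # Always keep the first record of a series, then only changes.
        if key not in last_seen or last_seen[key] != rec["value"]:
            kept.append(rec)
            last_seen[key] = rec["value"]
    return kept


# Made-up contact sensor that reports every interval even when nothing changes.
raw = [
    {"time": 0, "device": "Front Door", "attribute": "contact", "value": "closed"},
    {"time": 1, "device": "Front Door", "attribute": "contact", "value": "closed"},
    {"time": 2, "device": "Front Door", "attribute": "contact", "value": "closed"},
    {"time": 3, "device": "Front Door", "attribute": "contact", "value": "open"},
    {"time": 4, "device": "Front Door", "attribute": "contact", "value": "open"},
    {"time": 5, "device": "Front Door", "attribute": "contact", "value": "closed"},
]

deduped = drop_unchanged(raw)
for rec in deduped:
    print(rec["time"], rec["value"])
```

One thing to watch with this approach: dropping the repeats leaves long gaps between points, so a graph that doesn't carry the last value forward may look empty for a sensor that simply hasn't changed.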
I would imagine this is an issue that hits everyone who retains data for long-term visualization. I am just curious how others are handling this, even if they don't use InfluxDB.
