On Fri, Mar 18, 2022 at 1:35 AM Nathan Bossart <nathandboss...@gmail.com> wrote:
>
> I guess I think we should be slightly more ambitious.  One idea could be to
> create a default_toast_compression_ratio GUC with a default of 0.95.  This
> means that, by default, a compressed attribute must be 0.95x or less of the
> size of the uncompressed attribute to be stored compressed.  Like
> default_toast_compression, this could also be overridden at the column
> level with something like

I am not sure that we want a GUC to control that, but we can certainly be
more ambitious.  Basically, in the current patch, if the data is even
slightly large then we would always prefer to store the compressed data,
e.g. if the data size is 200kB then even if the compression ratio is as low
as 1% (that is, compression saves only about 1% of the size) we would still
choose to store the compressed data.

I think we can base the decision on the compression ratio and then bound it
with the difference in the number of chunks.  For example, if the
compression ratio is < 10% then store the data uncompressed iff the chunk
difference is < threshold.  But with that we might see a performance impact
on smaller data that has a compression ratio < 10%, because its chunk
difference will always be under a fixed threshold.  So maybe the chunk
difference threshold can be a function of the total number of chunks
required for the data, maybe a logarithmic function, so that the threshold
grows slowly along with the base data size.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
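
To make the proposed rule concrete, here is a rough sketch in Python, purely
for illustration and not taken from the patch.  All names are hypothetical.
It assumes a chunk size of 1996 bytes (which I believe matches
TOAST_MAX_CHUNK_SIZE with the default 8kB block size), a 10% cutoff as in
the example above, and floor(log2(chunks)) + 1 as one possible logarithmic
threshold; the actual function and constants would need benchmarking.

```python
# Assumption: approximate TOAST chunk payload size for 8kB pages.
CHUNK_SIZE = 1996


def num_chunks(size: int) -> int:
    """Number of chunks needed to store `size` bytes (ceiling division)."""
    return (size + CHUNK_SIZE - 1) // CHUNK_SIZE


def chunk_diff_threshold(raw_chunks: int) -> int:
    """Hypothetical threshold growing logarithmically with the chunk count.

    For n >= 1, int.bit_length() equals floor(log2(n)) + 1, which avoids
    floating-point rounding.  For n == 0 it returns 0.
    """
    return raw_chunks.bit_length()


def store_compressed(raw_size: int, compressed_size: int,
                     min_savings: float = 0.10) -> bool:
    """Decide whether to keep the compressed form of a value."""
    if compressed_size >= raw_size:
        return False  # compression did not help at all

    savings = 1.0 - compressed_size / raw_size
    if savings >= min_savings:
        return True  # good ratio: always store compressed

    # Poor ratio: store compressed only if enough chunks are saved
    # relative to the size of the value.
    raw_chunks = num_chunks(raw_size)
    saved_chunks = raw_chunks - num_chunks(compressed_size)
    return saved_chunks >= chunk_diff_threshold(raw_chunks)
```

With these assumed constants, a 200kB value that compresses by only 1% saves
a single chunk and would be stored uncompressed, while a much larger value
with a similarly poor ratio can still save enough chunks to be stored
compressed.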