On Wed, Nov 13, 2019 at 11:33 AM Robert Haas <robertmh...@gmail.com> wrote:
> On Tue, Nov 12, 2019 at 6:22 PM Peter Geoghegan <p...@bowt.ie> wrote:
> > * Disabled deduplication in system catalog indexes by deeming it
> > generally unsafe.
>
> I (continue to) think that deduplication is a terrible name, because
> you're not getting rid of the duplicates. You are using a compressed
> representation of the duplicates.
"Deduplication" never means that you get rid of duplicates. According
to Wikipedia's deduplication article: "Whereas compression algorithms
identify redundant data inside individual files and encodes this
redundant data more efficiently, the intent of deduplication is to
inspect large volumes of data and identify large sections – such as
entire files or large sections of files – that are identical, and
replace them with a shared copy".

That seemed like a good fit for what this patch does. We're concerned
with a specific, simple kind of redundancy. Also:

* From the user's point of view, we're merging together what they'd
call duplicates. They don't really think of the heap TID as part of
the key.

* The term "compression" suggests a decompression penalty when
reading, which is not the case here.

* The term "compression" confuses the feature added by the patch with
TOAST compression. Now we may have two very different varieties of
compression in the same index.

Can you suggest an alternative?

-- 
Peter Geoghegan
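To make the distinction concrete, here is a toy sketch (not PostgreSQL
code; the keys and TID tuples are made up) of what "deduplication" in
this sense does: N index entries with equal keys are replaced by one
shared key plus a list of heap TIDs, and reading the result back
requires no decompression step.

```python
def deduplicate(entries):
    """Merge (key, tid) entries with equal keys into (key, [tids]).

    This mirrors the "replace identical sections with a shared copy"
    idea from the Wikipedia definition: the duplicates are still fully
    represented (every TID survives), just stored once per key.
    """
    merged = {}
    for key, tid in entries:
        merged.setdefault(key, []).append(tid)
    return sorted(merged.items())

# Three "apple" duplicates collapse into one key with a shared TID list.
entries = [("apple", (1, 1)), ("apple", (1, 2)),
           ("pear", (2, 1)), ("apple", (3, 4))]
print(deduplicate(entries))
```

A general-purpose compressor, by contrast, would transform the bytes
themselves and force every reader to decode them first.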