Hi Bruce,

On Mon, May 4, 2020 at 8:16 PM Bruce Momjian <br...@momjian.us> wrote:
> I have committed the first draft of the PG 13 release notes.  You can
> see them here:
>
>         https://momjian.us/pgsql_docs/release-13.html

I see that you have an entry for the deduplication feature:

"More efficiently store duplicates in btree indexes (Anastasia
Lubennikova, Peter Geoghegan)"

I would like to provide some input on this. Fortunately it's much
easier to explain than the B-Tree work that went into Postgres 12. I
think that you should point out that deduplication works by storing
the duplicates in the obvious way: the key is stored only once per
distinct value (or once per distinct combination of values in the case
of multi-column indexes), followed by an array of TIDs (i.e. a posting
list). Each TID points to a separate row in the table.

It won't be uncommon for this to make indexes as much as 3x smaller
(it depends on a number of different factors that you can probably
guess). I wrote a summary of how it works for power users in the
B-Tree documentation chapter, which you might want to link to in the
release notes:

https://www.postgresql.org/docs/devel/btree-implementation.html#BTREE-DEDUPLICATION

Users who upgrade with pg_upgrade will have to REINDEX to actually use
the feature, regardless of which version they've upgraded from. There
are also some limited caveats about the data types that can use
deduplication, and stuff like that -- see the documentation section I
linked to.

Finally, you might want to note that the feature is enabled by
default, and can be disabled by setting the "deduplicate_items" index
storage option to "off". (We have yet to make a final decision on
whether it should stay enabled by default in the first stable release
of Postgres 13, though. I have an open item for that.)

--
Peter Geoghegan
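
To illustrate the posting list representation described above, here is
a minimal Python sketch. It is only a toy model, not PostgreSQL's
implementation: the function name deduplicate, the TID type alias, and
the modeling of a TID as a (block number, offset number) pair are all
assumptions made for this example. It shows each distinct key being
stored once, followed by the list of TIDs that share it.

```python
from itertools import groupby
from typing import List, Tuple

# Assumption for this sketch: a TID is a (block number, offset number) pair.
TID = Tuple[int, int]
# A key is a tuple of column values, so multi-column indexes work the same way.
Key = Tuple


def deduplicate(entries: List[Tuple[Key, TID]]) -> List[Tuple[Key, List[TID]]]:
    """Collapse (key, tid) index entries into (key, posting list) pairs.

    Hypothetical helper: entries are sorted by key and then by TID, and
    every run of equal keys becomes one key plus an array of TIDs.
    """
    ordered = sorted(entries)
    return [
        (key, [tid for _, tid in run])
        for key, run in groupby(ordered, key=lambda entry: entry[0])
    ]


if __name__ == "__main__":
    entries = [
        (("red",), (0, 1)),
        (("blue",), (0, 2)),
        (("red",), (1, 1)),
        (("red",), (0, 3)),
    ]
    for key, posting_list in deduplicate(entries):
        print(key, posting_list)
```

In this toy input, four index entries carry four copies of a key; after
deduplication only two keys are stored, and the three "red" entries
share one key followed by three TIDs. Real space savings depend on key
width, the number of duplicates, and other factors the sketch ignores.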