I wrote a prototype quality patch that implements what I call "classic" suffix truncation. That is, the patch makes nbtree leaf page splits generate a short prefix datum for the new high key for variable-length datatypes, generated using new opclass infrastructure. This is useful for string-like datatypes such as text, especially when we happen to be indexing long text strings, where we don't truly need to use long strings in internal pages. The feature is "classic" in the sense that it works in more or less the way that Bayer and Unterauer anticipate in the "Prefix B-Trees" paper. We already have a limited form of suffix truncation that only works at the whole-column granularity. To some degree, I approached writing this patch as finishing off the work that was started in Postgres 12. It started out that way, at least.
I believe that the prototype is representative of the benefits that we'd get from a high quality implementation of the same feature. Those benefits turn out to be relatively small, which is discouraging. I'm not going to pursue the project any further in the near-term, because there is more important work to be done, but I thought that I'd share what I found here. Highlights: * There were changes required to nbtsplitloc.c to make it work well, unsurprisingly. It has to be sensitive to the final size of the new high key, rather than just caring about the number of columns that can be truncated. At the same time, it's important to not break the duplicate handling stuff. I ended up changing the penalty logic for leaf splits, making it give primary consideration to the discrete cost that is the total number of columns that are left in the new high key, and secondary consideration to the continuous cost that is the exact final size of the new high key with classic suffix truncation. This seemed to work well, though I didn't get as far as worrying about the extra cycles (I only implemented classic suffix truncation for "C" locale text). * I saw a tiny reduction in the number of leaf pages with affected indexes in my test suite, and small reductions in the number of internal pages. * There were 2 or 3 existing indexes from my test suite that happened to index very large strings, such as an index on a "notes" column. These indexes had massive reductions in the number of internal pages (e.g. a 5x+ reduction), along with a very small reduction in the number of leaf pages. Users that happen to have a lot of indexes that look like this are likely to find classic suffix truncation compelling, but that doesn't seem like a good enough reason to push ahead with the patch. -- Peter Geoghegan