On Thu, May 1, 2025 at 10:44 PM Bruce Momjian <br...@momjian.us> wrote:
> I have committed the first draft of the PG 18 release notes.

I suggest that you use something like the following wording for the
skip scan feature:

Add the "skip scan" optimization, which enables more efficient scans
of multicolumn B-tree indexes for queries that omit an "=" condition
on one or more prefix index columns.

This is similar to the wording that appeared in the beta1 announcement.

The term "skip scan" has significant baggage -- we need to be careful
to not add to the confusion. There are naming conflicts, which seem
likely to confuse some users. Various community members have in the
past referred to a feature that MySQL calls loose index scan as skip
scan, which seems wrong to me -- it clashes with the naming
conventions used by other RDBMSs, for no good reason. Skip scan and
loose index scan are in fact rather different features.

For example, the TimescaleDB Postgres extension offers loose index
scan, which (for whatever reason) they chose to call skip scan:

https://www.timescale.com/blog/how-we-made-distinct-queries-up-to-8000x-faster-on-postgresql

Note that loose index scan can only be used with certain kinds of
queries involving DISTINCT or GROUP BY, whereas skip scan (in Oracle
and now in Postgres) can work with any query that omits one or more
"=" conditions on a prefix index column from a multicolumn index (when
a later index column has some condition that can be used by the scan)
-- it doesn't have to involve aggregation. I believe that describing
the feature along these lines will make it less likely that users will
be confused by the apparent naming conflict.
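
As a rough illustration of the distinction, here is a toy Python model.
It is only a sketch under stated assumptions: a sorted list of (a, b)
tuples stands in for a multicolumn B-tree index on (a, b), and the
function names are hypothetical. It does not reflect how Postgres
actually implements the optimization.

```python
from bisect import bisect_left, bisect_right

# Toy "index": sorted (a, b) tuples standing in for a B-tree on (a, b).
# This is an illustrative assumption, not a real index structure.
INDEX = sorted((a, b) for a in (1, 2, 3) for b in range(10, 15))


def next_distinct_a(index, pos):
    """Seek past every entry that shares index[pos]'s value of a."""
    return bisect_right(index, (index[pos][0], float("inf")))


def skip_scan(index, b_value):
    """Rows matching b = b_value, with no condition on the leading column a.

    For each distinct a, seek directly to (a, b_value) instead of
    reading every entry in the index.
    """
    rows, pos = [], 0
    while pos < len(index):
        a = index[pos][0]
        lo = bisect_left(index, (a, b_value))
        hi = bisect_right(index, (a, b_value))
        rows.extend(index[lo:hi])
        pos = next_distinct_a(index, pos)
    return rows


def loose_index_scan(index):
    """Distinct values of a, roughly what SELECT DISTINCT a needs.

    Only the first entry of each group is read; no rows are filtered.
    """
    values, pos = [], 0
    while pos < len(index):
        values.append(index[pos][0])
        pos = next_distinct_a(index, pos)
    return values
```

Both functions hop between groups of the leading column in the same
way, but skip_scan returns ordinary rows filtered on a later column,
while loose_index_scan only yields one value per group. The number of
hops equals the number of distinct values of a.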

FWIW, I don't think that it's important that the release notes point
out that skip scan is only helpful when the leading/skipped column is
low cardinality (though that detail is accurate).

-- 
Peter Geoghegan
