I am working on a research project involving a datatype, D, with the
following characteristics:

- A value of D is a variable-length binary string.

- A value of D usually gives rise to a single index term, but there could
occasionally be more, (up to some configurable maximum).

- The index terms are int64.

GIN looks like a good fit for my requirements, and I've done a little
experimentation (using btree_gin) to determine that the optimizer is
working as I'd hoped.

So I'm going to proceed with GIN indexes, writing an extension for my
datatype. But I'm wondering about btrees and GIN, in general. This
discussion was pretty interesting:
https://www.postgresql-archive.org/Questions-about-btree-gin-vs-btree-gist-for-low-cardinality-columns-td6088041.html
.

My main questions are about the remaining differences between btree and GIN
indexes. With btree key deduplication in Postgres 12, one of the main
differences between btrees and GIN is gone, (my understanding is that GIN
has done this optimization for a long time).

I think that one other major difference between the two is that GIN can
handle multiple keys per row, while a btree cannot. Is there some
fundamental reason why the btree cannot accommodate multiple keys per row?
I think that this would have no impact on the btree structure itself. The
only difference would be that the same row (ctid) could be associated with
multiple keys.

I guess the top-level question is this: Is it possible in principle for the
btree index to subsume GIN index capabilities?

Jack Orenstein

Reply via email to