On Tue, May 4, 2021 at 11:52 AM Jeff Davis <pg...@j-davis.com> wrote:
> On Mon, 2021-05-03 at 15:07 -0700, Peter Geoghegan wrote:
> > It seems senseless to *require* table AMs to support something like a
> > bitmap scan.
>
> I thought about this some more, and this framing is backwards.
> ItemPointers are fundamental to the table AM API: they are passed in to
> required methods, and expected to be returned[1].

I prefer my framing, but okay, let's go with yours. What difference does it make?

The fact that we're starting with the table AM API doesn't change the fundamental fact that quite a few implementation details that are local to code like the GIN AM and tidbitmap.c were (rightly or wrongly) simply built with heapam in mind. The fact that that's true is hardly surprising, and hardly argues against the idea of having a table AM to begin with.

There is no getting around the need to talk about the first principles here, and to talk about the specific implications for your particular table AM (perhaps others too). Abstractions are only useful when they serve concrete implementations. Of course they should be as general and abstract as possible -- but no more.

> Bitmap scans are optional, but that should be determined by whether the
> author wants to implement the bitmap scan methods of their table AM.
> The fine details of ItemPointer representation should not be making the
> decision for them.

A distinction without a difference. If bitmap scans are optional and some index AMs are 100% built from the ground up to work only with bitmap scans, then those index AMs are effectively optional (or optional to the extent that bitmap scans themselves are optional).

I have absolutely no idea how it would be possible to make GIN work without having bitmap scans. It would be so different that it wouldn't be GIN anymore. I think maybe it is possible for GIN to work with your column store table AM in particular. Why aren't we talking about that concrete issue, or something like that? We're talking about this abstraction as if it must already be perfect, and therefore the standard by which every other thing needs to be measured. But why?

> We still need to answer the core question that started this thread:
> what the heck is an ItemPointer, anyway?
>
> After looking at itemptr.h, off.h, ginpostinglist.c and tidbitmap.c, it
> seems that an ItemPointer is a block number from [0, 0xFFFFFFFe]; and
> an offset number from [1, MaxHeapTuplesPerPage] which is by default [1,
> 291].
>
> Attached is a patch that clarifies what I've found so far and gives
> clear guidance to table AM authors. Before I commit this I'll make sure
> that following the guidance actually works for the columnar AM.

I don't get what the point of this patch is. Obviously all of the particulars here are just accidents of history that we ought to change sooner or later anyway. I don't have any objection to writing them all down someplace official. But what difference does it make if there is no underlying *general* set of principles behind any of it? This definition of a TID can break at any time because it just isn't useful or general. This is self-evident -- your definition includes MaxHeapTuplesPerPage! How could that possibly be anything other than an accident whose details are completely arbitrary and therefore subject to change at any time?

This is not necessarily a big deal! We can fix it by reconciling things in a pragmatic, bottom-up way. That's what I expected would happen all along. The table AM is not the Ark of the Covenant (just like tidbitmap.c, or anything else).

--
Peter Geoghegan
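
For reference, the TID range quoted above (block number in [0, 0xFFFFFFFe], offset number in [1, MaxHeapTuplesPerPage], by default [1, 291]) can be written down as a small standalone check. This is only a sketch of the quoted definition, not PostgreSQL source: `SketchTid`, `sketch_tid_in_range`, and the `SKETCH_*` constants are hypothetical names, and the value 291 is simply the default cited in the thread, hard-coded here for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the real PostgreSQL macros. */
#define SKETCH_MAX_BLOCK_NUMBER 0xFFFFFFFEu /* upper block bound quoted in the thread */
#define SKETCH_MAX_OFFSET       291u        /* default MaxHeapTuplesPerPage quoted in the thread */

/* A simplified (block, offset) pair; the real ItemPointerData is laid out differently. */
typedef struct SketchTid
{
    uint32_t block;
    uint16_t offset;
} SketchTid;

/*
 * Returns true if the TID falls inside the range described in the quoted
 * patch summary: block in [0, 0xFFFFFFFE], offset in [1, 291].
 */
static bool
sketch_tid_in_range(SketchTid tid)
{
    return tid.block <= SKETCH_MAX_BLOCK_NUMBER &&
           tid.offset >= 1 &&
           tid.offset <= SKETCH_MAX_OFFSET;
}
```

Because the upper offset bound here is a heap-specific constant, a table AM that packs more than 291 logical rows into one "block" would fail this check, which is exactly the kind of arbitrariness the reply above objects to.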