Re: Status of the table access method work

Andres Freund Mon, 08 Apr 2019 09:39:36 -0700

Hi

On 2019-04-08 14:53:53 +0300, Heikki Linnakangas wrote:
> On 05/04/2019 23:25, Andres Freund wrote:
> > - the (optional) bitmap heap scan API - that's fairly intrinsically
> >    block based. An AM could just internally subdivide TIDs in a different
> >    way, but I don't think a bitmap scan like we have would e.g. make a
> >    lot of sense for an index oriented table without any sort of stable
> >    tid.
> 
> If an AM doesn't implement the bitmap heap scan API, what happens? Bitmap
> scans are disabled?


Yea, the planner doesn't consider them. It just masks the index's
amhasgetbitmap.  Seems to be the most reasonable thing to do?


> Even if an AM isn't block-oriented, the bitmap heap scan API still makes
> sense as long as there's some correlation between TIDs and physical
> location.

Yea, it could be a non-linear mapping. But I'm honestly not sure how
many non-block oriented AMs with such a correlation there are - I mean
you're not going to have that in say an IOT. And it'd be trivial to just
"fake" a block mapping for an in-memory AM.


> The only really broken thing about that currently is the
> prefetching: nodeBitmapHeapScan.c calls PrefetchBuffer() directly with the
> TID's block numbers. It would be pretty straightforward to wrap that in a
> callback, so that the AM could do something different.

That, and the VM_ALL_VISIBLE() checks both in nodeBitmapHeapscan.c and
nodeIndexonlyscan.c.


> Or move even more of the logic to the AM, so that the AM would get the whole
> TIDBitmap in table_beginscan_bm(). It could then implement the fetching and
> prefetching as it sees fit.
> 
> I don't think it's urgent, though. We can cross that bridge when we get
> there, with the first AM that needs that flexibility.

Yea, it seemed nontrivial (not in really hard, just not obvious), and
the implicated code duplication scared me away.


> > The most constraining factor for storage, I think, is that currently the
> > API relies on ItemPointerData style TIDs in a number of places (i.e. a 6
> > byte tuple identifier).
> 
> I think 48 bits would be just about enough

I don't think that's really true. Consider e.g. implementing an index
oriented table - there's no way you can efficiently implement one with
that small a key. You basically need a helper index just to have
efficient and small enough tids.  And given that we're also going to
need wider tids for global indexes, I suspect we're just going to have
to bite into the sour apple and make tids variable width.


> , but it's even more limited than
> you might at the moment. There are a few places that assume that the
> offsetnumber <= MaxHeapTuplesPerPage. See ginpostinglist.c, and
> MAX_TUPLES_PER_PAGE in tidbitmap.c. Also, offsetnumber can't be 0, because
> that makes the ItemPointer invalid, which is inconvenient if you tried to
> use ItemPointer as just an arbitrary 48-bit integer.

Good point.

Thanks for looking (and playing, in the other thread)!

Greetings,

Andres Freund

Re: Status of the table access method work

Reply via email to