I just finished looking through the various threads about index AM changes that Bruce has been holding onto in the commit-fest queue. There seem to be the following issues:
* Proposed change to have amgetmulti return its results by ORing bits into a caller-supplied TIDBitmap, rather than the current interface involving an intermediate TID array. The TID-array design was intended to be agnostic about what would happen on either side of the API, but it seems that it's really just equally inconvenient for everybody :-(. Although the original motivation for this was to open a door for bitmap indexes, it seems to me it's worth doing whether or not bitmap indexes ever happen. It's logically simpler (only one amgetmulti call per scan) and should save at least a few cycles. I recommend we just go ahead with this. * Proposed change to let both amgetnext and amgetmulti mark returned tuples as "candidate matches", that is in need of rechecking of quals against the real heap tuple. I had originally objected to this on the grounds that it would require setup work that doesn't happen now, but looking at the code I see that's wrong --- the setup work *does* happen anyway, see indexqualorig. nodeIndexscan.c comments "The indexqualorig expression is always initialized even though it will only be used in some uncommon cases --- would be nice to improve that". So this complaint is probably a red herring. I'm still a bit concerned by the fact that the patch only tracks this on a page basis in the amgetmulti case: if any of the tuples on a page are in-doubt then everything on the page will have to be rechecked. Is that good enough? A bigger issue is whether this is worth applying when nobody seems to be working on either of the main uses for it (bitmap indexes and GIT indexes). There seemed to be some possible marginal use for it in GIST indexes, but I'm not convinced that's a sufficient reason to complicate the APIs. * Proposed change to let amgetnext mark returned tuples as not necessarily in order, requiring somebody downstream to sort them again. I was pretty desperately unhappy with that because it required injection of sorting knowledge into code that really shouldn't have anything to do with it --- not least because not all indexscans even have a defined ordering. Given that it is only needed for one particular possible implementation of GIT, and Heikki was leaning away from that implementation anyway at last report, I vote against considering this any further. * Proposed changes to refactor the TIDBitmap support to include a concept of a "stream bitmap" being read from disk. (Not strictly an index AM change, but closely related.) While this is clean and nice on its own, it doesn't seem to have any use unless bitmap indexes happen. If we leave the code sit it'll probably acquire bit rot, but I'm disinclined to add a bunch of unused code, too. Thoughts? * Proposed changes to allow amgetnext to return the actual index keys, allowing certain types of "unindexable" quals to be checked without having to fetch the heap entry. For example a LIKE condition could be fully checked against the index entry. Since certain types of indexes (GIN now, possibly hash in future) are incapable of doing this, there'd need to be a way of marking index AMs (or perhaps individual indexes) as able or not able to return their keys, so that the planner could know whether quals could be pushed into the indexscan. We don't have any actual submitted patch for this, but AFAIR everyone agrees it's a good idea. * Bitmap indexes themselves. As far as I can tell, this development effort has stalled because of intractable problems with VACUUM. Can anyone provide a status update on that? * GIT (Grouped Index Tuple) indexes, which achieve index space savings in btrees by having a single index tuple represent multiple heap tuples (on a single heap page) containing a range of key values. I am not sure what the development status is --- Heikki had submitted a completed patch but there seemed to be agreement on making changes, and that's not been done AFAIK. The really serious problem I've got with it is that it'd foreclose the possibility of returning actual index keys from btree indexes, thus basically killing the usefulness of that idea. I'm not convinced it would offer enough gain to be worth paying that price. Another issue is that we'd need to check how much of the use-case for GIT has been taken over by HOT. * "Thick" indexes containing copies of tuple visibility info. As far as I'm concerned, this patch isn't going anywhere :-(. It's got the same showstopper problem as retail vacuum and some of the earlier forms of the HOT patch: it assumes that during tuple update or delete, it can re-find the tuple's index entries by re-computing the indexed expressions. That's just not reliable enough. Is that a reasonable summary of where we are? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers