On Fri, Aug 4, 2023 at 4:34 PM Tomas Vondra <tomas.von...@enterprisedb.com> wrote: > Thanks for the clarification, I think I understand better now.
Honestly, it's gratifying to be understood at all in a discussion like this one. Figuring out how to articulate some of my more subtle ideas (without leaving out important nuances) can be a real struggle for me. > Let me briefly sum my understanding for the two patches: > > - The SAOP patch eliminates those heap accesses because it manages to > evaluate all clauses in the AM, including clauses that would previously > be treated as "table filters" and evaluated on the heap row. Yes, exactly. > - My patch achieves a similar result by evaluating the clauses as index > filters (i.e. on the index tuple). That's probably not as good as proper > access predicates, so it can't help with the index page accesses, but > better than what we had before. Yes, exactly. > No, the extra accesses were not because of VM buffer hits - it was > because of having to actually fetch the heap tuple for pages that are > not fully visible, which is what happens right after the insert. Yes, exactly. > The patch does what we index-only scans do - before evaluating the > filters on an index tuple, we check if the page is fully visible. If > not, we fetch the heap tuple and evaluate the filters on it. > > This means even an index-only scan would behave like this too. And it > goes away as the table gets vacuumed, at which point we can eliminate > the rows using only the index tuple again. Yes, exactly. > Right. This however begs a question - why would we actually need to > check the visibility map before evaluating the index filter, when the > index tuple alone is clearly good enough for the bitmapOr plan? > > Because if we didn't need to do that VM check, this issue with extra > heap accesses would disappear. The index AM is entitled to make certain assumptions of opclass members -- assumptions that cannot be made during expression evaluation. The classic example is division-by-zero during evaluation of a qual, for a tuple that wasn't visible anyway. Our assumption is that stuff like that just cannot happen with index quals -- users shouldn't ever encounter sanity check errors caused by invisible-to-their-MVCC-snapshot tuples. I think that that's the main difficulty, as far as avoiding heap access for index filters is concerned. Of course, you could just limit yourself to those cases where the index AM assumptions were safe. But at that point, why not simply make sure to generate true index quals, and be done with it? > I copied this from the IOS somewhat blindly, but now I'm starting to > think it was misguided. I thought it's a protection against processing > "invalid" tuples - not tuples broken after a crash (as that would be > fixed by crash recovery), but e.g. tuples with schema changes from an > aborted transaction. I agree that schema changes for indexes shouldn't be an issue, though. > I'm not quite sure what are the differences between "index qual" vs. > "access predicate index qual" vs. "index filter predicate index quals", > or what "dispacing" would mean exactly. For the purposes of this point about "a hierarchy of quals", it doesn't really matter -- that was the point I was trying to make. In other words, "index quals" are strictly better than equivalent "index filters", which are themselves strictly better than equivalent "table filters". While it is also true that you can meaningfully classify "index quals" into their own hierarchy (namely access predicates vs index filter predicates), that doesn't necessarily need to be discussed when discussing the hierarchy from a planner point of view, since it is (at least for now) internal to the nbtree index AM. On second thought, I tend to doubt that your patch needs to worry about each type of index qual directly. It probably needs to worry about index quals in general. It is always better to make what could be an "index filter" into an index qual. Of course, the current practical problem for you is figuring out how to deal with that in cases like the BitmapOr case. Since it's not as if the planner is wrong, exactly...it really is the cheapest plan, so the planner is at least right on its own terms. I am interested in finding a solution to that problem. > FWIW this also reminds me that this whole discussion mostly focused on > SAOP clauses (and how they may be translated into access predicates > etc.). My patch is however way more general - it applies to all clauses, > not just SAOP ones, including clauses with no chance of evaluating at > the AM level. I agree that your patch is way more general, in one way. But the SAOP patch is also much more general than you might think. For one thing the whole BitmapOr plan issue actually required that you compared your patch to a combination of my SAOP patch and Alena Rybakina's OR-to-SAOP transformation patch -- you needed both patches. Her patch effectively made my own patch much more general. But there are all kinds of other transformations that might further extend the applicability of nbtree executor changes from my patch -- the MDAM techniques are certainly not limited to ORs/SAOPs. For example, it's easy to imagine inventing a new type of SAOP-style clause that represented "BETWEEN x AND y" expressions, that would make range predicates into "first class indexable operators" -- ScalarRangeOprExpr, or something. With multi-column indexes, these ScalarRangeOprExpr clauses could be composed beside ScalarArrayOpExpr clauses, as well as simpler clauses -- all while preserving index order on output. So there are quite a few plan shapes that aren't possible at all right now, that become possible, many of which don't even have SAOPs/ORs. Of course it won't ever be possible to create a transformation that doesn't ultimately flatten everything into MDAM style "single value" DNF predicates, which have to use simple B-Tree opclass operators -- obviously there are fundamental limits to it. So even in a perfect world, with every possible MDAM-ish transformation implemented, we'll still have a significant need for your patch. -- Peter Geoghegan