On Wed, Mar 6, 2019 at 10:15 PM Heikki Linnakangas <hlinn...@iki.fi> wrote:
> I made a copy of _bt_binsrch, _bt_binsrch_insert. It does the binary
> search like _bt_binsrch does, but the bounds caching is only done in
> _bt_binsrch_insert. Seems more clear to have separate functions for them
> now, even though there's some duplication.
> /*
>  * Do the insertion. First move right to find the correct page to
>  * insert to, if necessary. If we're inserting to a non-unique index,
>  * _bt_search() already did this when it checked if a move to the
>  * right was required for leaf page. Insertion scankey's scantid
>  * would have been filled out at the time. On a unique index, the
>  * current buffer is the first buffer containing duplicates, however,
>  * so we may need to move right to the correct location for this
>  * tuple.
>  */
> if (checkingunique || itup_key->heapkeyspace)
>     _bt_findinsertpage(rel, &insertstate, stack, heapRel);
>
> newitemoff = _bt_binsrch_insert(rel, &insertstate);
>
> The attached new version simplifies this, IMHO. The bounds and the
> current buffer go together in the same struct, so it's easier to keep
> track of whether the bounds are valid or not.

Now that you have a full understanding of how the negative infinity sentinel values work, and how page deletion's leaf page search and !heapkeyspace indexes need to be considered, I think that we should come back to this _bt_binsrch()/_bt_findsplitloc() stuff. My sense is that you now have a grasp of all the subtleties of the patch, including those that affect unique index insertion. That will make it much easier to talk about these unresolved questions.

My current sense is that it isn't useful to store the current buffer alongside the binary search bounds/hint. It'll hardly ever need to be invalidated, because we'll hardly ever have to move right within _bt_findsplitloc() when doing unique index insertion (as I said before, the regression tests *never* have to do this according to gcov). We're talking about a very specific set of conditions here, so I'd like something that's lightweight and specialized. I agree that the savebinsrch/restorebinsrch fields are a bit ugly, though. I can't think of anything better offhand.
Perhaps you can suggest something that is both lightweight and an improvement on savebinsrch/restorebinsrch. I'm of the opinion that having a separate _bt_binsrch_insert() does not make anything clearer. Actually, I think that saving the bounds within the original _bt_binsrch() makes the design of that function clearer, not less clear. It's all quite confusing at the moment, given the rightmost/!leaf/page-empty special cases. Seeing how the bounds are reused (or not reused) outside of _bt_binsrch() helps with that.

The first 3 patches seem committable now, but I think that it's important to be sure that I've addressed everything you raised satisfactorily before pushing -- or at least that everything works in a way that you can live with. It would be great if you could take a look at the 'Add high key "continuescan" optimization' patch, which is the only one you haven't commented on so far (excluding the amcheck "relocate" patch, which is less important). I can put that one off for a while after the first 3 go in. I will also put off the "split after new item" commit for at least a week or two.

I'm sure that the idea behind the "continuescan" patch will now seem pretty obvious to you -- it's just taking advantage of the high key when an index scan on the leaf level (which uses a search-style scankey, not an insertion-style scankey) looks like it may have to move to the next leaf page, but we'd like to avoid that where possible. Checking the high key there is now much more likely to result in the index scan not going to the next page, since we're more careful when considering a leaf split point these days. The high key often looks like the items on the page to the right, not the items on the same page.

Thanks
--
Peter Geoghegan