On 07/03/2019 14:54, Peter Geoghegan wrote:
On Wed, Mar 6, 2019 at 10:15 PM Heikki Linnakangas <hlinn...@iki.fi> wrote:
After staring at the first patch for bit longer, a few things started to
bother me:

* The new struct is called BTScanInsert, but it's used for searches,
too. It makes sense when you read the README, which explains the
difference between "search scan keys" and "insertion scan keys", but now
that we have a separate struct for this, perhaps we call insertion scan
keys with a less confusing name. I don't know what to suggest, though.
"Positioning key"?

I think that insertion scan key is fine. It's been called that for
almost twenty years. It's not like it's an intuitive concept that
could be conveyed easily if only we came up with a new, pithy name.

Yeah. It's been like that forever, but I must confess I hadn't paid any attention to it, until now. I had not understood that text in the README explaining the difference between search and insertion scan keys, before looking at this patch. Not sure I ever read it with any thought. Now that I understand it, I don't like the "insertion scan key" name.

BTW, I would like to hear what you think of the idea of minusinfkey
(and the !minusinfkey optimization) specifically.

I don't understand it :-(. I guess that's valuable feedback on its own. I'll spend more time reading the code around that, but meanwhile, if you can think of a simpler way to explain it in the comments, that'd be good.

The new BTInsertStateData struct also
holds the current buffer we're considering to insert to, and a
'bounds_valid' flag to indicate if the saved bounds are valid for the
current buffer. That way, it's more straightforward to clear the
'bounds_valid' flag whenever we move right.

I'm not sure that that's an improvement. Moving right should be very
rare with my patch. gcov shows that we never move right here anymore
with the regression tests, or within _bt_check_unique() -- not once.

I haven't given performance much thought, really. I don't expect this method to be any slower, but the point of the refactoring is to make the code easier to understand.

/*
   * Do the insertion. First move right to find the correct page to
   * insert to, if necessary. If we're inserting to a non-unique index,
   * _bt_search() already did this when it checked if a move to the
   * right was required for leaf page.  Insertion scankey's scantid
   * would have been filled out at the time. On a unique index, the
   * current buffer is the first buffer containing duplicates, however,
   * so we may need to move right to the correct location for this
   * tuple.
   */
if (checkingunique || itup_key->heapkeyspace)
         _bt_findinsertpage(rel, &insertstate, stack, heapRel);

newitemoff = _bt_binsrch_insert(rel, &insertstate);

_bt_insertonpg(rel, insertstate.buf, InvalidBuffer, stack, itup,
newitemoff, false);

Does this make sense?

I guess you're saying this because you noticed that the for (;;) loop
in _bt_findinsertloc() doesn't do that much in many cases, because of
the fastpath.

The idea is that _bt_findinsertpage() would not need to know whether the unique checks were performed or not. I'd like to encapsulate all the information about the "insert position we're considering" in the BTInsertStateData struct. Passing 'checkingunique' as a separate argument violates that, because when it's set, the key means something slightly different.

Hmm. Actually, with patch #2, _bt_findinsertloc() could look at whether 'scantid' is set, instead of 'checkingunique'. That would seem better. If it looks like 'checkingunique', it's making the assumption that if unique checks were not performed, then we are already positioned on the correct page, according to the heap TID. But looking at 'scantid' seems like a more direct way of getting the same information. And then we won't need to pass the 'checkingunique' flag as an "out-of-band" argument.

So I'm specifically suggesting that we replace this, in _bt_findinsertloc:

                if (!checkingunique && itup_key->heapkeyspace)
                        break;

With this:

                if (itup_key->scantid)
                        break;

And remove the 'checkingunique' argument from _bt_findinsertloc.

- Heikki

Reply via email to