> On Wed, Jan 22, 2020 at 07:50:30AM +0000, Floris Van Nee wrote:
>
> Anyone please correct me if I'm wrong, but I think one case where the current
> patch relies on some data from the page it has locked before it in checking
> this hi/lo key. I think it's possible for the following sequence to happen.
> Suppose we have a very simple one leaf-page btree containing four elements:
> leaf page 1 = [2,4,6,8]
> We do a backwards index skip scan on this and have just returned our first
> tuple (8). The buffer is left pinned but unlocked. Now, someone else comes in
> and inserts a tuple (value 5) into this page, but suppose the page happens to
> be full. So a page split occurs. As far as I know, a page split could happen
> at any random element in the page. One of the situations we could be left
> with is:
> Leaf page 1 = [2,4]
> Leaf page 2 = [5,6,8]
> However, our scan is still pointing to leaf page 1.

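To make the scenario concrete, here is a toy Python model of it. This is not
nbtree code: `LeafPage`, `split_and_insert` and `key_may_be_on_page` are
made-up names, and the model assumes that after a split the left page's high
key equals the first key of the new right page and acts as a strict upper
bound. It only illustrates why a high key check on the remembered page would
notice that the last returned key (8) no longer belongs there.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LeafPage:
    keys: List[int]
    # None models the rightmost page, which has no upper bound (assumption).
    high_key: Optional[int] = None


def split_and_insert(page: LeafPage, new_key: int,
                     split_at: int) -> Tuple[LeafPage, LeafPage]:
    """Insert new_key, then split the sorted keys at index split_at."""
    keys = sorted(page.keys + [new_key])
    right = LeafPage(keys[split_at:], page.high_key)
    # Toy rule: the left page's high key is the first key on the right page.
    left = LeafPage(keys[:split_at], right.keys[0])
    return left, right


def key_may_be_on_page(page: LeafPage, key: int) -> bool:
    """High key check: can `key` still live on this page?"""
    return page.high_key is None or key < page.high_key


page = LeafPage([2, 4, 6, 8])          # scan has just returned 8 from here
left, right = split_and_insert(page, 5, split_at=2)
print(left.keys, right.keys)           # [2, 4] [5, 6, 8]
print(key_may_be_on_page(left, 8))     # False: the scan must not trust page 1
```
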
If we have just returned a tuple, the next action is either to check the next
page for another key or to search down the tree again. Maybe I'm missing
something in your scenario, but the latter will land us on the required page
(we do not point to any leaf at that point), and before the former there is a
check of the high/low key. Is there anything else missing?

> Now that I look at the patch again, I fear there currently may also be such a
> dependency in the "Advance forward but read backward"-case. It saves the
> offset number of a tuple in a variable, then does a _bt_search (releasing the
> lock and pin on the page). At this point, anything can happen to the tuples
> on this page - the page may be compacted by vacuum such that the offset
> number you have in your variable does not match the actual offset number of
> the tuple on the page anymore. Then, at the check for (nextOffset ==
> startOffset) later, there's a possibility the offsets are different even
> though they relate to the same tuple.

Interesting point. The original idea here was to check that we have not
returned to the same position after jumping, so maybe instead of comparing
offsets we can compare the tuple we found.
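
A rough illustration of the difference, again as a toy Python model rather
than anything from the patch. The page is a plain list of heap TIDs, offsets
are 1-based positions in that list, and `offset_of` and `compact` are
hypothetical helpers. Using the heap TID as the tuple's identity is an
assumption made only for this sketch.

```python
from typing import List, Set, Tuple

Tid = Tuple[int, int]  # (heap block, heap offset), used here as tuple identity


def offset_of(page: List[Tid], tid: Tid) -> int:
    """1-based position of the index tuple pointing at `tid`."""
    return page.index(tid) + 1


def compact(page: List[Tid], dead: Set[Tid]) -> List[Tid]:
    """Toy stand-in for vacuum: drop dead items, shifting later offsets."""
    return [t for t in page if t not in dead]


page = [(7, 1), (7, 2), (7, 3), (7, 4)]
start_tid = (7, 3)
start_offset = offset_of(page, start_tid)        # remembered before the jump

# While the page is neither locked nor pinned, an earlier item is removed.
page_after = compact(page, {(7, 1)})
next_offset = offset_of(page_after, start_tid)   # where the jump lands us

print(next_offset == start_offset)               # False: offsets differ
print(page_after[next_offset - 1] == start_tid)  # True: same tuple
```

Comparing offsets reports a different position although we are back at the
same tuple, while comparing the tuple itself does not depend on what happened
to the page in between.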