On Thu, 26 Feb 2026 at 22:03, Madhav Madhusoodanan
<[email protected]> wrote:
>
> On Tue, Aug 26, 2025 at 2:11 PM Kirk Wolak <[email protected]> wrote:
> > I do have a question, one of the IDEAS we discussed was to ADD a new page 
> > that combined the 2 pages.
>
> Would the flow then be as follows? Please correct me if I'm wrong:
> Start: Parent page P, with adjacent child pages A -> B -> C -> D.
> Pages B and C are sparse enough and are about to be merged.
> 1: Acquire lock on pages B and C
> 2: Create a new page N, which copies the tuples in pages B and C
> 3: Acquire lock on parent page P, update the separator keys in P,
> release lock on P
> 4: Update pointers such that pages link like so: A -> N -> D
> 5: Release lock on pages B and C

That is one part of the picture (the merging part), but it's missing a
lot of details:
- How do concurrent workloads (index scans, backwards index scans,
index insertions) detect and recover from concurrent merges when they
step through pages?
- Do you have a theory or proof of correctness for the above?
- Can this scheme be implemented without adding significant overhead
to existing workloads that don't benefit from the new feature?

One example of a problem with the given flow: where (and how) do you
update the sibling pointers in pages A (to point to N instead of B)
and D (to point to N instead of C)?
You haven't explicitly locked pages A and D by the time you get to
step 4, even though locking pages is critical to guarantee correct and
safe operations. Simply locking them in step 4 isn't sufficient, as
another process may have locked page A in preparation for a split,
which would then be waiting to get a lock on page B. If you've already
locked page B before locking page A, you'll have a deadlock.

Additionally, how does this work when you find out in step 3 that
pages B and C are each linked to from different parent pages? You'd
have to update both parent pages, which would require locking more
than the one parent page per level that currently gets locked during
page deletion.


Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)

