On Fri, Aug 16, 2019 at 8:56 AM Anastasia Lubennikova
<a.lubennik...@postgrespro.ru> wrote:
> Now the algorithm is the following:
>
> - If _bt_findinsertloc() finds that the tuple belongs to an existing posting
> tuple's TID interval, it sets the 'in_posting_offset' variable and passes it
> to _bt_insertonpg()
>
> - If 'in_posting_offset' is valid and origtup is valid,
> merge our itup into origtup.
>
> It can result in one tuple, neworigtup, that must replace origtup; or in two
> tuples, neworigtup and newrighttup, if the result exceeds BTMaxItemSize.

That sounds like the right way to do it.
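
To make the one-tuple versus two-tuple outcome concrete, here is a toy
model in Python. It is only a sketch, not code from the patch: TIDs are
modelled as (block, offset) pairs, MAX_TIDS is a hypothetical stand-in
for the BTMaxItemSize limit, and splitting at the midpoint is an
assumption made for illustration.

```python
from bisect import insort

MAX_TIDS = 4  # hypothetical cap standing in for BTMaxItemSize


def merge_into_posting(orig_tids, new_tid, max_tids=MAX_TIDS):
    """Merge new_tid into a sorted posting list.

    Returns (neworigtup, newrighttup); newrighttup is None when the
    merged list still fits into a single tuple.
    """
    merged = list(orig_tids)
    insort(merged, new_tid)           # keep the posting list sorted by TID
    if len(merged) <= max_tids:
        return merged, None           # one tuple replaces origtup
    mid = len(merged) // 2            # too big: split into two tuples
    return merged[:mid], merged[mid:]
```

With a cap of 4, inserting (1, 3) into a full list of four TIDs yields
two lists, while inserting into a shorter list yields a single
replacement list.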

> - If the new tuple(s) fit into the old page, we're lucky:
> call _bt_delete_and_insert(..., neworigtup, newrighttup, newitemoff) to
> atomically replace oldtup with the new tuple(s) and generate an xlog record.
>
> - If a page split is needed, pass both tuples to _bt_split().
> _bt_findsplitloc() is now aware of the upcoming replacement of origtup with
> neworigtup, so it uses the correct item size where needed.

That makes sense, since _bt_split() is responsible for both splitting
the page, and inserting the new item on either the left or right page,
as part of the first phase of a page split. In other words, if you're
adding something new to _bt_insertonpg(), you probably also need to
add something new to _bt_split(). So that's what you did.
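
The space accounting behind the "fits on the page" versus "needs a
split" decision can be sketched as follows. This is a simplified
Python model under stated assumptions, not the nbtree logic: the
function name and parameters are hypothetical, sizes are plain byte
counts, and line pointer overhead is ignored.

```python
def choose_action(page_free, orig_size, new_sizes):
    """Decide between replacing in place and splitting the page.

    Replacing origtup gives back orig_size bytes, so only the net
    growth has to fit into the free space that is already there.
    """
    net_growth = sum(new_sizes) - orig_size
    if net_growth <= page_free:
        return "replace"   # new tuple(s) fit on the old page
    return "split"         # hand both tuples to the split path
```

The point of the sketch is that the size of the tuple being replaced
has to be subtracted before choosing a split point, which is the
adjustment described for _bt_findsplitloc() above.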

> It seems that now all replace operations are crash-safe. The new patch passes
> all regression tests, so I think it's ready for review again.

I'm looking at it now. I'm going to spend a significant amount of time
on this tomorrow.

I think that we should start to think about efficient WAL-logging now.

> In the meantime, I'll run more stress-tests.

As you probably realize, wal_consistency_checking is a good thing to
use with your tests here.
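
For example, a test setup might use something like the fragment below.
The checks run when WAL is replayed, so this assumes a standby (or
crash recovery) is replaying the WAL generated by the tests.

```
# postgresql.conf on the test primary
wal_consistency_checking = 'btree'   # or 'all' to cover every resource manager
```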

-- 
Peter Geoghegan

