Re: [WIP] [B-Tree] Retail IndexTuple deletion

Peter Geoghegan Fri, 22 Jun 2018 13:15:02 -0700

On Fri, Jun 22, 2018 at 12:43 PM, Peter Geoghegan <p...@bowt.ie> wrote:
> On Fri, Jun 22, 2018 at 4:24 AM, Andrey V. Lepikhov
> <a.lepik...@postgrespro.ru> wrote:
>> Based on your feedback, I have developed a second version of the patch.
>> In this version:
>> 1. The high-level functions index_beginscan() and index_rescan() are no
>> longer used. The tree descent is done by _bt_search(), and _bt_binsrch()
>> is used for positioning on the page.
>> 2. A TID list was introduced in the amtargetdelete() interface. Now only
>> one tree descent is needed to delete all TIDs in the list that share an
>> equal scan key value (the logical duplicates deletion problem).
>> 3. Only one WAL record is written for index tuple deletion per leaf page
>> per amtargetdelete() call.
>
> Cool.
>
> What is this "race" code about?


I noticed another bug in your patch, when running a
"wal_consistency_checking=all" smoke test. I do this simple, generic
test for anything that touches WAL-logging, actually -- it's a good
practice to adopt.

I enable "wal_consistency_checking=all" on the installation, create a
streaming replica with pg_basebackup (which also has
"wal_consistency_checking=all"), and then run "make installcheck"
against the primary. Here is what I see on the standby when I do this
with v2 of your patch applied:

9524/2018-06-22 13:03:12 PDT LOG:  entering standby mode
9524/2018-06-22 13:03:12 PDT LOG:  consistent recovery state reached at 0/30000D0
9524/2018-06-22 13:03:12 PDT LOG:  invalid record length at 0/30000D0: wanted 24, got 0
9523/2018-06-22 13:03:12 PDT LOG:  database system is ready to accept read only connections
9528/2018-06-22 13:03:12 PDT LOG:  started streaming WAL from primary at 0/3000000 on timeline 1
9524/2018-06-22 13:03:12 PDT LOG:  redo starts at 0/30000D0
9524/2018-06-22 13:03:32 PDT FATAL:  inconsistent page found, rel 1663/16384/1259, forknum 0, blkno 0
9524/2018-06-22 13:03:32 PDT CONTEXT:  WAL redo at 0/3360B00 for Heap2/CLEAN: remxid 599
9523/2018-06-22 13:03:32 PDT LOG:  startup process (PID 9524) exited with exit code 1
9523/2018-06-22 13:03:32 PDT LOG:  terminating any other active server processes
9523/2018-06-22 13:03:32 PDT LOG:  database system is shut down
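
For anyone who wants to repeat the smoke test described above, here is a
rough sketch of the steps as a Python helper. It only builds the command
lines and scans a standby log; it does not run anything itself. The ports,
the "standby_data" directory, the "standby.log" file name, and both
function names are assumptions made for illustration, not part of the
patch or of my actual setup.

```python
import re
import shlex


def smoke_test_commands(primary_port=5432, standby_port=5433,
                        standby_dir="standby_data"):
    """Return the smoke-test commands in order (nothing is executed).

    Ports and paths are illustrative assumptions.
    """
    psql = ["psql", "-p", str(primary_port), "-d", "postgres", "-c"]
    return [
        # Enable the check on the primary. ALTER SYSTEM writes to
        # postgresql.auto.conf, which the base backup copies to the standby.
        psql + ["ALTER SYSTEM SET wal_consistency_checking = 'all'"],
        psql + ["SELECT pg_reload_conf()"],
        # Clone the primary; -R writes the standby/recovery settings.
        ["pg_basebackup", "-p", str(primary_port), "-D", standby_dir,
         "-X", "stream", "-R"],
        # Start the standby on its own port, logging to a file.
        ["pg_ctl", "-D", standby_dir, "-l", "standby.log",
         "-o", f"-p {standby_port}", "start"],
        # Run the regression tests against the primary (from the source tree).
        ["env", f"PGPORT={primary_port}", "make", "installcheck"],
    ]


# Matches the FATAL message even when the mail client wrapped the line.
INCONSISTENT_PAGE = re.compile(
    r"inconsistent page found,\s+rel\s+(\d+)/(\d+)/(\d+),"
    r"\s+forknum\s+(\d+),\s+blkno\s+(\d+)"
)


def find_inconsistent_pages(log_text):
    """Return (tablespace, database, relfilenode, forknum, blkno) tuples."""
    return [tuple(int(g) for g in m.groups())
            for m in INCONSISTENT_PAGE.finditer(log_text)]


if __name__ == "__main__":
    for cmd in smoke_test_commands():
        print(shlex.join(cmd))
```

Once "make installcheck" finishes, passing the contents of the standby log
to find_inconsistent_pages() should return an empty list if redo stayed
consistent; any tuple it returns identifies a page that failed the check.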

I haven't investigated this at all, but I assume that the problem is a
simple oversight. The new ItemIdSetDeadRedirect() concept that you've
introduced probably necessitates changes in both the WAL logging
routines and the redo/recovery routines. You need to go make those
changes. (By the way, I don't think you should hard-code the constant
"3" in the ItemIdIsDeadRedirection() macro definition.)

Let me know if you get stuck on this, or need more direction.

-- 
Peter Geoghegan
