Re: [WIP] [B-Tree] Retail IndexTuple deletion

Andrey V. Lepikhov Mon, 25 Jun 2018 11:21:03 -0700



On 23.06.2018 01:14, Peter Geoghegan wrote:

On Fri, Jun 22, 2018 at 12:43 PM, Peter Geoghegan <[email protected]> wrote:

On Fri, Jun 22, 2018 at 4:24 AM, Andrey V. Lepikhov
<[email protected]> wrote:

According to your feedback, i develop second version of the patch.
In this version:
1. High-level functions index_beginscan(), index_rescan() not used. Tree
descent made by _bt_search(). _bt_binsrch() used for positioning on the
page.
2. TID list introduced in amtargetdelete() interface. Now only one tree
descent needed for deletion all tid's from the list with equal scan key
value - logical duplicates deletion problem.
3. Only one WAL record for index tuple deletion per leaf page per
amtargetdelete() call.


Cool.

What is this "race" code about?

I introduce this check because keep in mind about another vacuumworkers, which can make cleaning a relation concurrently. may be it isredundant.

I noticed another bug in your patch, when running a
"wal_consistency_checking=all" smoke test. I do this simple, generic
test for anything that touches WAL-logging, actually -- it's a good
practice to adopt.

I enable "wal_consistency_checking=all" on the installation, create a
streaming replica with pg_basebackup (which also has
"wal_consistency_checking=all"), and then run "make installcheck"
against the primary. Here is what I see on the standby when I do this
with v2 of your patch applied:

9524/2018-06-22 13:03:12 PDT LOG:  entering standby mode
9524/2018-06-22 13:03:12 PDT LOG:  consistent recovery state reached
at 0/30000D0
9524/2018-06-22 13:03:12 PDT LOG:  invalid record length at 0/30000D0:
wanted 24, got 0
9523/2018-06-22 13:03:12 PDT LOG:  database system is ready to accept
read only connections
9528/2018-06-22 13:03:12 PDT LOG:  started streaming WAL from primary
at 0/3000000 on timeline 1
9524/2018-06-22 13:03:12 PDT LOG:  redo starts at 0/30000D0
9524/2018-06-22 13:03:32 PDT FATAL:  inconsistent page found, rel
1663/16384/1259, forknum 0, blkno 0
9524/2018-06-22 13:03:32 PDT CONTEXT:  WAL redo at 0/3360B00 for
Heap2/CLEAN: remxid 599
9523/2018-06-22 13:03:32 PDT LOG:  startup process (PID 9524) exited
with exit code 1
9523/2018-06-22 13:03:32 PDT LOG:  terminating any other active server processes
9523/2018-06-22 13:03:32 PDT LOG:  database system is shut down

I haven't investigated this at all, but I assume that the problem is a
simple oversight. The new ItemIdSetDeadRedirect() concept that you've
introduced probably necessitates changes in both the WAL logging
routines and the redo/recovery routines. You need to go make those
changes. (By the way, I don't think you should be using the constant
"3" with the ItemIdIsDeadRedirection() macro definition.)

Let me know if you get stuck on this, or need more direction.

I was investigated the bug of the simple smoke test. You're right: makeany manipulations with line pointer in heap_page_prune() withoutreflection in WAL record is no good idea.But this consistency problem arises even on clean PostgreSQLinstallation (without my patch) with ItemIdSetDead() -> ItemIdMarkDead()replacement.Byte-by-byte comparison of master and replay pages shows only 2 bytesdifference in the tuple storage part of page.

I don't stuck on yet, but good ideas are welcome.

--
Andrey Lepikhov
Postgres Professional:
https://postgrespro.com
The Russian Postgres Company

Re: [WIP] [B-Tree] Retail IndexTuple deletion

Reply via email to