On 03.07.2018 00:40, Peter Geoghegan wrote:
On Mon, Jul 2, 2018 at 9:28 AM, Peter Geoghegan <p...@bowt.ie> wrote:
Execution time of the last "VACUUM test;" command on my notebook was:
with bulk deletion: 1.6 s;
with Quick Vacuum Strategy: 5.2 s;
with Quick Vacuum Strategy & TID sorting: 0.6 s.
I'm glad that you looked into this. You could make this faster still,
by actually passing the lowest heap TID in the list of TIDs to kill to
_bt_search() and _bt_binsrch(). You might have to work through several
extra B-Tree leaf pages per bttargetdelete() call without this (you'll
move right multiple times within bttargetdelete()).
I should add: I think that this doesn't matter so much in your
original test case with v1 of my patch, because you're naturally
accessing the index tuples in almost the most efficient way already,
since your VACUUM works its way from the start of the table to the
end of the table. You'll definitely need to pass a heap TID to
routines like _bt_search() once you start using my v2, though, since
that puts the heap TIDs in DESC sort order. Otherwise, it'll be almost
as slow as the plain "Quick Vacuum Strategy" case was.
In general, the big idea with my patch is that heap TID is just
another attribute. I am not "cheating" in any way; if it's not
possible to descend the tree and arrive at the right leaf page without
looking through several leaf pages, then my patch is broken.
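The "heap TID is just another attribute" idea can be sketched as a comparison rule. This is a minimal illustration only: DemoTuple and demo_compare are hypothetical names, a single int key stands in for arbitrary key columns, and the real patch works on IndexTuple/ItemPointerData inside nbtree.

```c
#include <stdint.h>

/* Hypothetical miniature of an index entry: one user key column plus the
 * heap TID, encoded as (block, offset). */
typedef struct DemoTuple
{
	int32_t		key;		/* user-visible attribute */
	uint32_t	block;		/* heap TID: block number */
	uint16_t	offset;		/* heap TID: offset number */
} DemoTuple;

/* Compare as if the heap TID were just another (last, implicit) key
 * column.  Under this total order, a descent that supplies the target's
 * TID can land directly on the one leaf page able to hold it, instead of
 * on the first page of a long duplicate run. */
static int
demo_compare(const DemoTuple *a, const DemoTuple *b)
{
	if (a->key != b->key)
		return (a->key < b->key) ? -1 : 1;
	if (a->block != b->block)
		return (a->block < b->block) ? -1 : 1;
	if (a->offset != b->offset)
		return (a->offset < b->offset) ? -1 : 1;
	return 0;
}
```

With a comparator like this, duplicates of the same key are totally ordered by TID, which is what makes a TID-aware _bt_search()/_bt_binsrch() descent well defined.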
You might also use _bt_moveright() with my patch. That way, you can
quickly detect that you need to move right immediately, without going
through all the items on the page. This should only be an issue in the
event of a concurrent page split, though. In my patch, I use
_bt_moveright() in a special way for unique indexes: I need to start
at the first leaf page a duplicate could be on for duplicate checking,
but once that's over I want to "jump immediately" to the leaf page the
index tuple actually needs to be inserted on. That's when
_bt_moveright() is called. (Actually, that looks like it breaks unique
index enforcement in the case of my patch, which I need to fix, but
you might still do this.)
Done.
The attachment contains an update for use with v2 of the 'Ensure nbtree
leaf tuple keys are always unique' patch.
Apply order:
1. 0001-Retail-IndexTuple-Deletion-Access-Method.patch - from previous email
2. 0002-Quick-vacuum-strategy.patch - from previous email
3. v2-0001-Ensure-nbtree-leaf-tuple-keys-are-always-unique.patch - from [1]
4. 0004-Retail-IndexTuple-Deletion-with-TID-sorting-in-leaf.patch
[1]
https://www.postgresql.org/message-id/CAH2-Wzm6D%3DKnV%2BP8bZE-ZtP4e%2BW64HtVTdOenqd1d7HjJL3xZQ%40mail.gmail.com
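The merge that 0004 performs inside bttargetdelete() can be sketched as follows. Index entries for one key value arrive in DESC heap-TID order (per the "always unique" patch), while the dead-tuple TID list is sorted ASC and consumed from its tail. Plain ints stand in for ItemPointerData here, and merge_mark is an illustrative name, not a symbol from the patch.

```c
#include <stddef.h>

/* index_tids: TIDs of index entries under one key, in DESC order.
 * dead_tids:  dead-tuple TID list, in ASC order; pos indexes its tail.
 * marked:     out array of matched index-entry positions.
 * Returns the number of matches. */
static int
merge_mark(const int *index_tids, int nindex,
		   const int *dead_tids, int pos,
		   int *marked)
{
	int			nmarked = 0;

	for (int i = 0; i < nindex && pos >= 0; i++)
	{
		while (pos >= 0)
		{
			if (index_tids[i] == dead_tids[pos])
			{
				marked[nmarked++] = i;	/* index entry is deletable */
				pos--;
			}
			else if (index_tids[i] > dead_tids[pos])
				break;			/* no match here; try next index entry */
			else
				pos--;			/* this dead TID has no entry under this key */
		}
	}
	return nmarked;
}
```

Both cursors only ever move downward, so each element of either list is visited once: the per-entry tid_list_search() probe of the earlier version becomes a single O(N+M) merge, which is consistent with the timings quoted at the top of the thread.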
--
Andrey Lepikhov
Postgres Professional:
https://postgrespro.com
The Russian Postgres Company
From 1c8569abe9479e547911ec3079633f79056eff96 Mon Sep 17 00:00:00 2001
From: "Andrey V. Lepikhov" <a.lepik...@postgrespro.ru>
Date: Tue, 3 Jul 2018 16:54:46 +0500
Subject: [PATCH 4/4] Retail-IndexTuple-Deletion-with-TID-sorting-in-leaf
---
src/backend/access/nbtree/nbtree.c | 75 +++++++++++++++++++++++++-------------
src/backend/commands/vacuumlazy.c | 8 ++--
src/include/access/genam.h | 2 +-
3 files changed, 55 insertions(+), 30 deletions(-)
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index c54aeac..7c617e9 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -887,15 +887,19 @@ btbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
return stats;
}
-static int
-tid_list_search(ItemPointer tid, ItemPointer tid_list, int ntid)
-{
- for (int i = 0; i < ntid; i++)
- if (ItemPointerEquals(tid, &(tid_list[i])))
- return i;
- return -1;
-}
-
+/*
+ * Deletion of index entries pointing to heap tuples.
+ *
+ * Constraints:
+ * 1. The TID list info->dead_tuples is arranged in ASC order.
+ * 2. Logical duplicates of index tuples are stored in DESC order.
+ *
+ * The function builds an insertion scan key and descends the btree to the
+ * first index tuple satisfying the scan key and the last TID in
+ * info->dead_tuples; it then deletes all index entries matched by the list.
+ *
+ * Result: a palloc'd struct containing statistical info.
+ */
IndexTargetDeleteResult*
bttargetdelete(IndexTargetDeleteInfo *info,
IndexTargetDeleteResult *stats,
@@ -914,20 +918,21 @@ bttargetdelete(IndexTargetDeleteInfo *info,
int ndeletable = 0;
OffsetNumber deletable[MaxOffsetNumber];
IndexTuple itup;
+ int pos = info->last_dead_tuple;
if (stats == NULL)
stats = (IndexTargetDeleteResult *) palloc0(sizeof(IndexTargetDeleteResult));
+ /* Assemble scankey */
itup = index_form_tuple(RelationGetDescr(irel), values, isnull);
skey = _bt_mkscankey(irel, itup);
/* Descend the tree and position ourselves on the target leaf page. */
- stack = _bt_search(irel, keysCount, skey, false, &buf, BT_READ, NULL);
- _bt_freestack(stack);
+ stack = _bt_search(irel, keysCount, skey, &info->dead_tuples[pos], false, &buf, BT_READ, NULL);
/* To prepare tuple entries search across index pages */
Assert(BufferIsValid(buf));
- offnum = _bt_binsrch(irel, buf, keysCount, skey, false);
+ offnum = _bt_binsrch(irel, buf, keysCount, skey, &info->dead_tuples[pos], false);
page = BufferGetPage(buf);
_bt_checkpage(irel, buf);
opaque = (BTPageOpaque) PageGetSpecialPointer(page);
@@ -937,10 +942,9 @@ bttargetdelete(IndexTargetDeleteInfo *info,
int32 cmpval;
ItemId itemid;
IndexTuple itup;
- int pos;
/* Switch to the next page */
- if (P_IGNORE(opaque) || (offnum > PageGetMaxOffsetNumber(page)))
+ if (offnum > PageGetMaxOffsetNumber(page))
{
/*
* Before unlocking index page we need to delete
@@ -965,20 +969,27 @@ bttargetdelete(IndexTargetDeleteInfo *info,
break;
/*
- * Switch to the next index page
+	 * Move right to the next non-ignorable index page
*/
- buf = _bt_relandgetbuf(irel, buf, opaque->btpo_next, BT_READ);
+ buf = _bt_moveright(irel, buf, keysCount, skey, &info->dead_tuples[pos],
+ false, true, stack, BT_READ, NULL);
page = BufferGetPage(buf);
_bt_checkpage(irel, buf);
opaque = (BTPageOpaque) PageGetSpecialPointer(page);
- offnum = P_FIRSTDATAKEY(opaque);
- continue;
+ Assert(!P_IGNORE(opaque));
+ /* Set offnum to first potentially interesting item */
+ offnum = _bt_binsrch(irel, buf, keysCount, skey, &info->dead_tuples[pos], false);
+
+ if (offnum > PageGetMaxOffsetNumber(page))
+ break;
+ else
+ continue;
}
/*
 * Does this index entry satisfy the scan key?
*/
- cmpval = _bt_compare(irel, keysCount, skey, page, offnum);
+ cmpval = _bt_compare(irel, keysCount, skey, NULL, page, offnum);
if (cmpval != 0)
/* No more index entries satisfying the scan key */
@@ -990,15 +1001,28 @@ bttargetdelete(IndexTargetDeleteInfo *info,
*/
itemid = PageGetItemId(page, offnum);
itup = (IndexTuple) PageGetItem(page, itemid);
- pos = tid_list_search(&(itup->t_tid), info->dead_tuples, info->num_dead_tuples);
- if ((pos >= 0) && (!info->found_dead_tuples[pos]))
+	/*
+	 * Merge the presorted btree scan results against the presorted
+	 * dead_tuples TID list, consuming the list from its tail
+	 */
+ while (pos >= 0)
{
- /* index entry for TID of dead tuple is found */
- deletable[ndeletable++] = offnum;
- info->found_dead_tuples[pos] = true;
+ int res = ItemPointerCompare(&(itup->t_tid), &info->dead_tuples[pos]);
+ if ((res == 0) && (!info->found_dead_tuples[pos]))
+ {
+ /* index entry for TID of dead tuple is found */
+ deletable[ndeletable++] = offnum;
+ info->found_dead_tuples[pos] = true;
+ }
+ else if (res > 0)
+ break;
+ pos--;
}
+ if (pos < 0)
+ break;
+
offnum = OffsetNumberNext(offnum);
}
@@ -1015,7 +1039,8 @@ bttargetdelete(IndexTargetDeleteInfo *info,
stats->tuples_removed += ndeletable;
}
- /* Release scan key, unpin and unlock buffer */
+ /* Release stack, scan key, unpin and unlock buffer */
+ _bt_freestack(stack);
_bt_freeskey(skey);
_bt_relbuf(irel, buf);
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 0bb8abb..2fca7be 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -1770,7 +1770,7 @@ quick_vacuum_index(Relation irel, Relation hrel,
econtext->ecxt_scantuple = slot;
/* Get tuple from heap */
- for (tnum = 0; tnum < vacrelstats->num_dead_tuples; tnum++)
+ for (tnum = vacrelstats->num_dead_tuples-1; tnum >= 0; tnum--)
{
HeapTuple tuple;
Datum values[INDEX_MAX_KEYS];
@@ -1814,9 +1814,9 @@ quick_vacuum_index(Relation irel, Relation hrel,
 * Attempt to delete a batch of index entries in one tree descent.
 * We use only the part of the TID list that contains not-yet-found TIDs.
*/
- ivinfo.dead_tuples = &(vacrelstats->dead_tuples[tnum]);
- ivinfo.num_dead_tuples = vacrelstats->num_dead_tuples - tnum;
- ivinfo.found_dead_tuples = found + tnum;
+ ivinfo.dead_tuples = vacrelstats->dead_tuples;
+ ivinfo.last_dead_tuple = tnum;
+ ivinfo.found_dead_tuples = found;
index_target_delete(&ivinfo, &stats, values, isnull);
}
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
index 79b386e..297d357 100644
--- a/src/include/access/genam.h
+++ b/src/include/access/genam.h
@@ -55,7 +55,7 @@ typedef struct IndexTargetDeleteInfo
{
Relation heapRelation;
Relation indexRelation; /* the index being vacuumed */
- int num_dead_tuples;
+ int last_dead_tuple;
ItemPointer dead_tuples;
bool* found_dead_tuples;
} IndexTargetDeleteInfo;
--
2.7.4