On 31.07.2020 23:28, Robert Haas wrote:
> On Tue, Jul 14, 2020 at 1:51 PM Anastasia Lubennikova
> <a.lubennik...@postgrespro.ru> wrote:
>> Questions from the first review pass:
>>
>> 1) Do we need XLH_INSERT_ALL_VISIBLE_SET? IIUC, in the patch it is always
>> implied by XLH_INSERT_ALL_FROZEN_SET.
> I agree that it looks unnecessary to have two separate bits.
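
Right - the attached version defines only XLH_INSERT_ALL_FROZEN_SET; on
the redo side it is treated as implying all-visible, see
heap_xlog_multi_insert():

    if (xlrec->flags & XLH_INSERT_ALL_FROZEN_SET)
        PageSetAllVisible(page);
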
>> 2) What does this comment mean? Does the XXX refer to the lsn
>> comparison? After all, it is definitely necessary to update the VM.
>>
>> + * XXX: This seems entirely unnecessary?
>> + *
>> + * FIXME: Theoretically we should only do this after we've
>> + * modified the heap - but it's safe to do it here I think,
>> + * because this means that the page previously was empty.
>> + */
>> + if (lsn > PageGetLSN(vmpage))
>> + visibilitymap_set(reln, blkno, InvalidBuffer, lsn, vmbuffer,
>> + InvalidTransactionId, vmbits);
> I wondered about that too. The comment which precedes it was, I
> believe, originally written by me, and copied here from
> heap_xlog_visible(). But it's not very good practice to just
> copy the comment like this. If the same logic applies, the code should
> say that we're doing the same thing here as in heap_xlog_visible() for
> consistency, or some such thing; after all, that's the primary place
> where that happens. But it looks like the XXX might have been added by
> a second person who thought that we didn't need this logic at all, and
> the FIXME by a third person who thought it was in the wrong place, so
> the whole thing is really confusing at this point.
> I'm pretty worried about this, too:
>
> + * FIXME: setting recptr here is a dirty dirty hack, to prevent
> + * visibilitymap_set() from WAL logging.
>
> That is indeed a dirty hack, and something needs to be done about it.
> I wonder if it was really all that smart to try to make the
> HEAP2_MULTI_INSERT record do this instead of just issuing separate WAL
> records to mark it all-visible afterwards, but I don't see any reason
> why this can't be made to work. It needs substantially more polishing
> than it's had, though, I think.
A new version of the patch is attached.

The new design is more conservative and simpler:
- pin the visibility map page in advance;
- set PageAllVisible;
- call visibilitymap_set() with its own XLogRecord, just like in other
places.

This allows us to remove most of the "hacks" and keep the code clean.
A condensed sketch of the resulting heap_multi_insert() flow is below.
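
Roughly (all identifiers as in the attached patch; tuple placement and
WAL details are elided, so this is an outline rather than compilable
code):

    while (ndone < ntuples)
    {
        /*
         * RelationGetBufferForTuple() now also pins the visibility map
         * page when HEAP_INSERT_FROZEN targets an empty page.
         */
        buffer = RelationGetBufferForTuple(relation,
                                           heaptuples[ndone]->t_len,
                                           InvalidBuffer, options,
                                           bistate, &vmbuffer, NULL);
        page = BufferGetPage(buffer);
        all_frozen_set = (PageGetMaxOffsetNumber(page) == 0) &&
            (options & HEAP_INSERT_FROZEN);

        START_CRIT_SECTION();
        /* ... place tuples; PageSetAllVisible() if all_frozen_set ... */
        /* ... XLogInsert() with XLH_INSERT_ALL_FROZEN_SET in flags ... */
        END_CRIT_SECTION();

        /*
         * Passing InvalidXLogRecPtr makes visibilitymap_set() emit its
         * own WAL record instead of piggybacking on the multi-insert
         * record as the previous version did. (The patch additionally
         * checks visibilitymap_pin_ok() first.)
         */
        if (all_frozen_set)
            visibilitymap_set(relation, BufferGetBlockNumber(buffer),
                              buffer, InvalidXLogRecPtr, vmbuffer,
                              InvalidTransactionId,
                              VISIBILITYMAP_ALL_VISIBLE |
                              VISIBILITYMAP_ALL_FROZEN);

        UnlockReleaseBuffer(buffer);
        ndone += nthispage;
    }

    /* Keep vmbuffer pinned across pages; release it once at the end. */
    if (vmbuffer != InvalidBuffer)
        ReleaseBuffer(vmbuffer);
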
The patch passes the tests added in previous versions.
I haven't tested performance yet, though. Depending on the benchmark
results, I may bring some of the optimizations back.
--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
commit 7a5dfaa525ba89b86663de417638cdcb30ed147b
Author: anastasia <a.lubennik...@postgrespro.ru>
Date: Sun Aug 2 15:28:51 2020 +0300
copy-freeze-vm_freeze_v1.patch
Set VM all_visible and all_frozen bits when COPY FREEZE inserts tuples into an empty page
diff --git a/contrib/pg_visibility/expected/pg_visibility.out b/contrib/pg_visibility/expected/pg_visibility.out
index ca4b6e186b..0017e3415c 100644
--- a/contrib/pg_visibility/expected/pg_visibility.out
+++ b/contrib/pg_visibility/expected/pg_visibility.out
@@ -179,6 +179,69 @@ select pg_truncate_visibility_map('test_partition');
(1 row)
+-- test copy freeze
+create table copyfreeze (a int, b char(1500));
+-- load all rows via COPY FREEZE and ensure that all pages are set all-visible
+-- and all-frozen.
+begin;
+truncate copyfreeze;
+copy copyfreeze from stdin freeze;
+commit;
+select * from pg_visibility_map('copyfreeze');
+ blkno | all_visible | all_frozen
+-------+-------------+------------
+ 0 | t | t
+ 1 | t | t
+ 2 | t | t
+(3 rows)
+
+select * from pg_check_frozen('copyfreeze');
+ t_ctid
+--------
+(0 rows)
+
+-- load half the rows via regular COPY and rest via COPY FREEZE. The pages
+-- which are touched by regular COPY must not be set all-visible/all-frozen. On
+-- the other hand, pages allocated by COPY FREEZE should be marked
+-- all-frozen/all-visible.
+begin;
+truncate copyfreeze;
+copy copyfreeze from stdin;
+copy copyfreeze from stdin freeze;
+commit;
+select * from pg_visibility_map('copyfreeze');
+ blkno | all_visible | all_frozen
+-------+-------------+------------
+ 0 | f | f
+ 1 | f | f
+ 2 | t | t
+(3 rows)
+
+select * from pg_check_frozen('copyfreeze');
+ t_ctid
+--------
+(0 rows)
+
+-- Try a mix of regular COPY and COPY FREEZE.
+begin;
+truncate copyfreeze;
+copy copyfreeze from stdin freeze;
+copy copyfreeze from stdin;
+copy copyfreeze from stdin freeze;
+commit;
+select * from pg_visibility_map('copyfreeze');
+ blkno | all_visible | all_frozen
+-------+-------------+------------
+ 0 | t | t
+ 1 | f | f
+ 2 | t | t
+(3 rows)
+
+select * from pg_check_frozen('copyfreeze');
+ t_ctid
+--------
+(0 rows)
+
-- cleanup
drop table test_partitioned;
drop view test_view;
@@ -188,3 +251,4 @@ drop server dummy_server;
drop foreign data wrapper dummy;
drop materialized view matview_visibility_test;
drop table regular_table;
+drop table copyfreeze;
diff --git a/contrib/pg_visibility/sql/pg_visibility.sql b/contrib/pg_visibility/sql/pg_visibility.sql
index f79b54480b..ec1afd4906 100644
--- a/contrib/pg_visibility/sql/pg_visibility.sql
+++ b/contrib/pg_visibility/sql/pg_visibility.sql
@@ -94,6 +94,82 @@ select count(*) > 0 from pg_visibility_map_summary('test_partition');
select * from pg_check_frozen('test_partition'); -- hopefully none
select pg_truncate_visibility_map('test_partition');
+-- test copy freeze
+create table copyfreeze (a int, b char(1500));
+
+-- load all rows via COPY FREEZE and ensure that all pages are set all-visible
+-- and all-frozen.
+begin;
+truncate copyfreeze;
+copy copyfreeze from stdin freeze;
+1 '1'
+2 '2'
+3 '3'
+4 '4'
+5 '5'
+6 '6'
+7 '7'
+8 '8'
+9 '9'
+10 '10'
+11 '11'
+12 '12'
+\.
+commit;
+select * from pg_visibility_map('copyfreeze');
+select * from pg_check_frozen('copyfreeze');
+
+-- load half the rows via regular COPY and rest via COPY FREEZE. The pages
+-- which are touched by regular COPY must not be set all-visible/all-frozen. On
+-- the other hand, pages allocated by COPY FREEZE should be marked
+-- all-frozen/all-visible.
+begin;
+truncate copyfreeze;
+copy copyfreeze from stdin;
+1 '1'
+2 '2'
+3 '3'
+4 '4'
+5 '5'
+6 '6'
+\.
+copy copyfreeze from stdin freeze;
+7 '7'
+8 '8'
+9 '9'
+10 '10'
+11 '11'
+12 '12'
+\.
+commit;
+select * from pg_visibility_map('copyfreeze');
+select * from pg_check_frozen('copyfreeze');
+
+-- Try a mix of regular COPY and COPY FREEZE.
+begin;
+truncate copyfreeze;
+copy copyfreeze from stdin freeze;
+1 '1'
+2 '2'
+3 '3'
+4 '4'
+5 '5'
+\.
+copy copyfreeze from stdin;
+6 '6'
+\.
+copy copyfreeze from stdin freeze;
+7 '7'
+8 '8'
+9 '9'
+10 '10'
+11 '11'
+12 '12'
+\.
+commit;
+select * from pg_visibility_map('copyfreeze');
+select * from pg_check_frozen('copyfreeze');
+
-- cleanup
drop table test_partitioned;
drop view test_view;
@@ -103,3 +179,4 @@ drop server dummy_server;
drop foreign data wrapper dummy;
drop materialized view matview_visibility_test;
drop table regular_table;
+drop table copyfreeze;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 2c9bb0c7ee..d6f9266bd7 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2084,6 +2084,7 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
int ndone;
PGAlignedBlock scratch;
Page page;
+ Buffer vmbuffer = InvalidBuffer;
bool needwal;
Size saveFreeSpace;
bool need_tuple_data = RelationIsLogicallyLogged(relation);
@@ -2138,8 +2139,9 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
while (ndone < ntuples)
{
Buffer buffer;
- Buffer vmbuffer = InvalidBuffer;
+ bool starting_with_empty_page;
bool all_visible_cleared = false;
+ bool all_frozen_set = false;
int nthispage;
CHECK_FOR_INTERRUPTS();
@@ -2147,12 +2149,20 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
/*
* Find buffer where at least the next tuple will fit. If the page is
* all-visible, this will also pin the requisite visibility map page.
+ *
+ * Also pin visibility map page if COPY FREEZE inserts tuples into an
+ * empty page. See all_frozen_set below.
*/
buffer = RelationGetBufferForTuple(relation, heaptuples[ndone]->t_len,
InvalidBuffer, options, bistate,
&vmbuffer, NULL);
page = BufferGetPage(buffer);
+ starting_with_empty_page = PageGetMaxOffsetNumber(page) == 0;
+
+ if (starting_with_empty_page && (options & HEAP_INSERT_FROZEN))
+ all_frozen_set = true;
+
/* NO EREPORT(ERROR) from here till changes are logged */
START_CRIT_SECTION();
@@ -2186,7 +2196,11 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
log_heap_new_cid(relation, heaptup);
}
- if (PageIsAllVisible(page))
+ /*
+ * If the page is all visible, we need to clear that bit, unless
+ * we're only going to add further frozen rows to it.
+ */
+ if (PageIsAllVisible(page) && !(options & HEAP_INSERT_FROZEN))
{
all_visible_cleared = true;
PageClearAllVisible(page);
@@ -2194,6 +2208,12 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
BufferGetBlockNumber(buffer),
vmbuffer, VISIBILITYMAP_VALID_BITS);
}
+ /*
+ * If we're only adding already frozen rows, and the page was
+ * previously empty, mark it as all-visible.
+ */
+ else if (all_frozen_set)
+ PageSetAllVisible(page);
/*
* XXX Should we set PageSetPrunable on this page ? See heap_insert()
@@ -2217,8 +2237,7 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
* If the page was previously empty, we can reinit the page
* instead of restoring the whole thing.
*/
- init = (ItemPointerGetOffsetNumber(&(heaptuples[ndone]->t_self)) == FirstOffsetNumber &&
- PageGetMaxOffsetNumber(page) == FirstOffsetNumber + nthispage - 1);
+ init = starting_with_empty_page;
/* allocate xl_heap_multi_insert struct from the scratch area */
xlrec = (xl_heap_multi_insert *) scratchptr;
@@ -2236,7 +2255,13 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
/* the rest of the scratch space is used for tuple data */
tupledata = scratchptr;
- xlrec->flags = all_visible_cleared ? XLH_INSERT_ALL_VISIBLE_CLEARED : 0;
+ xlrec->flags = 0;
+ Assert(!(all_visible_cleared && all_frozen_set));
+ if (all_visible_cleared)
+ xlrec->flags |= XLH_INSERT_ALL_VISIBLE_CLEARED;
+ if (all_frozen_set)
+ xlrec->flags |= XLH_INSERT_ALL_FROZEN_SET;
+
xlrec->ntuples = nthispage;
/*
@@ -2310,13 +2335,39 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
END_CRIT_SECTION();
- UnlockReleaseBuffer(buffer);
- if (vmbuffer != InvalidBuffer)
- ReleaseBuffer(vmbuffer);
+ if (all_frozen_set)
+ {
+ Assert(PageIsAllVisible(page));
+ /*
+ * vmbuffer should already be pinned by RelationGetBufferForTuple.
+ * Even if it is not, that's fine: all_frozen is just an
+ * optimization.
+ */
+ if (visibilitymap_pin_ok(BufferGetBlockNumber(buffer), vmbuffer))
+ /*
+ * It's fine to use InvalidTransactionId here - this is only used
+ * when HEAP_INSERT_FROZEN is specified, which intentionally
+ * violates visibility rules.
+ */
+ visibilitymap_set(relation, BufferGetBlockNumber(buffer), buffer,
+ InvalidXLogRecPtr, vmbuffer,
+ InvalidTransactionId,
+ VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN);
+ }
+
+ UnlockReleaseBuffer(buffer);
ndone += nthispage;
+
+ /*
+ * NB: Only release vmbuffer after inserting all tuples - it's fairly
+ * likely that subsequent insertions will go into heap pages covered
+ * by the same vm page.
+ */
}
+ if (vmbuffer != InvalidBuffer)
+ ReleaseBuffer(vmbuffer);
+
/*
* We're done with the actual inserts. Check for conflicts again, to
* ensure that all rw-conflicts in to these inserts are detected. Without
@@ -8203,6 +8254,7 @@ heap_xlog_multi_insert(XLogReaderState *record)
BlockNumber blkno;
Buffer buffer;
Page page;
+ Page vmpage;
union
{
HeapTupleHeaderData hdr;
@@ -8227,7 +8279,8 @@ heap_xlog_multi_insert(XLogReaderState *record)
* The visibility map may need to be fixed even if the heap page is
* already up-to-date.
*/
- if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
+ if ((xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED) &&
+ !(xlrec->flags & XLH_INSERT_ALL_FROZEN_SET))
{
Relation reln = CreateFakeRelcacheEntry(rnode);
Buffer vmbuffer = InvalidBuffer;
@@ -8311,6 +8364,8 @@ heap_xlog_multi_insert(XLogReaderState *record)
if (xlrec->flags & XLH_INSERT_ALL_VISIBLE_CLEARED)
PageClearAllVisible(page);
+ if (xlrec->flags & XLH_INSERT_ALL_FROZEN_SET)
+ PageSetAllVisible(page);
MarkBufferDirty(buffer);
}
diff --git a/src/backend/access/heap/hio.c b/src/backend/access/heap/hio.c
index aa3f14c019..ec435869cf 100644
--- a/src/backend/access/heap/hio.c
+++ b/src/backend/access/heap/hio.c
@@ -142,6 +142,9 @@ GetVisibilityMapPins(Relation relation, Buffer buffer1, Buffer buffer2,
/* Figure out which pins we need but don't have. */
need_to_pin_buffer1 = PageIsAllVisible(BufferGetPage(buffer1))
&& !visibilitymap_pin_ok(block1, *vmbuffer1);
+
+ /* TODO: do we need special code for COPY FREEZE here? */
+
need_to_pin_buffer2 = buffer2 != InvalidBuffer
&& PageIsAllVisible(BufferGetPage(buffer2))
&& !visibilitymap_pin_ok(block2, *vmbuffer2);
@@ -422,6 +425,14 @@ loop:
buffer = ReadBufferBI(relation, targetBlock, RBM_NORMAL, bistate);
if (PageIsAllVisible(BufferGetPage(buffer)))
visibilitymap_pin(relation, targetBlock, vmbuffer);
+ /*
+ * If we are doing COPY FREEZE and the page is empty, also pin the
+ * visibility map page, so that the all_frozen bit can be set later.
+ */
+ if ((options & HEAP_INSERT_FROZEN) &&
+ (PageGetMaxOffsetNumber(BufferGetPage(buffer)) == 0))
+ visibilitymap_pin(relation, BufferGetBlockNumber(buffer), vmbuffer);
+
LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
}
else if (otherBlock == targetBlock)
@@ -608,6 +619,16 @@ loop:
PageInit(page, BufferGetPageSize(buffer), 0);
MarkBufferDirty(buffer);
+ /*
+ * If we are doing COPY FREEZE and the page is empty, also pin the
+ * visibility map page, so that the all_frozen bit can be set later.
+ */
+ if ((options & HEAP_INSERT_FROZEN))
+ {
+ Assert(PageGetMaxOffsetNumber(BufferGetPage(buffer)) == 0);
+ visibilitymap_pin(relation, BufferGetBlockNumber(buffer), vmbuffer);
+ }
+
/*
* Release the file-extension lock; it's now OK for someone else to extend
* the relation some more.
diff --git a/src/include/access/heapam_xlog.h b/src/include/access/heapam_xlog.h
index 95d18cdb12..f01360f6ad 100644
--- a/src/include/access/heapam_xlog.h
+++ b/src/include/access/heapam_xlog.h
@@ -67,6 +67,8 @@
#define XLH_INSERT_LAST_IN_MULTI (1<<1)
#define XLH_INSERT_IS_SPECULATIVE (1<<2)
#define XLH_INSERT_CONTAINS_NEW_TUPLE (1<<3)
+/* all_frozen_set always implies all_visible_set */
+#define XLH_INSERT_ALL_FROZEN_SET (1<<4)
/*
* xl_heap_update flag values, 8 bits are available.