On Tue, Jan 25, 2022 at 11:59 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Tue, Jan 25, 2022 at 12:26 AM Euler Taveira <eu...@eulerto.com> wrote:
> >
>
> I am not sure if your proposal is much different compared to v4 or how
> much it improves the situation? I see you didn't consider
> 'check_external_attr' parameter and I think that is important to know
> if the key has any external toast value. Overall, I see your point
> that the change of APIs looks a bit ugly. But, I guess that is more
> due to their names and current purpose. I think it could be better if
> we bring all the code of heap_tuple_attr_equals in its only caller
> HeapDetermineModifiedColumns or at least part of the code where we get
> attr value and can determine whether the value is stored externally.
> Then change name of HeapDetermineModifiedColumns to
> HeapDetermineColumnsInfo with additional parameters.

I think the best way is to do some refactoring and renaming of the
function, because as part of HeapDetermineModifiedColumns we are
already processing the tuple so we can not put extra overhead of
reprocessing it again.  In short I like the idea of renaming the
HeapDetermineModifiedColumns and moving part of heap_tuple_attr_equals
code into the caller.  Here is the patch set for the same.  I have
divided it into two patches which can eventually be merged, 0001- for
refactoring 0002- does the actual work.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From db839d20ab109488670de8b480b9f30f4f30c084 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Fri, 28 Jan 2022 11:35:12 +0530
Subject: [PATCH v5 2/2] WAL log unchanged toasted eplica identity key
 attributes

Currently, if the unchanged replica identity key attributes are not
logged separately because they are getting logged as part of the
new tuple.  But if it is stored externally then the untoasted value
is not getting logged as part of the new tuple.  So we need to log
that as part of the old_key_tuple.  In order to do that we need to
identify whether any of the unchanged replica identity key attributes
is stored externally or not.  So, we are trying to identify this
while identifying the modified attributes because at that time we
are already processing the tuple so there will not be any extra
overhead.
---
 contrib/test_decoding/expected/toast.out |  2 +-
 src/backend/access/heap/heapam.c         | 77 ++++++++++++++++++++++++--------
 2 files changed, 60 insertions(+), 19 deletions(-)

diff --git a/contrib/test_decoding/expected/toast.out b/contrib/test_decoding/expected/toast.out
index cd03e9d..a757e7d 100644
--- a/contrib/test_decoding/expected/toast.out
+++ b/contrib/test_decoding/expected/toast.out
@@ -77,7 +77,7 @@ SELECT substr(data, 1, 200) FROM pg_logical_slot_get_changes('regression_slot',
  table public.toasted_key: INSERT: id[integer]:1 toasted_key[text]:'1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123
  COMMIT
  BEGIN
- table public.toasted_key: UPDATE: id[integer]:1 toasted_key[text]:unchanged-toast-datum toasted_col1[text]:unchanged-toast-datum toasted_col2[text]:'987654321098765432109876543210987654321098765432109
+ table public.toasted_key: UPDATE: old-key: toasted_key[text]:'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678
  COMMIT
  BEGIN
  table public.toasted_key: UPDATE: old-key: toasted_key[text]:'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 86eb016..ab17f2e 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -78,9 +78,11 @@ static XLogRecPtr log_heap_update(Relation reln, Buffer oldbuf,
 								  Buffer newbuf, HeapTuple oldtup,
 								  HeapTuple newtup, HeapTuple old_key_tuple,
 								  bool all_visible_cleared, bool new_all_visible_cleared);
-static Bitmapset *HeapDetermineModifiedColumns(Relation relation,
-											   Bitmapset *interesting_cols,
-											   HeapTuple oldtup, HeapTuple newtup);
+static Bitmapset *HeapDetermineColumnsInfo(Relation relation,
+										   Bitmapset *interesting_cols,
+										   Bitmapset *check_external_attr,
+										   HeapTuple oldtup, HeapTuple newtup,
+										   bool *id_has_external);
 static bool heap_acquire_tuplock(Relation relation, ItemPointer tid,
 								 LockTupleMode mode, LockWaitPolicy wait_policy,
 								 bool *have_tuple_lock);
@@ -106,7 +108,7 @@ static bool ConditionalMultiXactIdWait(MultiXactId multi, MultiXactStatus status
 static void index_delete_sort(TM_IndexDeleteOp *delstate);
 static int	bottomup_sort_and_shrink(TM_IndexDeleteOp *delstate);
 static XLogRecPtr log_heap_new_cid(Relation relation, HeapTuple tup);
-static HeapTuple ExtractReplicaIdentity(Relation rel, HeapTuple tup, bool key_changed,
+static HeapTuple ExtractReplicaIdentity(Relation rel, HeapTuple tup, bool key_required,
 										bool *copy);
 
 
@@ -3184,6 +3186,7 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
 	bool		all_visible_cleared_new = false;
 	bool		checked_lockers;
 	bool		locker_remains;
+	bool		id_has_external = false;
 	TransactionId xmax_new_tuple,
 				xmax_old_tuple;
 	uint16		infomask_old_tuple,
@@ -3262,9 +3265,17 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
 	/* the new tuple is ready, except for this: */
 	newtup->t_tableOid = RelationGetRelid(relation);
 
-	/* Determine columns modified by the update. */
-	modified_attrs = HeapDetermineModifiedColumns(relation, interesting_attrs,
-												  &oldtup, newtup);
+	/*
+	 * Determine columns modified by the update.  And also identify whether
+	 * any of the unmodified replica identity key attributes is externally
+	 * stored or not.  We need to identify that because if the replica identity
+	 * key attribute is unchanged and it is externally stored then its
+	 * untoasted value will not be WAL logged as part of the new tuple so we
+	 * will have to do it as part of the old_key_tuple.
+	 */
+	modified_attrs = HeapDetermineColumnsInfo(relation, interesting_attrs,
+											  id_attrs, &oldtup,
+											  newtup, &id_has_external);
 
 	/*
 	 * If we're not updating any "key" column, we can grab a weaker lock type.
@@ -3864,10 +3875,12 @@ l2:
 	 * Compute replica identity tuple before entering the critical section so
 	 * we don't PANIC upon a memory allocation failure.
 	 * ExtractReplicaIdentity() will return NULL if nothing needs to be
-	 * logged.
+	 * logged.  Pass old key required as true only if the replica identity key
+	 * columns are modified or it has external data.
 	 */
 	old_key_tuple = ExtractReplicaIdentity(relation, &oldtup,
-										   bms_overlap(modified_attrs, id_attrs),
+										   bms_overlap(modified_attrs, id_attrs) ||
+										   id_has_external,
 										   &old_key_copied);
 
 	/* NO EREPORT(ERROR) from here till changes are logged */
@@ -4073,12 +4086,20 @@ heap_tuple_attr_equals(TupleDesc tupdesc, int attrnum,
  * Given an updated tuple, determine (and return into the output bitmapset),
  * from those listed as interesting, the set of columns that changed.
  *
+ * check_external_attr is bitmapset of the attribute from the old tuple for
+ * which we want to check whether any of unmodified attribute contains
+ * externally stored attribute.  And, if so then has_external will be set to
+ * true.
+ *
  * The input bitmapset is destructively modified; that is OK since this is
  * invoked at most once in heap_update.
  */
 static Bitmapset *
-HeapDetermineModifiedColumns(Relation relation, Bitmapset *interesting_cols,
-							 HeapTuple oldtup, HeapTuple newtup)
+HeapDetermineColumnsInfo(Relation relation,
+						 Bitmapset *interesting_cols,
+						 Bitmapset *check_external_attr,
+						 HeapTuple oldtup, HeapTuple newtup,
+						 bool *has_external)
 {
 	int			attrnum;
 	Bitmapset  *modified = NULL;
@@ -4123,9 +4144,28 @@ HeapDetermineModifiedColumns(Relation relation, Bitmapset *interesting_cols,
 
 		if (!heap_tuple_attr_equals(RelationGetDescr(relation), attrnum,
 									value1, value2, isnull1, isnull2))
+		{
 			modified = bms_add_member(modified,
 									  attrnum -
 									  FirstLowInvalidHeapAttributeNumber);
+			continue;
+		}
+
+		/*
+		 * If attribute is NULL or attlen is no -1 then it can not be stored
+		 * externally.
+		 */
+		if (isnull1 || TupleDescAttr(tupdesc, attrnum - 1)->attlen != -1)
+			continue;
+
+		/*
+		 * If the attribute is stored externally and is a member of the
+		 * check_external_attr bitmapset then set has_external to true.
+		 */
+		if (VARATT_IS_EXTERNAL((struct varlena *) DatumGetPointer(value1)) &&
+			bms_is_member(attrnum - FirstLowInvalidHeapAttributeNumber,
+						  check_external_attr))
+			*has_external = true;
 	}
 
 	return modified;
@@ -8362,14 +8402,15 @@ log_heap_new_cid(Relation relation, HeapTuple tup)
  * Returns NULL if there's no need to log an identity or if there's no suitable
  * key defined.
  *
- * key_changed should be false if caller knows that no replica identity
- * columns changed value.  It's always true in the DELETE case.
+ * key_required should be false if caller knows that no replica identity
+ * columns changed value and it doesn't has any external data.
+ * It's always true in the DELETE case.
  *
  * *copy is set to true if the returned tuple is a modified copy rather than
  * the same tuple that was passed in.
  */
 static HeapTuple
-ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed,
+ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_required,
 					   bool *copy)
 {
 	TupleDesc	desc = RelationGetDescr(relation);
@@ -8402,7 +8443,7 @@ ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed,
 	}
 
 	/* if the key hasn't changed and we're only logging the key, we're done */
-	if (!key_changed)
+	if (!key_required)
 		return NULL;
 
 	/* find out the replica identity columns */
@@ -8410,10 +8451,10 @@ ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed,
 										 INDEX_ATTR_BITMAP_IDENTITY_KEY);
 
 	/*
-	 * If there's no defined replica identity columns, treat as !key_changed.
+	 * If there's no defined replica identity columns, treat as !key_required.
 	 * (This case should not be reachable from heap_update, since that should
-	 * calculate key_changed accurately.  But heap_delete just passes constant
-	 * true for key_changed, so we can hit this case in deletes.)
+	 * calculate key_required accurately.  But heap_delete just passes constant
+	 * true for key_required, so we can hit this case in deletes.)
 	 */
 	if (bms_is_empty(idattrs))
 		return NULL;
-- 
1.8.3.1

From d2281c5b2e61a382d4616d780bb5cc69656ef087 Mon Sep 17 00:00:00 2001
From: Dilip Kumar <dilipkumar@localhost.localdomain>
Date: Fri, 28 Jan 2022 10:44:22 +0530
Subject: [PATCH v5 1/2] Code refactoring for the next patch

Maybe this can be merged with the next patch but kept separate for
ease of review
---
 src/backend/access/heap/heapam.c | 81 +++++++++++++++++++++-------------------
 1 file changed, 42 insertions(+), 39 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 98230aa..86eb016 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -4028,43 +4028,11 @@ l2:
  */
 static bool
 heap_tuple_attr_equals(TupleDesc tupdesc, int attrnum,
-					   HeapTuple tup1, HeapTuple tup2)
+					   Datum value1, Datum value2, bool isnull1, bool isnull2)
 {
-	Datum		value1,
-				value2;
-	bool		isnull1,
-				isnull2;
 	Form_pg_attribute att;
 
 	/*
-	 * If it's a whole-tuple reference, say "not equal".  It's not really
-	 * worth supporting this case, since it could only succeed after a no-op
-	 * update, which is hardly a case worth optimizing for.
-	 */
-	if (attrnum == 0)
-		return false;
-
-	/*
-	 * Likewise, automatically say "not equal" for any system attribute other
-	 * than tableOID; we cannot expect these to be consistent in a HOT chain,
-	 * or even to be set correctly yet in the new tuple.
-	 */
-	if (attrnum < 0)
-	{
-		if (attrnum != TableOidAttributeNumber)
-			return false;
-	}
-
-	/*
-	 * Extract the corresponding values.  XXX this is pretty inefficient if
-	 * there are many indexed columns.  Should HeapDetermineModifiedColumns do
-	 * a single heap_deform_tuple call on each tuple, instead?	But that
-	 * doesn't work for system columns ...
-	 */
-	value1 = heap_getattr(tup1, attrnum, tupdesc, &isnull1);
-	value2 = heap_getattr(tup2, attrnum, tupdesc, &isnull2);
-
-	/*
 	 * If one value is NULL and other is not, then they are certainly not
 	 * equal
 	 */
@@ -4112,17 +4080,52 @@ static Bitmapset *
 HeapDetermineModifiedColumns(Relation relation, Bitmapset *interesting_cols,
 							 HeapTuple oldtup, HeapTuple newtup)
 {
-	int			attnum;
+	int			attrnum;
 	Bitmapset  *modified = NULL;
+	TupleDesc	tupdesc = RelationGetDescr(relation);
 
-	while ((attnum = bms_first_member(interesting_cols)) >= 0)
+	while ((attrnum = bms_first_member(interesting_cols)) >= 0)
 	{
-		attnum += FirstLowInvalidHeapAttributeNumber;
+		Datum		value1,
+					value2;
+		bool		isnull1,
+					isnull2;
+
+		attrnum += FirstLowInvalidHeapAttributeNumber;
+
+		/*
+		 * If it's a whole-tuple reference, say "not equal".  It's not really
+		 * worth supporting this case, since it could only succeed after a
+		 * no-op update, which is hardly a case worth optimizing for.
+		 */
+		if (attrnum == 0)
+			continue;
+
+		/*
+		 * Likewise, automatically say "not equal" for any system attribute
+		 * other than tableOID; we cannot expect these to be consistent in a
+		 * HOT chain, or even to be set correctly yet in the new tuple.
+		 */
+		if (attrnum < 0)
+		{
+			if (attrnum != TableOidAttributeNumber)
+				continue;
+		}
+
+		/*
+		 * Extract the corresponding values.  XXX this is pretty inefficient if
+		 * there are many indexed columns.  Should HeapDetermineModifiedColumns do
+		 * a single heap_deform_tuple call on each tuple, instead?	But that
+		 * doesn't work for system columns ...
+		 */
+		value1 = heap_getattr(oldtup, attrnum, tupdesc, &isnull1);
+		value2 = heap_getattr(newtup, attrnum, tupdesc, &isnull2);
 
-		if (!heap_tuple_attr_equals(RelationGetDescr(relation),
-									attnum, oldtup, newtup))
+		if (!heap_tuple_attr_equals(RelationGetDescr(relation), attrnum,
+									value1, value2, isnull1, isnull2))
 			modified = bms_add_member(modified,
-									  attnum - FirstLowInvalidHeapAttributeNumber);
+									  attrnum -
+									  FirstLowInvalidHeapAttributeNumber);
 	}
 
 	return modified;
-- 
1.8.3.1

Reply via email to