Re: Conflict detection and logging in logical replication

Dilip Kumar Mon, 29 Jul 2024 03:59:43 -0700

On Mon, Jul 29, 2024 at 11:44 AM Zhijie Hou (Fujitsu)
<houzj.f...@fujitsu.com> wrote:
>


I was going through v7-0001, and I have some initial comments.

@@ -536,11 +542,9 @@ ExecCheckIndexConstraints(ResultRelInfo
*resultRelInfo, TupleTableSlot *slot,
  ExprContext *econtext;
  Datum values[INDEX_MAX_KEYS];
  bool isnull[INDEX_MAX_KEYS];
- ItemPointerData invalidItemPtr;
  bool checkedIndex = false;

  ItemPointerSetInvalid(conflictTid);
- ItemPointerSetInvalid(&invalidItemPtr);

  /*
  * Get information from the result relation info structure.
@@ -629,7 +633,7 @@ ExecCheckIndexConstraints(ResultRelInfo
*resultRelInfo, TupleTableSlot *slot,

  satisfiesConstraint =
  check_exclusion_or_unique_constraint(heapRelation, indexRelation,
- indexInfo, &invalidItemPtr,
+ indexInfo, &slot->tts_tid,
  values, isnull, estate, false,
  CEOUC_WAIT, true,
  conflictTid);

What is the purpose of this change?  I mean
'check_exclusion_or_unique_constraint' says that 'tupleid'
should be invalidItemPtr if the tuple is not yet inserted and
ExecCheckIndexConstraints is called by ExecInsert
before inserting the tuple.  So what is this change? Would this change
ExecInsert's behavior as well?

----
----

+ReCheckConflictIndexes(ResultRelInfo *resultRelInfo, EState *estate,
+    ConflictType type, List *recheckIndexes,
+    TupleTableSlot *slot)
+{
+ /* Re-check all the unique indexes for potential conflicts */
+ foreach_oid(uniqueidx, resultRelInfo->ri_onConflictArbiterIndexes)
+ {
+ TupleTableSlot *conflictslot;
+
+ if (list_member_oid(recheckIndexes, uniqueidx) &&
+ FindConflictTuple(resultRelInfo, estate, uniqueidx, slot, &conflictslot))
+ {
+ RepOriginId origin;
+ TimestampTz committs;
+ TransactionId xmin;
+
+ GetTupleCommitTs(conflictslot, &xmin, &origin, &committs);
+ ReportApplyConflict(ERROR, type, resultRelInfo->ri_RelationDesc, uniqueidx,
+ xmin, origin, committs, conflictslot);
+ }
+ }
+}
 and

+ conflictindexes = resultRelInfo->ri_onConflictArbiterIndexes;
+
  if (resultRelInfo->ri_NumIndices > 0)
  recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
-    slot, estate, false, false,
-    NULL, NIL, false);
+    slot, estate, false,
+    conflictindexes ? true : false,
+    &conflict,
+    conflictindexes, false);
+
+ /*
+ * Rechecks the conflict indexes to fetch the conflicting local tuple
+ * and reports the conflict. We perform this check here, instead of
+ * perform an additional index scan before the actual insertion and
+ * reporting the conflict if any conflicting tuples are found. This is
+ * to avoid the overhead of executing the extra scan for each INSERT
+ * operation, even when no conflict arises, which could introduce
+ * significant overhead to replication, particularly in cases where
+ * conflicts are rare.
+ */
+ if (conflict)
+ ReCheckConflictIndexes(resultRelInfo, estate, CT_INSERT_EXISTS,
+    recheckIndexes, slot);


 This logic is confusing, first, you are calling
ExecInsertIndexTuples() with no duplicate error for the indexes
present in 'ri_onConflictArbiterIndexes' which means
 the indexes returned by the function must be a subset of
'ri_onConflictArbiterIndexes' and later in ReCheckConflictIndexes()
you are again processing all the
 indexes of 'ri_onConflictArbiterIndexes' and checking if any of these
is a subset of the indexes that is returned by
ExecInsertIndexTuples().

 Why are we doing that, I think we can directly use the
'recheckIndexes' which is returned by ExecInsertIndexTuples(), and
those indexes are guaranteed to be a subset of
 ri_onConflictArbiterIndexes. No?

 ---
 ---


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: Conflict detection and logging in logical replication

Reply via email to