On Sun, Aug 4, 2024 at 1:22 PM Zhijie Hou (Fujitsu) <houzj.f...@fujitsu.com> wrote: > > On Friday, August 2, 2024 7:03 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > > > Here is the V11 patch set which addressed above and Kuroda-san[1] comments. >
A few design-level points: * @@ -525,10 +602,33 @@ ExecSimpleRelationInsert(ResultRelInfo *resultRelInfo, /* OK, store the tuple and create index entries for it */ simple_table_tuple_insert(resultRelInfo->ri_RelationDesc, slot); + conflictindexes = resultRelInfo->ri_onConflictArbiterIndexes; + if (resultRelInfo->ri_NumIndices > 0) recheckIndexes = ExecInsertIndexTuples(resultRelInfo, - slot, estate, false, false, - NULL, NIL, false); + slot, estate, false, + conflictindexes ? true : false, + &conflict, + conflictindexes, false); + + /* + * Checks the conflict indexes to fetch the conflicting local tuple + * and reports the conflict. We perform this check here, instead of + * performing an additional index scan before the actual insertion and + * reporting the conflict if any conflicting tuples are found. This is + * to avoid the overhead of executing the extra scan for each INSERT + * operation, even when no conflict arises, which could introduce + * significant overhead to replication, particularly in cases where + * conflicts are rare. + * + * XXX OTOH, this could lead to clean-up effort for dead tuples added + * in heap and index in case of conflicts. But as conflicts shouldn't + * be a frequent thing so we preferred to save the performance overhead + * of extra scan before each insertion. + */ + if (conflict) + CheckAndReportConflict(resultRelInfo, estate, CT_INSERT_EXISTS, + recheckIndexes, slot); I was thinking about this case where we have some pros and cons of doing additional scans only after we found the conflict. I was wondering how we will handle the resolution strategy for this when we have to remote_apply the tuple for insert_exists/update_exists cases. We would have already inserted the remote tuple in the heap and index before we found the conflict which means we have to roll back that change and then start a forest transaction to perform remote_apply which probably has to update the existing tuple. We may have to perform something like speculative insertion and then abort it. That doesn't sound cheap either. Do you have any better ideas? * -ERROR: duplicate key value violates unique constraint "test_pkey" -DETAIL: Key (c)=(1) already exists. +ERROR: conflict insert_exists detected on relation "public.test" +DETAIL: Key (c)=(1) already exists in unique index "t_pkey", which was modified locally in transaction 740 at 2024-06-26 10:47:04.727375+08. I think the format to display conflicts is not very clear. The conflict should be apparent just by seeing the LOG/ERROR message. I am thinking of something like below: LOG: CONFLICT: <insert_exisits or whatever names we document>; DESCRIPTION: If any .. ; RESOLUTION: (This one can be added later) DEATAIL: remote_tuple (tuple values); local_tuple (tuple values); With the above, one can easily identify the conflict's reason and action taken by apply worker. -- With Regards, Amit Kapila.