Re: Patch: Global Unique Index

David Zhang Fri, 02 Dec 2022 15:29:52 -0800

Thanks a lot for all the comments.

On 2022-11-29 3:13 p.m., Tom Lane wrote:

> ... not to mention creating a high probability of deadlocks between
> concurrent insertions to different partitions.  If they each
> ex-lock their own partition's index before starting to look into
> other partitions' indexes, it seems like a certainty that such
> cases would fail.  The rule of thumb about locking multiple objects
> is that all comers had better do it in the same order, and this
> isn't doing that.

In the current POC patch, the deadlock happens when backend-1 inserts a value into index X (partition-1), and backend-2 tries to insert a conflicting value right after backend-1 has released the buffer block lock but before it starts the uniqueness check on index Y (partition-2). In this case, backend-1 holds an ExclusiveLock on transaction-1 and waits for a ShareLock on transaction-2, while backend-2 holds an ExclusiveLock on transaction-2 and waits for a ShareLock on transaction-1. Based on my debugging tests, this only happens when backend-1 and backend-2 want to insert a conflicting value. If this is true, is it OK to error out with either `deadlock` or `duplicated value`, since the value conflicts anyway? (Hopefully end users can handle both in a similar way.) I think such a deadlock requires two conditions: 1) users insert a conflicting value, and 2) the uniqueness check happens at exactly the right moment (see above).
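To make the lock cycle described above concrete, here is a minimal, self-contained sketch of a wait-for graph in C. It is not PostgreSQL's deadlock detector; `WaitGraph`, `wait_graph_block` and `wait_graph_has_deadlock` are hypothetical names used only for illustration, and each transaction is assumed to wait on at most one other transaction.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_XACTS 8   /* illustrative limit, not a PostgreSQL constant */

/* waits_for[i] is the transaction that transaction i is blocked on, or -1. */
typedef struct WaitGraph
{
    int waits_for[MAX_XACTS];
} WaitGraph;

static void
wait_graph_init(WaitGraph *g)
{
    for (int i = 0; i < MAX_XACTS; i++)
        g->waits_for[i] = -1;
}

/* Record that 'waiter' is blocked on a lock held by 'holder'. */
static void
wait_graph_block(WaitGraph *g, int waiter, int holder)
{
    assert(waiter >= 0 && waiter < MAX_XACTS);
    assert(holder >= 0 && holder < MAX_XACTS);
    g->waits_for[waiter] = holder;
}

/* Follow wait edges from 'start'; a deadlock exists if we come back to it. */
static bool
wait_graph_has_deadlock(const WaitGraph *g, int start)
{
    int cur = g->waits_for[start];

    for (int steps = 0; steps < MAX_XACTS && cur != -1; steps++)
    {
        if (cur == start)
            return true;
        cur = g->waits_for[cur];
    }
    return false;
}
```

With transaction-1 waiting on transaction-2 only, `wait_graph_has_deadlock` reports no cycle; once transaction-2 also waits on transaction-1, the cycle closes, which corresponds to the situation where both backends found each other's in-progress tuple.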

> That specific issue could perhaps be fixed by having everybody
> examine all the indexes in the same order, inserting when you
> come to your own partition's index and otherwise just checking
> for conflicts.  But that still means serializing insertions
> across all the partitions.  And the fact that you need to lock
> all the partitions, or even just know what they all are,

Here is the main change for the cross-partition uniqueness check on insertion, in `0004-support-global-unique-index-insert-and-update.patch`:

     result = _bt_doinsert(rel, itup, checkUnique, indexUnchanged, heapRel);

+    if (checkUnique != UNIQUE_CHECK_NO)
+        btinsert_check_unique_gi(itup, rel, heapRel, checkUnique);
+
     pfree(itup);

Here, a cross-partition uniqueness check is added after the index tuple has been inserted into the btree of the current partition. The idea is to make sure other backends can find the in-progress index tuple that was just inserted (but is not yet marked as visible), while the uniqueness check on the current partition can be skipped because it has already been done. Based on this change, I think insertions are serialized in two cases: 1) two insertions happen on the same buffer block (buffer lock waiting); 2) two ongoing insertions have duplicated values (transaction ID waiting).
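The ordering described above (insert into the local index first, then check the other partitions) can be sketched with a toy, single-threaded model in C. This is only an illustration under simplifying assumptions: `ToyIndex`, `toy_index_contains` and `toy_insert_global_unique` are hypothetical names, arrays stand in for btree indexes, and there is no locking, visibility, or transaction waiting.

```c
#include <assert.h>
#include <stdbool.h>

#define NPARTS   3    /* number of partitions in the toy model */
#define MAX_KEYS 16   /* capacity of each toy index */

/* A flat array standing in for one partition's unique index. */
typedef struct ToyIndex
{
    int keys[MAX_KEYS];
    int nkeys;
} ToyIndex;

static bool
toy_index_contains(const ToyIndex *idx, int key)
{
    for (int i = 0; i < idx->nkeys; i++)
        if (idx->keys[i] == key)
            return true;
    return false;
}

/*
 * Insert 'key' into partition 'own', then check every other partition.
 * Returns false if the key is a duplicate anywhere.
 */
static bool
toy_insert_global_unique(ToyIndex parts[NPARTS], int own, int key)
{
    /* Step 1: local uniqueness check plus insert (the role of _bt_doinsert). */
    if (toy_index_contains(&parts[own], key))
        return false;
    assert(parts[own].nkeys < MAX_KEYS);
    parts[own].keys[parts[own].nkeys++] = key;

    /* Step 2: cross-partition check; the own partition is skipped. */
    for (int p = 0; p < NPARTS; p++)
    {
        if (p == own)
            continue;
        if (toy_index_contains(&parts[p], key))
        {
            /* Undo the local insert; a real backend would raise an error. */
            parts[own].nkeys--;
            return false;
        }
    }
    return true;
}
```

Because the local insert happens before the cross-partition scan, a second inserter that scans this partition would already see the new entry. In the real patch this is the point at which the second backend would wait on the first backend's transaction ID.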
