On Tue, Oct 23, 2018 at 10:43:38AM +0900, Michael Paquier wrote:
> Well, following the same kind of thoughts, txid_current_snapshot() uses
> sort_snapshot() to remove all the duplicates after fetching its data
> from GetSnapshotData(), so wouldn't we want to do something about
> removal of duplicates if dummy PGXACT entries are found while scanning
> the ProcArray also in this case?  What I would think we should do is not
> only to patch GetRunningTransactionData() but also GetSnapshotData() so
> as we don't have duplicates also in this case, and do things in such a
> way that both code paths use the same logic, and that we don't need to
> have sort_snapshot() anymore.  That would be more costly though...

My apologies it took a bit longer than I thought.  I got a patch on my
desk for a couple of days, and finally took the time to finish something
which would address the concerns raised here.  As long as we don't reach
more than hundreds of thousands of entries, there is not going to be any
performance impact.  So what I do in the attached is to revert 1df21ddb,
and then have GetRunningTransactionData() sort the XIDs in the snapshot
and remove duplicates only if at least one dummy proc entry is found
while scanning, for xids and subxids.  This way, there is no need to
impact most of the instance deployments with the extra sort/removal
phase as most don't use two-phase transactions.  The sorting done at
recovery when initializing the standby snapshot still needs to happen of
course.

The patch is added to the upcoming CF for review.

Thanks,
--
Michael
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 908f62d37e..625c8ddf48 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -799,21 +799,10 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
 		qsort(xids, nxids, sizeof(TransactionId), xidComparator);
 
 		/*
-		 * Add the sorted snapshot into KnownAssignedXids.  The running-xacts
-		 * snapshot may include duplicated xids because of prepared
-		 * transactions, so ignore them.
+		 * Add the sorted snapshot into KnownAssignedXids.
 		 */
 		for (i = 0; i < nxids; i++)
-		{
-			if (i > 0 && TransactionIdEquals(xids[i - 1], xids[i]))
-			{
-				elog(DEBUG1,
-					 "found duplicated transaction %u for KnownAssignedXids insertion",
-					 xids[i]);
-				continue;
-			}
 			KnownAssignedXidsAdd(xids[i], xids[i], true);
-		}
 
 		KnownAssignedXidsDisplay(trace_recovery(DEBUG3));
 	}
@@ -1924,10 +1913,9 @@ ProcArrayInstallRestoredXmin(TransactionId xmin, PGPROC *proc)
  * This is never executed during recovery so there is no need to look at
  * KnownAssignedXids.
  *
- * Dummy PGXACTs from prepared transaction are included, meaning that this
- * may return entries with duplicated TransactionId values coming from
- * transaction finishing to prepare.  Nothing is done about duplicated
- * entries here to not hold on ProcArrayLock more than necessary.
+ * If dummy entries for prepared transactions are found in the proc array
+ * make sure to discard any duplicates as a transaction finishing to
+ * prepare may cause two entries with the same TransactionId to be found.
  *
  * We don't worry about updating other counters, we want to keep this as
  * simple as possible and leave GetSnapshotData() as the primary code for
@@ -1951,6 +1939,7 @@ GetRunningTransactionData(void)
 	int			count;
 	int			subcount;
 	bool		suboverflowed;
+	bool		found_dummy = false;
 
 	Assert(!RecoveryInProgress());
 
@@ -1999,6 +1988,7 @@ GetRunningTransactionData(void)
 	{
 		int			pgprocno = arrayP->pgprocnos[index];
 		volatile PGXACT *pgxact = &allPgXact[pgprocno];
+		volatile PGPROC *proc = &allProcs[pgprocno];
 		TransactionId xid;
 
 		/* Fetch xid just once - see GetNewTransactionId */
@@ -2022,6 +2012,13 @@ GetRunningTransactionData(void)
 		if (pgxact->overflowed)
 			suboverflowed = true;
 
+		/*
+		 * Mark that a dummy entry has been found, and that it is necessary to
+		 * check for duplicates entries.
+		 */
+		if (proc->pid == 0)
+			found_dummy = true;
+
 		/*
 		 * If we wished to exclude xids this would be the right place for it.
 		 * Procs with the PROC_IN_VACUUM flag set don't usually assign xids,
@@ -2033,6 +2030,34 @@ GetRunningTransactionData(void)
 		xids[count++] = xid;
 	}
 
+	/*
+	 * If some dummy entries have been found, it is possible that a two-phase
+	 * transaction just finishing to prepare has a duplicated entry with still
+	 * the active transaction entry in the procArray. There is no need to
+	 * remove any duplicates if no dummy entries have been found.
+	 */
+	if (found_dummy && count > 1)
+	{
+		TransactionId last = 0;
+		int			idx1,
+					idx2;
+		int			former_count;
+
+		qsort(xids, count, sizeof(TransactionId), xidComparator);
+
+		/* remove duplicate xids */
+		former_count = count;
+		idx1 = idx2 = 0;
+		while (idx1 < former_count)
+		{
+			if (TransactionIdEquals(xids[idx1], last))
+				last = xids[idx2++] = xids[idx1];
+			else
+				count--;
+			idx1++;
+		}
+	}
+
 	/*
 	 * Spin over procArray collecting all subxids, but only if there hasn't
 	 * been a suboverflow.
@@ -2065,6 +2090,40 @@ GetRunningTransactionData(void)
 				 */
 			}
 		}
+
+		/*
+		 * Per the first array scanning, we know that there are dummy proc
+		 * array entries, so sort and scan again the result, removing this
+		 * time all subxids.
+		 */
+		if (found_dummy && subcount > 1)
+		{
+			TransactionId last = 0;
+			int			idx1,
+						idx2;
+			int			former_count;
+
+			qsort(&xids[count], count - subcount,
+				  sizeof(TransactionId), xidComparator);
+
+			/*
+			 * Remove duplicate subxids, beginning from the position they have
+			 * been added.
+			 */
+			former_count = count;
+			idx1 = idx2 = count - subcount;
+			while (idx1 < former_count)
+			{
+				if (TransactionIdEquals(xids[idx1], last))
+					last = xids[idx2++] = xids[idx1];
+				else
+				{
+					count--;
+					subcount--;
+				}
+				idx1++;
+			}
+		}
 	}
 
 	/*

Attachment: signature.asc
Description: PGP signature

Reply via email to