Hi Andres,

Thanks for taking a look.

On Fri, Jul 15, 2022 at 01:08:57PM -0700, Andres Freund wrote:
> Hm. Are there any contexts where we'd not want the potential for failing due
> to OOM?

I'm not sure about this one.

> I wonder if we additionally / alternatively could use a faster method of
> searching the array linearly, e.g. using SIMD.

I looked into using SIMD.  The patch is attached, but it is only intended
for benchmarking purposes and isn't anywhere close to being worth serious
review.  There may be a simpler/better way to implement the linear search,
but this one seemed to work well.  Overall, SIMD provided a decent
improvement.  I had to increase the number of writers quite a bit in order
to demonstrate where the hash tables began winning.  Here are the numbers:

    writers    head    simd    hash
        256     663     632     694
        512     530     618     637
        768     489     544     573
       1024     364     508     562
       2048     185     306     485
       4096     146     197     441

While it is unsurprising that the hash tables perform the best, there are
a couple of advantages to SIMD that might make that approach worth
considering.  For one, there's essentially no setup overhead (i.e., you
don't need to sort the array or build a hash table), so we can avoid
picking an arbitrary threshold and just have one code path.  Also, a SIMD
implementation of a linear search through an array of integers could
likely be reused elsewhere.

> Another thing that might be worth looking into is to sort the xip/subxip
> arrays into a binary-search optimized layout. That'd make the binary search
> faster, wouldn't require additional memory (a boolean indicating whether
> sorted somewhere, I guess), and would easily persist across copies of the
> snapshot.

I spent some time looking into this, but I haven't attempted to implement
it.  IIUC the most difficult part is sorting the array in place into the
special layout.  (A toy sketch of what the lookup half might look like is
below, after my signature.)

>> These hash tables are regarded as ephemeral; they only live in
>> process-local memory and are never rewritten, copied, or
>> serialized.
>
> What does rewriting refer to here?

I mean that a hash table created for one snapshot will not be cleared and
reused for another.

> Not convinced that's the right idea in case of copying. I think we often end
> up copying snapshots frequently, and building & allocating the hashed xids
> separately every time seems not great.

Right.  My concern with reusing the hash tables is that we'd need to
allocate much more space that would go largely unused in many cases.

>> +	snapshot->xiph = NULL;
>> +	snapshot->subxiph = NULL;
>
> Do we need separate hashes for these? ISTM that if we're overflowed then we
> don't need ->subxiph, and if not, then the action for an xid being in ->xip
> and ->subxiph is the same?

Do you mean that we can combine these into one hash table?  That might
work.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
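Here is the toy sketch mentioned above, as a standalone program rather than
a patch.  To be clear about what is assumed: the array must already have
been permuted into BFS ("Eytzinger") order, where the children of element i
live at 2*i+1 and 2*i+2; it uses plain integer comparison, so it ignores
XID wraparound; and XidInEytzinger is just a placeholder name.  The
in-place permutation itself, which I believe is the hard part, is not
shown, and this is untested beyond a quick sanity check.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t TransactionId;

/* search a BFS-ordered array by descending the implicit tree */
static bool
XidInEytzinger(TransactionId xid, const TransactionId *xip, uint32_t len)
{
	uint32_t	i = 0;

	while (i < len)
	{
		if (xip[i] == xid)
			return true;
		/* right child if xid sorts after this element, else left child */
		i = 2 * i + ((xip[i] < xid) ? 2 : 1);
	}

	return false;
}

int
main(void)
{
	/* BFS layout of the sorted array {10, 20, 30, 40, 50, 60, 70} */
	TransactionId xip[] = {40, 20, 60, 10, 30, 50, 70};

	printf("%d\n", XidInEytzinger(30, xip, 7));	/* prints 1 */
	printf("%d\n", XidInEytzinger(35, xip, 7));	/* prints 0 */
	return 0;
}

The appeal, as I understand it, is that the first few levels of the
implicit tree sit packed together at the front of the array, so the early
probes of every lookup touch the same handful of cache lines.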
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 5bc2a15160..25d1a3564c 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -45,6 +45,7 @@
  */
 #include "postgres.h"
 
+#include <nmmintrin.h>
 #include <sys/stat.h>
 #include <unistd.h>
 
@@ -2271,6 +2272,40 @@ RestoreTransactionSnapshot(Snapshot snapshot, void *source_pgproc)
 	SetTransactionSnapshot(snapshot, NULL, InvalidPid, source_pgproc);
 }
 
+static inline bool
+XidInXip(TransactionId xid, TransactionId *xip, uint32 len)
+{
+	__m128i		xids = _mm_set1_epi32(xid);
+	uint32		its = len & ~15;	/* round down to nearest multiple of 16 */
+	uint32		i;
+
+	for (i = 0; i < its; i += 16)
+	{
+		__m128i		xips1 = _mm_loadu_si128((__m128i *) &xip[i]);
+		__m128i		xips2 = _mm_loadu_si128((__m128i *) &xip[i + 4]);
+		__m128i		xips3 = _mm_loadu_si128((__m128i *) &xip[i + 8]);
+		__m128i		xips4 = _mm_loadu_si128((__m128i *) &xip[i + 12]);
+		__m128i		result1 = _mm_cmpeq_epi32(xids, xips1);
+		__m128i		result2 = _mm_cmpeq_epi32(xids, xips2);
+		__m128i		result3 = _mm_cmpeq_epi32(xids, xips3);
+		__m128i		result4 = _mm_cmpeq_epi32(xids, xips4);
+		__m128i		tmp1 = _mm_packs_epi32(result1, result2);
+		__m128i		tmp2 = _mm_packs_epi32(result3, result4);
+		__m128i		result = _mm_packs_epi16(tmp1, tmp2);
+
+		if (_mm_movemask_epi8(result) != 0)
+			return true;
+	}
+
+	while (i < len)
+	{
+		if (TransactionIdEquals(xid, xip[i++]))
+			return true;
+	}
+
+	return false;
+}
+
 /*
  * XidInMVCCSnapshot
  *		Is the given XID still-in-progress according to the snapshot?
@@ -2284,8 +2319,6 @@ RestoreTransactionSnapshot(Snapshot snapshot, void *source_pgproc)
 bool
 XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
 {
-	uint32		i;
-
 	/*
 	 * Make a quick range check to eliminate most XIDs without looking at the
 	 * xip arrays.  Note that this is OK even if we convert a subxact XID to
@@ -2317,13 +2350,8 @@ XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
 		if (!snapshot->suboverflowed)
 		{
 			/* we have full data, so search subxip */
-			int32		j;
-
-			for (j = 0; j < snapshot->subxcnt; j++)
-			{
-				if (TransactionIdEquals(xid, snapshot->subxip[j]))
-					return true;
-			}
+			if (XidInXip(xid, snapshot->subxip, snapshot->subxcnt))
+				return true;
 
 			/* not there, fall through to search xip[] */
 		}
@@ -2344,16 +2372,11 @@ XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
 				return false;
 		}
 
-		for (i = 0; i < snapshot->xcnt; i++)
-		{
-			if (TransactionIdEquals(xid, snapshot->xip[i]))
-				return true;
-		}
+		if (XidInXip(xid, snapshot->xip, snapshot->xcnt))
+			return true;
 	}
 	else
 	{
-		int32		j;
-
 		/*
 		 * In recovery we store all xids in the subxact array because it is by
 		 * far the bigger array, and we mostly don't know which xids are
@@ -2383,11 +2406,8 @@ XidInMVCCSnapshot(TransactionId xid, Snapshot snapshot)
 		 * indeterminate xid.  We don't know whether it's top level or subxact
 		 * but it doesn't matter.  If it's present, the xid is visible.
 		 */
-		for (j = 0; j < snapshot->subxcnt; j++)
-		{
-			if (TransactionIdEquals(xid, snapshot->subxip[j]))
-				return true;
-		}
+		if (XidInXip(xid, snapshot->subxip, snapshot->subxcnt))
+			return true;
 	}
 
 	return false;