On 09/01/2026 14:06, David Geier wrote:
On 06.01.2026 18:00, Heikki Linnakangas wrote:
On 05/01/2026 17:01, David Geier wrote:
- v1-0002-Optimized-comparison-functions.patch: Use FunctionCallInvoke()
instead of FunctionCall2Coll(). This saves a bunch of per-comparison
setup code, such as calling InitFunctionCallInfoData().

You lose the check for NULL result with this. That's probably still
worth checking.

It seems like existing code where all args are not null, has that safety
check. Added it for consistency.

- v1-0004-Avoid-dedup-and-sort-in-ginExtractEntries.patch
ginExtractEntries() deduplicates and sorts the entries returned from the
extract value function. In case of pg_trgm, that is completely redundant
because the trigrams are already deduplicated and sorted. The current
version of this patch is just to demonstrate the potential. We need to
think about what we want here. Ideally, we would require the extraction
function to provide the entries deduplicated and sorted. Alternatively,
we could indicate to ginExtractEntries() if the entries are already
deduplicated and sorted. If we don't want to alter the signature of the
extract value function, we could e.g. use the MSB of the nentries
argument.

Yeah, this seems wrong as it is. You're assuming that if the extract
function returns nullFlags == NULL, the array is already sorted and
deduped.

As said, that was just for demonstration purposes of the possible gains.
I've changed the code now such that the extractValue function of the GIN
index can indicate via the third argument uniqueAndSorted, if the
returned keys are already unique and sorted.

Unfortunately, it seems like this patch regresses performance. See
measurements below. I haven't had the time to investigate why that is.
It's pretty counter intuitive, given that this patch effectively only
removes code. Maybe you could re-test patch 0004 and share your runtimes?

Looking at 0001 and 0004 patches and ginExtractEntries(), the way ginExtractEntries() handles NULLs looks a little inefficient. It treats NULLs as proper entries, passing them through qsort() for deduplication and comparing them with cmpEntries(). But the end result is always that if the opclass's extractValueFn() function returned any NULLs, then there's exctly one GIN_CAT_NULL_KEY as the last entry of the result array. Surely we could be smarter about how we accomplish that. Your 0004 patch eliminates the deduplication overhead altogether, which is great of course, but the point remains for when we still need the deduplication.

Attached is an attempt at that. It avoids the null-checks in cmpEntries(), saving a few cycles. That's drowned in noise with your test cases, but with the attached test case with int arrays, I'm seeing a 1-2 % gain. That's not much, but I think it's still worth doing because it makes the code a little simpler too, IMHO. (I didn't test it together with the rest of your patches.)

Perhaps we should introduce a new qunique_eq() function with a different
callback signature:

/* like qunique(), but the callback function returns true/false rather
than int */
static inline size_t
qunique_eq(void *array, size_t elements, size_t width,
         bool (*equal) (const void *, const void *))


I would prefer to change qunique() instead. That would enforce using an
adequate comparison function from the get go. There are only ~15 calls
to qunique(). So refactoring this should also be a fairly small patch. I
can do that if there's agreement for that approach.

Works for me.

At quick glance, most if not all of the qunique() calls call qsort() just before qunique(). I wonder if we should have a single "sort and deduplicate" function, instead. It could perhaps do some deduplication "on the go", or other optimizations.

Did you measure how big is the impact from each individual patch?
Patches 1 and 2 seem pretty much ready to be committed, but I wonder if
they make any difference on their own.

Here is the impact of each patch. I ran again CREATE INDEX three times
and took the fastest run. The run of each patch includes all previous
patches as well. For example, the timings for patch 0003 were measured
with a binary that also had patch 0002 and 0001 applied. To get the
impact of each patch in isolation, the delta to the previous run was taken.

Code                                | movies |delta  | lineitem | delta
------------------------------------|--------|-------|------------------
master                              | 10,311 | 0     | 256,986  | 0
v1-0001-Inline-ginCompareAttEntries |  9,694 | 617   | 239,778  | 17,208
v1-0002-Optimized-comparison-func   |  9,510 | 184   | 238,094  |  1,684
v1-0003-Use-sort_template.h         |  8,661 | 849   | 231,190  |  6,904
v1-0004-Avoid-dedup-and-sort-in     |  9,305 | -644  | 232,472  | -1,282
v1-0005-Make-btint4cmp-branchless   |  8,240 | 1,065 | 228,387  |  4,085
v1-0006-Use-radix-sort              |  6,976 | 1,264 | 207,687  | 20,700
v1-0007-Faster-qunique-comparator   |  5,911 | 1,065 | 203,744  |  3,943
v1-0008-Add-ASCII-fastpath          |  3,409 | 2,502 | 161,469  | 42,275

Attached is v2 of the patch set with the aforementioned changes. I've
also fixed the white space errors in 0003, 0004 and 0008, as reported by
Kirill.

Thanks, I pushed patch 0001 now, that's a simple and clear win.

- Heikki
From a0348e8dedb9a7ee53af8f88adbba0e5aeff577d Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Fri, 9 Jan 2026 19:31:19 +0200
Subject: [PATCH 1/1] Make the deduplication in ginExtractEntries() a little
 faster

The idea is to remove NULLs from the array first, and use qsort to
deduplicate only the non-NULL items. That simplifies the comparison
function a little.
---
 src/backend/access/gin/ginutil.c | 139 +++++++++++++------------------
 src/include/access/gin_private.h |   2 +-
 2 files changed, 61 insertions(+), 80 deletions(-)

diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index d205093e21d..29966b9d05b 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -387,19 +387,6 @@ GinInitMetabuffer(Buffer b)
 		((char *) metadata + sizeof(GinMetaPageData)) - (char *) page;
 }
 
-/*
- * Support for sorting key datums in ginExtractEntries
- *
- * Note: we only have to worry about null and not-null keys here;
- * ginExtractEntries never generates more than one placeholder null,
- * so it doesn't have to sort those.
- */
-typedef struct
-{
-	Datum		datum;
-	bool		isnull;
-} keyEntryData;
-
 typedef struct
 {
 	FmgrInfo   *cmpDatumFunc;
@@ -410,24 +397,14 @@ typedef struct
 static int
 cmpEntries(const void *a, const void *b, void *arg)
 {
-	const keyEntryData *aa = (const keyEntryData *) a;
-	const keyEntryData *bb = (const keyEntryData *) b;
+	const Datum *aa = (const Datum *) a;
+	const Datum *bb = (const Datum *) b;
 	cmpEntriesArg *data = (cmpEntriesArg *) arg;
 	int			res;
 
-	if (aa->isnull)
-	{
-		if (bb->isnull)
-			res = 0;			/* NULL "=" NULL */
-		else
-			res = 1;			/* NULL ">" not-NULL */
-	}
-	else if (bb->isnull)
-		res = -1;				/* not-NULL "<" NULL */
-	else
-		res = DatumGetInt32(FunctionCall2Coll(data->cmpDatumFunc,
-											  data->collation,
-											  aa->datum, bb->datum));
+	res = DatumGetInt32(FunctionCall2Coll(data->cmpDatumFunc,
+										  data->collation,
+										  *aa, *bb));
 
 	/*
 	 * Detect if we have any duplicates.  If there are equal keys, qsort must
@@ -450,11 +427,13 @@ cmpEntries(const void *a, const void *b, void *arg)
 Datum *
 ginExtractEntries(GinState *ginstate, OffsetNumber attnum,
 				  Datum value, bool isNull,
-				  int32 *nentries, GinNullCategory **categories)
+				  int32 *nentries_p, GinNullCategory **categories_p)
 {
 	Datum	   *entries;
 	bool	   *nullFlags;
-	int32		i;
+	GinNullCategory *categories;
+	bool		hasNull;
+	int32		nentries;
 
 	/*
 	 * We don't call the extractValueFn on a null item.  Instead generate a
@@ -462,42 +441,60 @@ ginExtractEntries(GinState *ginstate, OffsetNumber attnum,
 	 */
 	if (isNull)
 	{
-		*nentries = 1;
+		*nentries_p = 1;
 		entries = palloc_object(Datum);
 		entries[0] = (Datum) 0;
-		*categories = palloc_object(GinNullCategory);
-		(*categories)[0] = GIN_CAT_NULL_ITEM;
+		*categories_p = palloc_object(GinNullCategory);
+		(*categories_p)[0] = GIN_CAT_NULL_ITEM;
 		return entries;
 	}
 
 	/* OK, call the opclass's extractValueFn */
 	nullFlags = NULL;			/* in case extractValue doesn't set it */
+	nentries = 0;
 	entries = (Datum *)
 		DatumGetPointer(FunctionCall3Coll(&ginstate->extractValueFn[attnum - 1],
 										  ginstate->supportCollation[attnum - 1],
 										  value,
-										  PointerGetDatum(nentries),
+										  PointerGetDatum(&nentries),
 										  PointerGetDatum(&nullFlags)));
 
 	/*
 	 * Generate a placeholder if the item contained no keys.
 	 */
-	if (entries == NULL || *nentries <= 0)
+	if (entries == NULL || nentries <= 0)
 	{
-		*nentries = 1;
+		*nentries_p = 1;
 		entries = palloc_object(Datum);
 		entries[0] = (Datum) 0;
-		*categories = palloc_object(GinNullCategory);
-		(*categories)[0] = GIN_CAT_EMPTY_ITEM;
+		*categories_p = palloc_object(GinNullCategory);
+		(*categories_p)[0] = GIN_CAT_EMPTY_ITEM;
 		return entries;
 	}
 
 	/*
-	 * If the extractValueFn didn't create a nullFlags array, create one,
-	 * assuming that everything's non-null.
+	 * Scan the items for any NULLs.  All NULLs are considered equal, so we
+	 * just need to check and remember if there are any.  We remove them from
+	 * the array here, and if necessary, put back one NULL entry to represent
+	 * them all after deduplication.
 	 */
-	if (nullFlags == NULL)
-		nullFlags = (bool *) palloc0(*nentries * sizeof(bool));
+	hasNull = false;
+	if (nullFlags)
+	{
+		int32		numNonNulls = 0;
+
+		for (int32 i = 0; i < nentries; i++)
+		{
+			if (nullFlags[i])
+				hasNull = true;
+			else
+			{
+				entries[numNonNulls] = entries[i];
+				numNonNulls++;
+			}
+		}
+		nentries = numNonNulls;
+	}
 
 	/*
 	 * If there's more than one key, sort and unique-ify.
@@ -506,63 +503,47 @@ ginExtractEntries(GinState *ginstate, OffsetNumber attnum,
 	 * pretty bad too.  For small numbers of keys it'd likely be better to use
 	 * a simple insertion sort.
 	 */
-	if (*nentries > 1)
+	if (nentries > 1)
 	{
-		keyEntryData *keydata;
 		cmpEntriesArg arg;
 
-		keydata = palloc_array(keyEntryData, *nentries);
-		for (i = 0; i < *nentries; i++)
-		{
-			keydata[i].datum = entries[i];
-			keydata[i].isnull = nullFlags[i];
-		}
-
 		arg.cmpDatumFunc = &ginstate->compareFn[attnum - 1];
 		arg.collation = ginstate->supportCollation[attnum - 1];
 		arg.haveDups = false;
-		qsort_arg(keydata, *nentries, sizeof(keyEntryData),
+		qsort_arg(entries, nentries, sizeof(Datum),
 				  cmpEntries, &arg);
 
 		if (arg.haveDups)
 		{
 			/* there are duplicates, must get rid of 'em */
-			int32		j;
+			int32		j = 1;
 
-			entries[0] = keydata[0].datum;
-			nullFlags[0] = keydata[0].isnull;
-			j = 1;
-			for (i = 1; i < *nentries; i++)
+			for (int32 i = 1; i < nentries; i++)
 			{
-				if (cmpEntries(&keydata[i - 1], &keydata[i], &arg) != 0)
-				{
-					entries[j] = keydata[i].datum;
-					nullFlags[j] = keydata[i].isnull;
-					j++;
-				}
+				if (cmpEntries(&entries[i - 1], &entries[i], &arg) != 0)
+					entries[j++] = entries[i];
 			}
-			*nentries = j;
+			nentries = j;
 		}
-		else
-		{
-			/* easy, no duplicates */
-			for (i = 0; i < *nentries; i++)
-			{
-				entries[i] = keydata[i].datum;
-				nullFlags[i] = keydata[i].isnull;
-			}
-		}
-
-		pfree(keydata);
 	}
 
 	/*
-	 * Create GinNullCategory representation from nullFlags.
+	 * Create GinNullCategory representation.
 	 */
-	*categories = (GinNullCategory *) palloc0(*nentries * sizeof(GinNullCategory));
-	for (i = 0; i < *nentries; i++)
-		(*categories)[i] = (nullFlags[i] ? GIN_CAT_NULL_KEY : GIN_CAT_NORM_KEY);
+	categories = (GinNullCategory *) palloc((nentries + (hasNull ? 1 : 0)) * sizeof(GinNullCategory));
+	for (int32 i = 0; i < nentries; i++)
+		categories[i] = GIN_CAT_NORM_KEY;
+
+	/* Put back a NULL entry, if there were any */
+	if (hasNull)
+	{
+		entries[nentries] = (Datum) 0;
+		categories[nentries] = GIN_CAT_NULL_KEY;
+		nentries++;
+	}
 
+	*nentries_p = nentries;
+	*categories_p = categories;
 	return entries;
 }
 
diff --git a/src/include/access/gin_private.h b/src/include/access/gin_private.h
index e155045ce8a..1e97e6cb3c9 100644
--- a/src/include/access/gin_private.h
+++ b/src/include/access/gin_private.h
@@ -99,7 +99,7 @@ extern void GinInitPage(Page page, uint32 f, Size pageSize);
 extern void GinInitMetabuffer(Buffer b);
 extern Datum *ginExtractEntries(GinState *ginstate, OffsetNumber attnum,
 								Datum value, bool isNull,
-								int32 *nentries, GinNullCategory **categories);
+								int32 *nentries_p, GinNullCategory **categories_p);
 
 extern OffsetNumber gintuple_get_attrnum(GinState *ginstate, IndexTuple tuple);
 extern Datum gintuple_get_key(GinState *ginstate, IndexTuple tuple,
-- 
2.47.3

Attachment: test-gin-array.sql
Description: application/sql

Reply via email to