On 09/01/2026 14:06, David Geier wrote:
On 06.01.2026 18:00, Heikki Linnakangas wrote:On 05/01/2026 17:01, David Geier wrote:- v1-0002-Optimized-comparison-functions.patch: Use FunctionCallInvoke() instead of FunctionCall2Coll(). This saves a bunch of per-comparison setup code, such as calling InitFunctionCallInfoData().You lose the check for NULL result with this. That's probably still worth checking.It seems like existing code where all args are not null, has that safety check. Added it for consistency.- v1-0004-Avoid-dedup-and-sort-in-ginExtractEntries.patch ginExtractEntries() deduplicates and sorts the entries returned from the extract value function. In case of pg_trgm, that is completely redundant because the trigrams are already deduplicated and sorted. The current version of this patch is just to demonstrate the potential. We need to think about what we want here. Ideally, we would require the extraction function to provide the entries deduplicated and sorted. Alternatively, we could indicate to ginExtractEntries() if the entries are already deduplicated and sorted. If we don't want to alter the signature of the extract value function, we could e.g. use the MSB of the nentries argument.Yeah, this seems wrong as it is. You're assuming that if the extract function returns nullFlags == NULL, the array is already sorted and deduped.As said, that was just for demonstration purposes of the possible gains. I've changed the code now such that the extractValue function of the GIN index can indicate via the third argument uniqueAndSorted, if the returned keys are already unique and sorted. Unfortunately, it seems like this patch regresses performance. See measurements below. I haven't had the time to investigate why that is. It's pretty counter intuitive, given that this patch effectively only removes code. Maybe you could re-test patch 0004 and share your runtimes?
Looking at 0001 and 0004 patches and ginExtractEntries(), the way ginExtractEntries() handles NULLs looks a little inefficient. It treats NULLs as proper entries, passing them through qsort() for deduplication and comparing them with cmpEntries(). But the end result is always that if the opclass's extractValueFn() function returned any NULLs, then there's exctly one GIN_CAT_NULL_KEY as the last entry of the result array. Surely we could be smarter about how we accomplish that. Your 0004 patch eliminates the deduplication overhead altogether, which is great of course, but the point remains for when we still need the deduplication.
Attached is an attempt at that. It avoids the null-checks in cmpEntries(), saving a few cycles. That's drowned in noise with your test cases, but with the attached test case with int arrays, I'm seeing a 1-2 % gain. That's not much, but I think it's still worth doing because it makes the code a little simpler too, IMHO. (I didn't test it together with the rest of your patches.)
Perhaps we should introduce a new qunique_eq() function with a different callback signature: /* like qunique(), but the callback function returns true/false rather than int */ static inline size_t qunique_eq(void *array, size_t elements, size_t width, bool (*equal) (const void *, const void *))I would prefer to change qunique() instead. That would enforce using an adequate comparison function from the get go. There are only ~15 calls to qunique(). So refactoring this should also be a fairly small patch. I can do that if there's agreement for that approach.
Works for me.At quick glance, most if not all of the qunique() calls call qsort() just before qunique(). I wonder if we should have a single "sort and deduplicate" function, instead. It could perhaps do some deduplication "on the go", or other optimizations.
Did you measure how big is the impact from each individual patch? Patches 1 and 2 seem pretty much ready to be committed, but I wonder if they make any difference on their own.Here is the impact of each patch. I ran again CREATE INDEX three times and took the fastest run. The run of each patch includes all previous patches as well. For example, the timings for patch 0003 were measured with a binary that also had patch 0002 and 0001 applied. To get the impact of each patch in isolation, the delta to the previous run was taken. Code | movies |delta | lineitem | delta ------------------------------------|--------|-------|------------------ master | 10,311 | 0 | 256,986 | 0 v1-0001-Inline-ginCompareAttEntries | 9,694 | 617 | 239,778 | 17,208 v1-0002-Optimized-comparison-func | 9,510 | 184 | 238,094 | 1,684 v1-0003-Use-sort_template.h | 8,661 | 849 | 231,190 | 6,904 v1-0004-Avoid-dedup-and-sort-in | 9,305 | -644 | 232,472 | -1,282 v1-0005-Make-btint4cmp-branchless | 8,240 | 1,065 | 228,387 | 4,085 v1-0006-Use-radix-sort | 6,976 | 1,264 | 207,687 | 20,700 v1-0007-Faster-qunique-comparator | 5,911 | 1,065 | 203,744 | 3,943 v1-0008-Add-ASCII-fastpath | 3,409 | 2,502 | 161,469 | 42,275 Attached is v2 of the patch set with the aforementioned changes. I've also fixed the white space errors in 0003, 0004 and 0008, as reported by Kirill.
Thanks, I pushed patch 0001 now, that's a simple and clear win. - Heikki
From a0348e8dedb9a7ee53af8f88adbba0e5aeff577d Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas <[email protected]> Date: Fri, 9 Jan 2026 19:31:19 +0200 Subject: [PATCH 1/1] Make the deduplication in ginExtractEntries() a little faster The idea is to remove NULLs from the array first, and use qsort to deduplicate only the non-NULL items. That simplifies the comparison function a little. --- src/backend/access/gin/ginutil.c | 139 +++++++++++++------------------ src/include/access/gin_private.h | 2 +- 2 files changed, 61 insertions(+), 80 deletions(-) diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c index d205093e21d..29966b9d05b 100644 --- a/src/backend/access/gin/ginutil.c +++ b/src/backend/access/gin/ginutil.c @@ -387,19 +387,6 @@ GinInitMetabuffer(Buffer b) ((char *) metadata + sizeof(GinMetaPageData)) - (char *) page; } -/* - * Support for sorting key datums in ginExtractEntries - * - * Note: we only have to worry about null and not-null keys here; - * ginExtractEntries never generates more than one placeholder null, - * so it doesn't have to sort those. - */ -typedef struct -{ - Datum datum; - bool isnull; -} keyEntryData; - typedef struct { FmgrInfo *cmpDatumFunc; @@ -410,24 +397,14 @@ typedef struct static int cmpEntries(const void *a, const void *b, void *arg) { - const keyEntryData *aa = (const keyEntryData *) a; - const keyEntryData *bb = (const keyEntryData *) b; + const Datum *aa = (const Datum *) a; + const Datum *bb = (const Datum *) b; cmpEntriesArg *data = (cmpEntriesArg *) arg; int res; - if (aa->isnull) - { - if (bb->isnull) - res = 0; /* NULL "=" NULL */ - else - res = 1; /* NULL ">" not-NULL */ - } - else if (bb->isnull) - res = -1; /* not-NULL "<" NULL */ - else - res = DatumGetInt32(FunctionCall2Coll(data->cmpDatumFunc, - data->collation, - aa->datum, bb->datum)); + res = DatumGetInt32(FunctionCall2Coll(data->cmpDatumFunc, + data->collation, + *aa, *bb)); /* * Detect if we have any duplicates. If there are equal keys, qsort must @@ -450,11 +427,13 @@ cmpEntries(const void *a, const void *b, void *arg) Datum * ginExtractEntries(GinState *ginstate, OffsetNumber attnum, Datum value, bool isNull, - int32 *nentries, GinNullCategory **categories) + int32 *nentries_p, GinNullCategory **categories_p) { Datum *entries; bool *nullFlags; - int32 i; + GinNullCategory *categories; + bool hasNull; + int32 nentries; /* * We don't call the extractValueFn on a null item. Instead generate a @@ -462,42 +441,60 @@ ginExtractEntries(GinState *ginstate, OffsetNumber attnum, */ if (isNull) { - *nentries = 1; + *nentries_p = 1; entries = palloc_object(Datum); entries[0] = (Datum) 0; - *categories = palloc_object(GinNullCategory); - (*categories)[0] = GIN_CAT_NULL_ITEM; + *categories_p = palloc_object(GinNullCategory); + (*categories_p)[0] = GIN_CAT_NULL_ITEM; return entries; } /* OK, call the opclass's extractValueFn */ nullFlags = NULL; /* in case extractValue doesn't set it */ + nentries = 0; entries = (Datum *) DatumGetPointer(FunctionCall3Coll(&ginstate->extractValueFn[attnum - 1], ginstate->supportCollation[attnum - 1], value, - PointerGetDatum(nentries), + PointerGetDatum(&nentries), PointerGetDatum(&nullFlags))); /* * Generate a placeholder if the item contained no keys. */ - if (entries == NULL || *nentries <= 0) + if (entries == NULL || nentries <= 0) { - *nentries = 1; + *nentries_p = 1; entries = palloc_object(Datum); entries[0] = (Datum) 0; - *categories = palloc_object(GinNullCategory); - (*categories)[0] = GIN_CAT_EMPTY_ITEM; + *categories_p = palloc_object(GinNullCategory); + (*categories_p)[0] = GIN_CAT_EMPTY_ITEM; return entries; } /* - * If the extractValueFn didn't create a nullFlags array, create one, - * assuming that everything's non-null. + * Scan the items for any NULLs. All NULLs are considered equal, so we + * just need to check and remember if there are any. We remove them from + * the array here, and if necessary, put back one NULL entry to represent + * them all after deduplication. */ - if (nullFlags == NULL) - nullFlags = (bool *) palloc0(*nentries * sizeof(bool)); + hasNull = false; + if (nullFlags) + { + int32 numNonNulls = 0; + + for (int32 i = 0; i < nentries; i++) + { + if (nullFlags[i]) + hasNull = true; + else + { + entries[numNonNulls] = entries[i]; + numNonNulls++; + } + } + nentries = numNonNulls; + } /* * If there's more than one key, sort and unique-ify. @@ -506,63 +503,47 @@ ginExtractEntries(GinState *ginstate, OffsetNumber attnum, * pretty bad too. For small numbers of keys it'd likely be better to use * a simple insertion sort. */ - if (*nentries > 1) + if (nentries > 1) { - keyEntryData *keydata; cmpEntriesArg arg; - keydata = palloc_array(keyEntryData, *nentries); - for (i = 0; i < *nentries; i++) - { - keydata[i].datum = entries[i]; - keydata[i].isnull = nullFlags[i]; - } - arg.cmpDatumFunc = &ginstate->compareFn[attnum - 1]; arg.collation = ginstate->supportCollation[attnum - 1]; arg.haveDups = false; - qsort_arg(keydata, *nentries, sizeof(keyEntryData), + qsort_arg(entries, nentries, sizeof(Datum), cmpEntries, &arg); if (arg.haveDups) { /* there are duplicates, must get rid of 'em */ - int32 j; + int32 j = 1; - entries[0] = keydata[0].datum; - nullFlags[0] = keydata[0].isnull; - j = 1; - for (i = 1; i < *nentries; i++) + for (int32 i = 1; i < nentries; i++) { - if (cmpEntries(&keydata[i - 1], &keydata[i], &arg) != 0) - { - entries[j] = keydata[i].datum; - nullFlags[j] = keydata[i].isnull; - j++; - } + if (cmpEntries(&entries[i - 1], &entries[i], &arg) != 0) + entries[j++] = entries[i]; } - *nentries = j; + nentries = j; } - else - { - /* easy, no duplicates */ - for (i = 0; i < *nentries; i++) - { - entries[i] = keydata[i].datum; - nullFlags[i] = keydata[i].isnull; - } - } - - pfree(keydata); } /* - * Create GinNullCategory representation from nullFlags. + * Create GinNullCategory representation. */ - *categories = (GinNullCategory *) palloc0(*nentries * sizeof(GinNullCategory)); - for (i = 0; i < *nentries; i++) - (*categories)[i] = (nullFlags[i] ? GIN_CAT_NULL_KEY : GIN_CAT_NORM_KEY); + categories = (GinNullCategory *) palloc((nentries + (hasNull ? 1 : 0)) * sizeof(GinNullCategory)); + for (int32 i = 0; i < nentries; i++) + categories[i] = GIN_CAT_NORM_KEY; + + /* Put back a NULL entry, if there were any */ + if (hasNull) + { + entries[nentries] = (Datum) 0; + categories[nentries] = GIN_CAT_NULL_KEY; + nentries++; + } + *nentries_p = nentries; + *categories_p = categories; return entries; } diff --git a/src/include/access/gin_private.h b/src/include/access/gin_private.h index e155045ce8a..1e97e6cb3c9 100644 --- a/src/include/access/gin_private.h +++ b/src/include/access/gin_private.h @@ -99,7 +99,7 @@ extern void GinInitPage(Page page, uint32 f, Size pageSize); extern void GinInitMetabuffer(Buffer b); extern Datum *ginExtractEntries(GinState *ginstate, OffsetNumber attnum, Datum value, bool isNull, - int32 *nentries, GinNullCategory **categories); + int32 *nentries_p, GinNullCategory **categories_p); extern OffsetNumber gintuple_get_attrnum(GinState *ginstate, IndexTuple tuple); extern Datum gintuple_get_key(GinState *ginstate, IndexTuple tuple, -- 2.47.3
test-gin-array.sql
Description: application/sql
