> On 2 Dec 2024, at 08:39, John Naylor <johncnaylo...@gmail.com> wrote:
> 
> On Mon, Dec 2, 2024 at 1:12 AM Andrey M. Borodin <x4...@yandex-team.ru> wrote:
>> 
>>> On 25 Nov 2024, at 17:50, John Naylor <johncnaylo...@gmail.com> wrote:
>>> 
>>> I'd like to see the two sort specializations combined
>>> into one, using a local comparator that knows when to reverse the
>>> comparison result (hope that makes sense).
>> 
>> Sure, please find attached.
>> The prototype looks somewhat ugly (we pass bool* ascending instead of bool 
>> ascending) and it cost us ~2% of performance (on my MacBook Air M3).
> 
> I haven't tried to reproduce, but the comparison function has a
> different style for DESC than the tuplesort comparators, and the style
> here has worse code density, at least in isolation. You can see the
> difference in the link below. I also found a way to make the cmp
> reversal branch-free (last example). That may not survive once it's
> inlined, of course, or could make things worse, but you can try these
> if you like:
> 
> https://godbolt.org/z/nfPMT7Enr

On my machine this test
\timing on
SELECT (sort(arr))[1] FROM arrays_to_sort;\watch 0 c=5

produces
sort_int32_cmp Time: 543.690 ms
sort_int32_cmp_2 Time: 609.019 ms
sort_int32_cmp_4 Time: 612.219 ms

So, I'd stick with sort_int32_cmp. But, perhaps, on Intel we might have 
different results.

> 
>> But is not more generic.
> 
> A lot of the churn is from v1, and made worse by v2, and that seems to
> be from getting rid of the QSORT macro:
> 
> @@ -227,9 +228,10 @@ Datum
> sort_asc(PG_FUNCTION_ARGS)
> {
>  ArrayType  *a = PG_GETARG_ARRAYTYPE_P_COPY(0);
> + bool ascending = true;
> 
>  CHECKARRVALID(a);
> - QSORT(a, 1);
> + sort_int32(ARRPTR(a), ARRNELEMS(a), &ascending);
>  PG_RETURN_POINTER(a);
> }
> 
> The macro hides some details -- can we put "ascending" inside there?
Done.


Best regards, Andrey Borodin.

Attachment: v3-0001-Use-specialized-sort-facilities.patch
Description: Binary data

Reply via email to