Re: general purpose array_sort

Junwang Zhao Mon, 31 Mar 2025 04:02:54 -0700

Hi Tom,

On Mon, Mar 31, 2025 at 5:58 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
>
> I spent some time looking at the v17 patchset.  There were some pretty
> strange things in it --- why were some of the variants of array_sort()
> marked as volatile, for example?


I think this was due to some copy-paste of the code nearby.

> But the two things I'd like to
> suggest functionality-wise are:
>
> * The second argument of the variants with booleans should be defined
> as true=descending, not true=ascending.  It seems a little odd to me
> for the default of a boolean option not to be "false".  Also, then
> you don't need an inversion between the second and third arguments.
> I'm not dead set on this but it just seems a little cleaner.
>
Agreed.

> * I see that the code is set up to detect an unsortable input type
> before it takes the fast exit for "no sort required".  I think this
> is poor engineering: we ought to make the fast path as fast as
> possible.  The can't-sort case is so rare in real-world usage that
> I do not think it matters if the error isn't thrown by every possible
> call.  Besides which, it is inconsistent anyway: consider
>         SELECT array_sort(NULL::xid[]);
> which will not error because it will never reach the C code.  Why's
> that okay but delivering an answer for "array_sort('{1}'::xid[])"
> is not?  I think "throw error only if we must sort and cannot" is
> a perfectly fine definition.
Agreed.

>
> At the code level, I didn't like the way that the multiple entry
> points were set up.  I think it's generally cleaner code to have
> a worker function with plain C call and return coding and make
> all the SQL-visible functions be wrappers around that.  Also the
> caching mechanism was overcomplicated, in particular because we
> do not need a cache lookup to know which sort operators apply to
> arrays.

Agreed, your refactor made the code cleaner.

>
> So all that leads me to v18 attached.  (I merged the two patches
> into one, didn't see much value in splitting them.)
>
> In v18, it's somewhat annoying that the typcache doesn't cache
> the typarray field; we would not need a separate get_array_type()
> lookup if it did.  I doubt there is any real reason for that except
> that pg_type.typarray didn't exist when the typcache was invented.
> So I'm tempted to add it.  But I looked at existing callers of
> get_array_type() and none of them are adjacent to typcache lookups,
> so only array_sort would be helped immediately.  I left it alone
> for the moment; wonder if anyone else has an opinion?

The need for `elmtyp` and `array_type` here because a column can
have arrays with varying dimensions. Maybe other callers don't share
this behavior?


>
>                         regards, tom lane
>


-- 
Regards
Junwang Zhao

Re: general purpose array_sort

Reply via email to