The standard defines an ORDER BY clause inside an aggregate only for ARRAY_AGG(). As an extension to the standard, we allow it for all aggregates, which is very convenient for non-standard things like string_agg().

However, it is completely useless for things like AVG() or SUM(). If you include it, the aggregate will still perform the sort even though it is neither required nor desired.

I am proposing something like pg_aggregate.aggordering, which would be an enum of behaviors such as f=Forbidden, a=Allowed, r=Required. Initially all aggregates would have 'a', but I am thinking that a lot of them could be switched to 'f'. For an aggregate marked 'f', an error would be raised if a user supplies an ordering.
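
To make the three behaviors concrete, here is a rough model of the check in Python. It is only a sketch of the semantics described above, not PostgreSQL code: the function name, the exception class, the error wording, and the sample assignments of 'f', 'a', and 'r' to particular aggregates are all assumptions made for illustration.

```python
# Toy model of the proposed pg_aggregate.aggordering check.
# Not PostgreSQL code: every name below is hypothetical.

FORBIDDEN, ALLOWED, REQUIRED = "f", "a", "r"


class AggOrderingError(Exception):
    """Raised when an aggregate call violates its aggordering setting."""


def check_agg_ordering(aggname, aggordering, has_order_by):
    """Validate one aggregate call against its aggordering value.

    Returns None when the call is acceptable, raises otherwise.
    """
    if aggordering not in (FORBIDDEN, ALLOWED, REQUIRED):
        raise ValueError(f"unknown aggordering value: {aggordering!r}")
    if aggordering == FORBIDDEN and has_order_by:
        raise AggOrderingError(f"aggregate {aggname} does not accept ORDER BY")
    if aggordering == REQUIRED and not has_order_by:
        raise AggOrderingError(f"aggregate {aggname} requires ORDER BY")


# Assumed settings if the proposal were adopted; today everything
# would effectively be ALLOWED.
SAMPLE_AGGORDERING = {
    "sum": FORBIDDEN,
    "string_agg": ALLOWED,
    "first_agg": REQUIRED,
}
```

In this model, check_agg_ordering("sum", "f", True) raises an error, while the same call with 'a' passes silently, which corresponds to the current behavior where the sort is simply performed.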

My main motivation behind this is to be able to optimize aggregates that could stop early such as ANY_VALUE(), but also to self-optimize queries written in error (or ignorance).

There is recurring demand for a first_agg() of some sort, and that one (whether implemented in core or left to extensions) would use 'r' so that an error is raised if the user does not supply an ordering.

I have not started working on this because I envision quite a lot of bikeshedding, but this is the approach I am aiming for.

Thoughts?
--
Vik Fearing

