The standard only defines an ORDER BY clause inside of an aggregate for
ARRAY_AGG(). As an extension to the standard, we allow it for all
aggregates, which is very convenient for non-standard things like
string_agg().
However, it is completely useless for things like AVG() or SUM(). If
you include it, the aggregate will do the sort even though it is neither
required nor desired.
I am proposing something like pg_aggregate.aggordering which would be an
enum of behaviors such as f=Forbidden, a=Allowed, r=Required. Currently
all aggregates would have 'a' but I am thinking that a lot of them could
be switched to 'f'. In that case, if a user supplies an ordering, an
error is raised.
My main motivation behind this is to be able to optimize aggregates that
could stop early such as ANY_VALUE(), but also to self-optimize queries
written in error (or ignorance).
There is recurring demand for a first_agg() of some sort, and that one
(whether implemented in core or left to extensions) would use 'r' so
that an error is raised if the user does not supply an ordering.
I have not started working on this because I envision quite a lot of
bikeshedding, but this is the approach I am aiming for.
Thoughts?
--
Vik Fearing
- Ordering behavior for aggregates Vik Fearing
-