Tomas Vondra <tomas.von...@2ndquadrant.com> writes:
> On 03/26/2018 10:27 PM, Tom Lane wrote:
>> I fear that what will happen, if we commit this, is that something like
>> 0.01% of the users of array_agg and string_agg will be pleased, another
>> maybe 20% will be unaffected because they wrote ORDER BY which prevents
>> parallel aggregation, and the remaining 80% will scream because we broke
>> their queries.  Telling them they should've written ORDER BY isn't going
>> to cut it, IMO, when the benefit of that breakage will accrue only to some
>> very tiny fraction of use-cases.
> Isn't the ordering unreliable *already*?

Not if the query is such that what gets chosen is, say, an indexscan or
mergejoin.  It might be theoretically unreliable and yet work fine for
a given application.

I might be too pessimistic about the fraction of users who are depending
on ordered input without having written anything that explicitly forces
that ... but I stand by the theory that it substantially exceeds the
fraction of users who could get any benefit.

Your own example of assuming that separate aggregates are computed
in the same order reinforces my point, I think.  In principle, anybody
who's doing that should write

    array_agg(e order by x),
    array_agg(f order by x),
    string_agg(g order by x)

because otherwise they shouldn't assume that; the manual certainly doesn't
promise it.  But nobody does that in production, because if they did
they'd get killed by the fact that the sorts are all done independently.
(We should improve that someday, but it hasn't been done yet.)  So I think
there are an awful lot of people out there who are assuming more than a
lawyerly reading of the manual would allow.  Their reaction to this will
be about like ours every time the GCC guys decide that some longstanding
behavior of C code isn't actually promised by the text of the C standard.

			regards, tom lane