On 28 March 2018 at 03:58, Tom Lane <t...@sss.pgh.pa.us> wrote:
> David Rowley <david.row...@2ndquadrant.com> writes:
>> On 27 March 2018 at 13:26, Alvaro Herrera <alvhe...@alvh.no-ip.org> wrote:
>>> synchronized_seqscans is another piece of precedent in the area, FWIW.
>
>> This is true. I guess the order of aggregation could be made more
>> certain if we remove the cost based optimiser completely, and just
>> rely on a syntax based optimiser.
>
> None of this is responding to my point. I think the number of people
> who actually don't care about aggregation order for these aggregates
> is negligible, and none of you have argued against that; you've instead
> selected straw men to attack.

I think what everyone else is getting at is that we're not offering to improve performance for the people who care about the order in which the values are aggregated. This patch only offers a possible performance benefit to those who don't. I mentioned and linked to a thread about someone from PostGIS asking for this, and PostGIS, as far as I understand, has quite a large user base. You've either ignored that or think the number of people using PostGIS is negligible.

As far as I understand your argument, it's about the possibility that a group of people exists who rely on the aggregation order being defined without an ORDER BY in the aggregate function. Unfortunately, it appears from the responses of many of the others who voiced an opinion that there is no shortage of other reasons why relying on values being aggregated in a defined order, without an ORDER BY in the aggregate function arguments, is a dangerous assumption to make. Several reasons were listed why this is undefined, and I mentioned the great lengths we'd need to go to in order to make the order more defined without an explicit ORDER BY, and I still imagine I'd have missed some of the reasons. I imagine the number of people who rely on the order being defined without an ORDER BY is diminishing each release as we add parallelism support for more node types.

Both Andres and I agree that it's a shame to block useful optimisations due to the needs of a small, diminishing group of people who are not very good at reading our carefully crafted documentation... for the past 8 years [1]. I imagine this small group of people, if they do exist, must slowly be waking up to the fact that we've been devising new ways to ruin their query results for many years now. It seems pretty strange for us to call a truce now after we've been wreaking havoc on this group for so many releases...

I really do hope this part is not true, but if such a person appeared in -novice or -general asking for help, we'd be telling them to add an ORDER BY, and we'd be quoting the 8-year-old line in the documents which states that we make no guarantees in this area in the absence of an ORDER BY. If they truly have no other choice then we might consider suggesting they may get what they want if they disable parallel query, and we'd probably rhyme off a few other reasons why it might suddenly break on them again.

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=blobdiff;f=doc/src/sgml/func.sgml;h=7d6125c97e5203c9d092ceec3aaf351c1a5fcf1b;hp=f2906cc82230150f72353609e9c831e90dcc10ca;hb=34d26872ed816b299eef2fa4240d55316697f42d;hpb=6a6efb964092902bf53965649c3ed78b1868b37e

--
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services