Re: [HACKERS] The Future of Aggregation

2015-06-14 Thread David Rowley
On 12 June 2015 at 23:57, David Rowley wrote: > On 11 June 2015 at 01:39, Kevin Grittner wrote: > >> >> One question that arose in my mind running this was whether we might >> be able to combine sum(x) with count(*) if x was NOT NULL, even >> though the arguments don't match. It might not be worth

Re: [HACKERS] The Future of Aggregation

2015-06-12 Thread Kevin Grittner
David Rowley wrote: >> I am a little curious what sort of machine you're running on, >> because my i7 is much slower. I ran a few other tests with your >> table for perspective. > > Assert enabled build? Mystery solved. Too often I forget to reconfigure with optimization and without cassert f

Re: [HACKERS] The Future of Aggregation

2015-06-12 Thread David Rowley
On 11 June 2015 at 01:39, Kevin Grittner wrote:
> David Rowley wrote:
> >
> > /* setup */ create table millionrowtable as select
> > generate_series(1,1000000)::numeric as x;
> > /* test 1 */ SELECT sum(x) / count(x) from millionrowtable;
> > /* test 2 */ SELECT avg(x) from millionrowtable;
> >

Re: [HACKERS] The Future of Aggregation

2015-06-11 Thread Tom Lane
Robert Haas writes: > On Tue, Jun 9, 2015 at 11:00 AM, Alvaro Herrera > wrote: >> Uh, this also requires serialization and deserialization of non- >> finalized transition state, no? > A bunch of this stuff does, but I recently had a Brilliant Insight: we > don't need to add a new method for seri

Re: [HACKERS] The Future of Aggregation

2015-06-11 Thread Robert Haas
On Tue, Jun 9, 2015 at 11:00 AM, Alvaro Herrera wrote: > Uh, this also requires serialization and deserialization of non- > finalized transition state, no? A bunch of this stuff does, but I recently had a Brilliant Insight: we don't need to add a new method for serializing and deserializing trans

Re: [HACKERS] The Future of Aggregation

2015-06-10 Thread Kevin Grittner
David Rowley wrote: > On 10 June 2015 at 02:52, Kevin Grittner wrote: >> David Rowley wrote: >>> The idea I discussed in the link in item 5 above gets around this >>> problem, but it's a perhaps more surprise filled implementation >>> as it will mean "select avg(x),sum(x),count(x) from t" is >>>

Re: [HACKERS] The Future of Aggregation

2015-06-09 Thread Jim Nasby
On 6/9/15 9:52 AM, Kevin Grittner wrote: Yeah, I think we want to preserve the ability of count() to have a simple state, and implement dependent aggregates as discussed in the other thread -- where (as I understood it) having sum(x), count(x), and avg(x) in a query would avoid the row-by-row wor

Re: [HACKERS] The Future of Aggregation

2015-06-09 Thread Alvaro Herrera
David Rowley wrote: > On 10 June 2015 at 03:25, Alvaro Herrera wrote: > > > Kevin Grittner wrote: > > > Alvaro Herrera wrote: > > > > > > Uh, this also requires serialization and deserialization of non- > > > > finalized transition state, no? > > > > > > For that sort of optimization to incremen

Re: [HACKERS] The Future of Aggregation

2015-06-09 Thread David Rowley
On 10 June 2015 at 03:25, Alvaro Herrera wrote: > Kevin Grittner wrote: > > Alvaro Herrera wrote: > > > > Uh, this also requires serialization and deserialization of non- > > > finalized transition state, no? > > > > For that sort of optimization to incremental maintenance of > > materialized vi

Re: [HACKERS] The Future of Aggregation

2015-06-09 Thread David Rowley
On 10 June 2015 at 02:52, Kevin Grittner wrote: > David Rowley wrote: > The idea I discussed in the link in item 5 above gets around this > > problem, but it's a perhaps more surprise filled implementation > > as it will mean "select avg(x),sum(x),count(x) from t" is > > actually faster than "s

Re: [HACKERS] The Future of Aggregation

2015-06-09 Thread Tomas Vondra
On 06/09/15 17:27, Andres Freund wrote: On 2015-06-09 17:19:33 +0200, Tomas Vondra wrote: ... and yet another use case for 'aggregate state combine' that I just remembered about is grouping sets. What GROUPING SET (ROLLUP, ...) do currently is repeatedly sorting the input, once for each groupi

Re: [HACKERS] The Future of Aggregation

2015-06-09 Thread Andres Freund
On 2015-06-09 17:19:33 +0200, Tomas Vondra wrote: > ... and yet another use case for 'aggregate state combine' that I just > remembered about is grouping sets. What GROUPING SET (ROLLUP, ...) do > currently is repeatedly sorting the input, once for each grouping. Actually, that's not really what h

Re: [HACKERS] The Future of Aggregation

2015-06-09 Thread Alvaro Herrera
Kevin Grittner wrote: > Alvaro Herrera wrote: > > Uh, this also requires serialization and deserialization of non- > > finalized transition state, no? > > For that sort of optimization to incremental maintenance of > materialized views (when we get there), yes. That will be one of > many issues

Re: [HACKERS] The Future of Aggregation

2015-06-09 Thread Tomas Vondra
On 06/09/15 16:10, Tomas Vondra wrote: Hi, On 06/09/15 12:58, David Rowley wrote: ... Items 1-4 above I believe require support of "Aggregate State Combine Support" -> https://commitfest.postgresql.org/5/131/ which I believe will need to be modified to implement complex database types to b

Re: [HACKERS] The Future of Aggregation

2015-06-09 Thread Kevin Grittner
Alvaro Herrera wrote: > Kevin Grittner wrote: >> David Rowley wrote: >>> 5. Dependent Aggregates >>> >>> Item 5 makes items 1-4 a bit more complex as with this item >>> there's opportunity for very good performance improvements by >>> allowing aggregates like AVG(x) to also perform all the required

Re: [HACKERS] The Future of Aggregation

2015-06-09 Thread Alvaro Herrera
Kevin Grittner wrote: > David Rowley wrote: > > 5. Dependent Aggregates > > > Item 5 makes items 1-4 a bit more complex as with this item > > there's opportunity for very good performance improvements by > > allowing aggregates like AVG(x) to also perform all the required > > work to allow SUM(x) a

Re: [HACKERS] The Future of Aggregation

2015-06-09 Thread Kevin Grittner
David Rowley wrote: > It appears to me that there's quite a few new features and > optimisations on the not too distant horizon which will require > adding yet more fields into pg_aggregate. > > These are things along the lines of: > 3. Auto-updating Materialized views (ones which contain aggreg

Re: [HACKERS] The Future of Aggregation

2015-06-09 Thread Tomas Vondra
Hi, On 06/09/15 12:58, David Rowley wrote: These are things along the lines of: 1. Parallel Aggregation (computes each aggregate state in parallel worker processes and then merges these states in serial mode) 2. Aggregate push-down / Aggregate before join (requires passing partially computed a

[HACKERS] The Future of Aggregation

2015-06-09 Thread David Rowley
It appears to me that there's quite a few new features and optimisations on the not too distant horizon which will require adding yet more fields into pg_aggregate. These are things along the lines of: 1. Parallel Aggregation (computes each aggregate state in parallel worker processes and then me
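
The parallel aggregation idea in item 1 of the opening post (each worker computes a partial aggregate state, and the states are then merged serially) can be sketched roughly as follows. This is only an illustration in Python, not PostgreSQL code: the names `transition`, `combine`, `finalize` and `parallel_avg` are hypothetical, the (sum, count) state for avg(x) is an assumption made for the example, and the "workers" are simulated by slicing a list.

```python
from fractions import Fraction
from typing import Iterable, List, Optional, Tuple

# Hypothetical transition state for avg(x): (running sum, row count).
State = Tuple[Fraction, int]


def init_state() -> State:
    return (Fraction(0), 0)


def transition(state: State, x) -> State:
    """Fold one input value into a partial state (the per-row work)."""
    total, count = state
    return (total + Fraction(x), count + 1)


def combine(a: State, b: State) -> State:
    """Merge two partial states; this stands in for the 'state combine' step."""
    return (a[0] + b[0], a[1] + b[1])


def finalize(state: State) -> Optional[Fraction]:
    """Produce avg from the merged state; None for empty input, like SQL NULL."""
    total, count = state
    return total / count if count else None


def partial_aggregate(chunk: Iterable) -> State:
    """What a single simulated worker does with its share of the rows."""
    state = init_state()
    for x in chunk:
        state = transition(state, x)
    return state


def parallel_avg(values: List, workers: int) -> Optional[Fraction]:
    # Each slice stands in for the rows seen by one worker process.
    chunks = [values[i::workers] for i in range(workers)]
    merged = init_state()
    # Serial merge of the per-worker partial states.
    for partial in map(partial_aggregate, chunks):
        merged = combine(merged, partial)
    return finalize(merged)
```

Under these assumptions the merged state already holds both the sum and the count, which is also the intuition behind answering sum(x), count(x) and avg(x) from a single pass, as discussed in the replies.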