Hello, == Proposal - GROUPING SETS ==
a grouping set feature allows multiple grouping clauses in one query. Result of grouping sets is union of results each groupby clause's result. create table t(a int, b int); insert into t values(10,20); insert into t values(30,40); select a, b from t group by grouping sets(a, b); is same as: select a, NULL from t group by a union all select NULL, b from t group by b; Note: all ungrouped vars are transformed to NULL Groupby clause should contains cube and rollup lists. These are transformed to grouping sets via transformed rules: create table t1(a int, b int, c int) group by rollup(a, b, c) -> group by grouping sets((a,b,c), (a,b), (a), ()) group by cube(a,b,c) -> group by grouping sets ((a,b,c),(a,b),(a,c), (a), (b,c), (b), (c), ()) Groupby clause or grouping sets should contains more sets. Result is multiplication of these sets: group by grouping sets(a), grouping sets(b, (b,c), ()) -> group by grouping sets((a,b), (a,b,c), (a)) When grouping sets are used, then we should to use grouping and grouping_id functions. Function grouping returns 1 when parameter is in current group set, else returns 0. Function grouping_id returns value as grouping_id(a,b,c) = to_dec(to_bin(grouping(a) || grouping(b) || grouping(c))) postgres=# select * from t; a | b | c ----+----+---- 10 | 20 | 30 10 | 20 | 30 (2 rows) postgres=# select a,b,c from t group by grouping sets(a,b,c); a | b | c ----+----+---- 10 | | | 20 | | | 30 (3 rows) postgres=# select a,b,c, grouping(a), grouping(b), grouping(c), grouping_id(a,b,c) from t group by grouping sets(a,b,c); a | b | c | grouping | grouping | grouping | grouping_id ----+----+----+----------+----------+----------+------------- 10 | | | 1 | 0 | 0 | 4 | 20 | | 0 | 1 | 0 | 2 | | 30 | 0 | 0 | 1 | 1 (3 rows) some real sample: create table report( inserted date, locality varchar, name varchar, c int); postgres=# copy report to stdout; 2008-10-10 Prague Milk 10 2008-10-11 Prague Milk 12 2008-10-10 Prague Rum 2 2008-10-11 Prague Rum 6 2008-10-10 Berlin Milk 8 2008-10-11 Berlin Milk 14 2008-10-10 Berlin Beer 20 2008-10-11 Berlin Beer 25 postgres=# select * from report; inserted | locality | name | c ------------+----------+------+---- 2008-10-10 | Prague | Milk | 10 2008-10-11 | Prague | Milk | 12 2008-10-10 | Prague | Rum | 2 2008-10-11 | Prague | Rum | 6 2008-10-10 | Berlin | Milk | 8 2008-10-11 | Berlin | Milk | 14 2008-10-10 | Berlin | Beer | 20 2008-10-11 | Berlin | Beer | 25 (8 rows) postgres=# select inserted, locality, name, sum(c) from report group by grouping sets(inserted, locality, name); inserted | locality | name | sum ------------+----------+------+----- 2008-10-10 | | | 40 2008-10-11 | | | 57 | Berlin | | 67 | Prague | | 30 | | Milk | 44 | | Rum | 8 | | Beer | 45 (7 rows) postgres=# select inserted, locality, name, sum(c) from report group by grouping sets(inserted, (locality, name)); inserted | locality | name | sum ------------+----------+------+----- 2008-10-10 | | | 40 2008-10-11 | | | 57 | Prague | Milk | 22 | Berlin | Milk | 22 | Berlin | Beer | 45 | Prague | Rum | 8 (6 rows) postgres=# select inserted, locality, name, sum(c) from report group by name, grouping sets(inserted, locality); inserted | locality | name | sum ------------+----------+------+----- 2008-10-11 | | Rum | 6 2008-10-10 | | Rum | 2 2008-10-10 | | Beer | 20 2008-10-11 | | Beer | 25 2008-10-10 | | Milk | 18 2008-10-11 | | Milk | 26 | Prague | Rum | 8 | Berlin | Beer | 45 | Berlin | Milk | 22 | Prague | Milk | 22 (10 rows) == Implementation == Grouping sets introduce a new concept into SQL. One readed tuple is multiple used. It's similar with WITH clause. It's little bit dificult implement it for current PostgreSQL's executor. In my prototype I used aux node Feeder. This node should to hold only one tuple. I add new method for Agg node, that process only one input tuple: for (;;) { tuple = feeder->execute(grouping_sets->lefttree); foreach(l, grouping_sets->subplans) { lfirst(l)->process(tuple); } if (is_null(tuple)) break; } It supports only HASH Agg nodes. For non hash agg should be used other method (currently it isn't supported). postgres=# explain verbose select inserted, locality, name, sum(c) from report group by name,grouping sets(inserted, locality); QUERY PLAN ----------------------------------------------------------------- Grouping Sets (cost=12.00..35.00 rows=400 width=68) Output: NULL::date, locality, name, sum(c) -> Seq Scan on report (cost=0.00..18.00 rows=800 width=68) Output: inserted, locality, name, c -> HashAggregate (cost=6.00..8.50 rows=200 width=68) Output: inserted, NULL::character varying, name, sum(c) -> Feeder (cost=0.00..0.00 rows=800 width=68) Output: inserted, locality, name, c -> HashAggregate (cost=6.00..8.50 rows=200 width=68) Output: NULL::date, locality, name, sum(c) -> Feeder (cost=0.00..0.00 rows=800 width=68) Output: inserted, locality, name, c (12 rows) After work on prototype I don't see any problems in executor or parser. I expect some dificulties in planner. With grouping sets grouping_planner procedure will be much more complex: 1. targetlist and groupclause will be list of list, 2. we should to repeat estimation of NumGroups for each group set (it's should be shared with CTE feature??). == Parser problems == 1. The identifier cube is used in contrib cube. Solution: CUBE '(' ... ')' generates funcCall and it is transformed to grouping sets only in groupby clause later. I invite any ideas, notes and help with documentation. Regards Pavel Stehule -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers