On 08/21/2014 01:28 PM, Andrew Gierth wrote:

A progress update:

  Atri>  We envisage that handling of arbitrary grouping sets will be
  Atri> best done by having the planner generating an Append of
  Atri> multiple aggregation paths, presumably with some way of moving
  Atri> the original input path to a CTE. We have not really explored
  Atri> yet how hard this will be; suggestions are welcome.

This idea was abandoned.

Instead, we have implemented full support for arbitrary grouping sets
by means of a chaining system:

explain (verbose, costs off) select four, ten, hundred, count(*) from onek 
group by cube(four,ten,hundred);

                                              QUERY PLAN
-----------------------------------------------------------------------------------------------------
  GroupAggregate
    Output: four, ten, hundred, count(*)
    Grouping Sets: (onek.hundred, onek.four, onek.ten), (onek.hundred, onek.four), (onek.hundred), ()
    ->  Sort
          Output: four, ten, hundred
          Sort Key: onek.hundred, onek.four, onek.ten
          ->  ChainAggregate
                Output: four, ten, hundred
                Grouping Sets: (onek.ten, onek.hundred), (onek.ten)
                ->  Sort
                      Output: four, ten, hundred
                      Sort Key: onek.ten, onek.hundred
                      ->  ChainAggregate
                            Output: four, ten, hundred
                            Grouping Sets: (onek.four, onek.ten), (onek.four)
                            ->  Sort
                                  Output: four, ten, hundred
                                  Sort Key: onek.four, onek.ten
                                  ->  Seq Scan on public.onek
                                        Output: four, ten, hundred
(20 rows)

Uh, that's ugly. The EXPLAIN output, I mean; as an implementation detail, chaining the nodes might be reasonable. But the above becomes unreadable if you have more than a few grouping sets.
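
One way to read the plan above is that each Sort feeds an aggregate node that computes the grouping sets which are prefixes of that sort key, and the three sort orders together cover all eight sets of cube(four,ten,hundred). The following is only a rough Python check of that reading, not anything taken from the patch; the helper names `cube` and `prefixes` are made up for illustration, and the sort keys are copied from the plan.

```python
from itertools import combinations

# Sort keys copied from the three Sort nodes in the plan, top node first.
SORT_KEYS = [
    ("hundred", "four", "ten"),
    ("ten", "hundred"),
    ("four", "ten"),
]


def cube(columns):
    """All subsets of `columns`, i.e. the grouping sets of CUBE(columns)."""
    return {
        frozenset(combo)
        for size in range(len(columns) + 1)
        for combo in combinations(columns, size)
    }


def prefixes(sort_key, include_empty):
    """Prefixes of a sort key, longest first (hypothetical helper).

    The empty grouping set () is only counted once, for the top node,
    matching where it appears in the plan.
    """
    shortest = 0 if include_empty else 1
    return [sort_key[:n] for n in range(len(sort_key), shortest - 1, -1)]


# The first key belongs to the top GroupAggregate, which also handles ().
chains = [prefixes(key, include_empty=(i == 0)) for i, key in enumerate(SORT_KEYS)]
covered = {frozenset(gset) for chain in chains for gset in chain}

print(chains)
print(covered == cube(("four", "ten", "hundred")))
```

Under this reading, the number of Sort nodes grows with the number of sort orders needed to cover the grouping sets, which is why the plan gets long quickly.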

The ChainAggregate nodes use a tuplestore to communicate with the
GroupAggregate node at the top of the chain; they pass through input
tuples unchanged, and write aggregated result rows to the tuplestore,
which the top node then returns once it has finished its own result.
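
As a rough illustration of the dataflow described above, here is a small Python model, not the patch's C code. It assumes rows are dicts, models the tuplestore as a plain list, and only computes count(*); the names `chain_aggregate`, `group_aggregate`, `sorted_by`, `PLAN` and `run` are invented for this sketch. The sort keys and grouping sets are copied from the plan.

```python
# (sort key, grouping sets) per level, bottom of the chain first.
PLAN = [
    (("four", "ten"), [("four", "ten"), ("four",)]),
    (("ten", "hundred"), [("ten", "hundred"), ("ten",)]),
    (("hundred", "four", "ten"),
     [("hundred", "four", "ten"), ("hundred", "four"), ("hundred",), ()]),
]


def sorted_by(rows, key_cols):
    """Stand-in for a Sort node: consumes its whole input, then returns it sorted."""
    return sorted(rows, key=lambda r: tuple(r[c] for c in key_cols))


def chain_aggregate(rows, grouping_sets, store):
    """Pass input rows through unchanged; append finished groups to `store`.

    Input must be sorted so that every grouping set is a prefix of the sort key.
    """
    current = {g: None for g in grouping_sets}  # grouping set -> (group key, count)
    for row in rows:
        for g in grouping_sets:
            key = tuple(row[c] for c in g)
            state = current[g]
            if state is not None and state[0] != key:
                store.append((g, state[0], state[1]))  # group boundary reached
                state = None
            current[g] = (key, (state[1] if state else 0) + 1)
        yield row  # the input tuple itself is not modified
    for g, state in current.items():
        if state is not None:
            store.append((g, state[0], state[1]))  # flush the last group


def group_aggregate(rows, grouping_sets, store):
    """Top of the chain: return own results first, then whatever is in the store."""
    own = []
    for _ in chain_aggregate(rows, grouping_sets, own):
        pass  # the top node has no parent to pass rows to
    return own + store


def run(rows):
    store = []  # shared between the top node and every chain node
    stream = rows
    for sort_key, gsets in PLAN[:-1]:
        stream = chain_aggregate(sorted_by(stream, sort_key), gsets, store)
    top_key, top_sets = PLAN[-1]
    return group_aggregate(sorted_by(stream, top_key), top_sets, store)


if __name__ == "__main__":
    # Synthetic stand-in for onek; the real table's contents are not assumed here.
    demo = [{"four": i % 4, "ten": i % 10, "hundred": i % 100} for i in range(20)]
    for grouping_set, key, count in run(demo):
        print(grouping_set, key, count)
```

Because each sort consumes its input completely, every lower chain node has already written its results to the shared store by the time the top node finishes its own groups.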

Hmm, so there's a "magic link" between the GroupAggregate at the top and all the ChainAggregates, via the tuplestore. That may be fine; we have special rules for passing information between bitmap scan nodes too.

But rather than chain multiple ChainAggregate nodes, how about just doing all the work in the top GroupAggregate node?

  Atri> At this point we are more interested in design review rather
  Atri> than necessarily committing this patch in its current state.

This no longer applies; we expect to post within a day or two an
updated patch with full functionality.

Ok, cool

- Heikki


