Re: [HACKERS] Combining Aggregates

Robert Haas Mon, 21 Dec 2015 04:31:30 -0800

On Mon, Dec 21, 2015 at 4:02 AM, David Rowley
<david.row...@2ndquadrant.com> wrote:
>  Ok, so it seems that my mindset was not in parallel process space when I
> was thinking about this.  I think having the pointer in the Tuple is
> probably going to be fine for this multiple stage aggregation when that's
> occurring in a single backend process, but obviously if the memory that the
> pointer points to belongs to a worker process in a parallel aggregate
> situation, then bad things will happen.
>
> Now, there has been talk of this previously, on various threads, but I don't
> believe any final decisions were made on how exactly it should be done. At
> the moment I plan to make changes as follows:
>
>  Add 3 new columns to pg_aggregate, aggserialfn, aggdeserialfn and
> aggserialtype These will only be required when aggtranstype is INTERNAL.
> Perhaps we should disallow CREATE AGGREAGET from accepting them for any
> other type... The return type of aggserialfn should be aggserialtype, and it
> should take a single parameter of aggtranstype. aggdeserialfn will be the
> reverse of that.
> Add a new bool field to nodeAgg's state named serialStates. If this is field
> is set to true then when we're in finalizeAgg = false mode, we'll call the
> serialfn on the agg state instead of the finalfn. This will allow the
> serialized state to be stored in the tuple and sent off to the main backend.
> The combine agg node should also be set to serialStates = true, so that it
> knows to deserialize instead of just assuming that the agg state is of type
> aggtranstype.
>
> I believe this should allow us to not cause any performance regressions by
> moving away from INTERNAL agg states. It should also be very efficient for
> internal states such as Int8TransTypeData, as this struct merely has 2 int64
> fields which should be very simple to stuff into a bytea varlena type. We
> don't need to mess around converting the ->count and ->sum into a strings or
> anything.
>
> Then in order for the planner to allow parallel aggregation all aggregates
> must:
>
> Not have a DISTINCT or ORDER BY clause
> Have a combinefn
> If aggtranstype = INTERNAL, must have a aggserialfn and aggdeserialfn.
>
> We can relax the requirement on 3 if we're using 2-stage aggregation, but
> not parallel aggregation.
>
> Any objections, or better ideas?


Can we use Tom's expanded-object stuff instead of introducing
aggserialfn and aggdeserialfn?  In other words, if you have a
aggtranstype = INTERNAL, then what we do is:

1. Create a new data type that represents the transition state.
2. Use expanded-object notation for that data type when we're just
within a single process, and flatten it when we need to send it
between processes.

One thing to keep in mind is that we also want to be able to support a
plan that involves having one or more remote servers do partial
aggregation, send us the partial values, combine them across servers
and possibly also with locally computed-values, and the finalize the
aggregation.  So it would be nice if there were a way to invoke the
aggregate function from SQL and get a transition value back rather
than a final value.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Combining Aggregates

Reply via email to