On Wed, Dec 17, 2014 at 6:05 PM, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote:

> Simon,
>
> Its concept is good to me. I think, the new combined function should be
> responsible to take a state data type as argument and update state object
> of the aggregate function. In other words, combined function performs like
> transition function but can update state object according to the summary
> of multiple rows. Right?
>
> It also needs some enhancement around Aggref/AggrefExprState structure to
> inform which function shall be called on execution time.
> Combined functions are usually no-thank-you. AggrefExprState updates its
> internal state using transition function row-by-row. However, once someone
> push-down aggregate function across table joins, combined functions have
> to be called instead of transition functions.
> I'd like to suggest Aggref has a new flag to introduce this aggregate
> expects state object instead of scalar value.
>
> Also, I'd like to suggest one other flag in Aggref not to generate final
> result, and returns state object instead.

So are you proposing not calling transfuncs at all and just use combined functions?
That sounds counterintuitive to me. I am not able to see why you would want to avoid transfns totally, even for the case of pushing down aggregates that you mentioned.

From Simon's example mentioned upthread:

PRE-AGGREGATED PLAN
Aggregate
-> Join
     -> PreAggregate (doesn't call finalfn)
          -> Scan BaseTable1
     -> Scan BaseTable2

In the PreAggregate node, finalfn wouldn't be called. Instead, the combined function would be responsible for getting the preaggregate results and combining them (unless, of course, I am missing something).

Special-casing transition state updating in Aggref seems like a bad idea to me. I would think it would be better to make it more explicit, i.e. add a new node on top that does the combination (it would be primarily responsible for calling the combined function). Not a good source of inspiration, but how SQL Server does it (Exchange operator + Stream Aggregate) seems intuitive to me, and having the combination operation as a separate top node might be a cleaner way. I may be wrong though.

Regards,

Atri
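
To make the division of labour concrete, here is a minimal sketch in Python of an avg-style aggregate split the way the plan above suggests. This is not PostgreSQL code. Every name in it (avg_transfn, avg_combinefn, avg_finalfn, pre_aggregate, combine_node) is hypothetical and chosen only for illustration, and it assumes the state is a simple (sum, count) pair. The point is that the transition function still runs row-by-row inside the pre-aggregate step, while a separate top node only ever sees state objects and calls the combine and final functions.

```python
from typing import Iterable, Optional, Tuple

# Assumed aggregate state for avg(): (running sum, row count).
State = Tuple[float, int]

INITIAL_STATE: State = (0.0, 0)


def avg_transfn(state: State, value: float) -> State:
    """Transition function: fold one input row into the state."""
    return (state[0] + value, state[1] + 1)


def avg_combinefn(left: State, right: State) -> State:
    """Combine function: merge two partial states, no rows involved."""
    return (left[0] + right[0], left[1] + right[1])


def avg_finalfn(state: State) -> Optional[float]:
    """Final function: turn the state into the result (None if no rows)."""
    total, count = state
    return total / count if count else None


def pre_aggregate(rows: Iterable[float]) -> State:
    """Hypothetical PreAggregate node: runs transfn per row, skips finalfn."""
    state = INITIAL_STATE
    for value in rows:
        state = avg_transfn(state, value)
    return state  # returns the state object, not the final result


def combine_node(partials: Iterable[State]) -> Optional[float]:
    """Hypothetical top node: combines partial states, then calls finalfn."""
    state = INITIAL_STATE
    for partial in partials:
        state = avg_combinefn(state, partial)
    return avg_finalfn(state)


if __name__ == "__main__":
    partial_a = pre_aggregate([1.0, 2.0, 3.0])
    partial_b = pre_aggregate([4.0, 5.0])
    print(combine_node([partial_a, partial_b]))
```

In this sketch neither function replaces the other: pre_aggregate never calls the combine or final function, and combine_node never calls the transition function, which is the separation a dedicated top node would make explicit.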