On Mon, Dec 16, 2013 at 9:39 PM, Heikki Linnakangas <hlinnakan...@vmware.com > wrote:
> > There's another technique we could use which doesn't need a negative > transition function, assuming the order you feed the values to the aggreate > function doesn't matter: keep subtotals. For example, if the window first > contains values 1, 2, 3, 4, you calculate 3 + 4 = 7, and then 1 + 2 + 7 = > 10. Next, 1 leaves the window, and 5 enters it. Now you calculate 2 + 7 + > 5 = 14. By keeping the subtotal (3 + 4 = 7) around, you saved one addition > compared to calculating 2 + 3 + 4 + 5 from scratch. > > The negative transition function is a lot simpler and faster for count(*) > and integer operations, so we probably should implement that anyway. But > the subtotals technique could be very useful for other data types. > > - Heikki > That's quite interesting. I guess we would need another flag in pg_aggregate to mark if the order of the tuples matters, string_agg would be an example of one that would have to skip this. At least for ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING if the aggregate order did not matter than it would likely be quite efficient just to aggregate from the bottom up and materialise the results at each tuple and store it in the tuple store, then just use that materialised value when that tuple is processed. This method won't work when the frame base is not fixed. I had been thinking ahead on how to improve MIN and MAX cases too. I came up with something called "tuple store indexes" that could be build as binary search trees with a composite index on the tuple position and the aggregate's sort operator... Something similar to how the following query could use an index on (id,value) to calculate max() select max(value) from test where id between 1 and 100; It's certainly not something for this patch, but it was an idea I came up with which I think would be possible without adding any more columns to pg_aggregate. Regards David Rowley