I wrote: > ... an invertible aggregate may require a more > complex transition state data structure --- in particular, if you're > forced to go from a pass-by-value to a pass-by-reference data type, right > there you are going to take a big hit in aggregate performance, and there > is no way for the forward transition function to avoid it. The patch > has in fact already done that to a couple of basic aggregates like > sum(int4). Has anyone bothered to test what side-effects that has on > non-windowed aggregation performance?
As a quick check, I compared aggregation performance in HEAD, non-assert builds, with and without --disable-float8-byval on a 64-bit machine. So this tests replacing a pass-by-val transition datatype with a pass-by-ref one without any other changes. There's essentially no difference in performance of sum(int4), AFAICT, but that's because int4_sum goes out of its way to cheat and avoid palloc overhead. I looked to the bit_and() aggregates to see what would happen to an aggregate not thus optimized. As expected, int4 and int8 bit_and are just about the same speed if int8 is pass by value ... but if it's pass by ref, the int8 case is a good 60% slower. So added palloc overhead, at least, is a no-go. I see that the patched version of sum(int4) avoids that trap, but nonetheless it's replaced a pretty cheap transition function with a less cheap function, namely the function previously used for avg(int4). A quick test says that avg(int4) is about five percent slower than sum(int4), so that's the kind of hit we'd be taking on non-windowed aggregations if we do it like this. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers