Re: [HACKERS] Performance gain from reduction of GROUP BY memory

Simon Riggs Tue, 30 Aug 2005 00:51:07 -0700

On Mon, 2005-08-29 at 20:25 -0400, Tom Lane wrote:
> Simon Riggs <[EMAIL PROTECTED]> writes:
> > I notice that Neil's patch regarding reducing the number of memory
> > allocations during aggregation operations isn't mentioned. It was
> > originally discussed in 8.0beta (2-3?) time.
> 
> > What happened there?
> > - patch not committed in the end
> > - committed but not mentioned, as a dropped item
> > - committed but not mentioned, since part of a larger patch
> 
> Are you speaking of these patches?


Yes, those look like the ones I mentioned.

Those seem to have a useful performance improvement?

At very least, the change in Aggregate function API should be mentioned,
no?

> 2005-04-06 19:56  neilc
> 
>       * src/backend/utils/adt/: float.c, numeric.c: Apply the "nodeAgg"
>       optimization to more of the builtin transition functions. This
>       patch optimizes int2_sum(), int4_sum(), float4_accum() and
>       float8_accum() to avoid needing to copy the transition function's
>       state for each input tuple of the aggregate. In an extreme case
>       (e.g. SELECT sum(int2_col) FROM table where table has a single
>       column), it improves performance by about 20%. For more complex
>       queries or tables with wider rows, the relative performance
>       improvement will not be as significant.
> 
> 2005-04-04 19:50  neilc
> 
>       * src/backend/utils/adt/numeric.c: This patch changes
>       int2_avg_accum() and int4_avg_accum() use the nodeAgg performance
>       hack Tom introduced recently. This means we can avoid copying the
>       transition array for each input tuple if these functions are
>       invoked as aggregate transition functions.
>       
>       To test the performance improvement, I created a 1 million row
>       table with a single int4 column. Without the patch, SELECT avg(col)
>       FROM table took about 4.2 seconds (after the data was cached); with
>       the patch, it took about 3.2 seconds. Naturally, the performance
>       improvement for a less trivial query (or a table with wider rows)
>       would be relatively smaller.
> 
> 2005-03-12 15:25  tgl
> 
>       * contrib/intagg/int_aggregate.c,
>       contrib/intagg/int_aggregate.sql.in, doc/src/sgml/xaggr.sgml,
>       doc/src/sgml/xfunc.sgml, src/backend/executor/nodeAgg.c,
>       src/backend/utils/adt/int8.c: Adjust the API for aggregate function
>       calls so that a C-coded function can tell whether it is being used
>       as an aggregate or not.  This allows such a function to avoid
>       re-pallocing a pass-by-reference transition value; normally it
>       would be unsafe for a function to scribble on an input, but in the
>       aggregate case it's safe to reuse the old transition value.  Make
>       int8inc() do this.  This gets a useful improvement in the speed of
>       COUNT(*), at least on narrow tables (it seems to be swamped by I/O
>       when the table rows are wide).  Per a discussion in early December
>       with Neil Conway.  I also fixed int_aggregate.c to check this,
>       thereby turning it into something approaching a supportable
>       technique instead of being a crude hack.

I'll search CVS directly next time. Thanks.

Best Regards, Simon Riggs


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq

Re: [HACKERS] Performance gain from reduction of GROUP BY memory

Reply via email to