On Thu, Aug 19, 2010 at 11:59 AM, Pavel Stehule <pavel.steh...@gmail.com> wrote:
> I am sending a prototype implementation of functions median and
> percentile. This implementation is very simple and I moved it to
> contrib for this moment - it is more easy maintainable. Later I'll
> move it to core.

So if the entire result set fits in memory it would be nice to use the
O(n) Quickselect algorithm -- which would only be a small change to
the existing Quicksort code -- instead of sorting the entire set. That
should be possible both for percentile() and median(). It would be
good to generalize the tuplesort api sufficiently so that you can
implement this as a contrib module even if we eventually decide to
integrate it in core. It's probably worth having two copies of the
sort code, one for Quickselect and one for Quicksort just for speed,
though I suppose it's worth benchmarking.

But I'm not aware of a generalization of tapesort to allow O(n)
selection with the sequential i/o properties of tapesort. It would be
especially interesting, probably more so than the in-memory case.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to