On Thu, Aug 11, 2011 at 10:21 AM, Heikki Linnakangas <
heikki.linnakan...@enterprisedb.com> wrote:

> Split of an internal node works like this:
>
> 1. Gather all the existing tuples on the page, plus the new tuple being
> inserted.
> 2. Call picksplit on the tuples, to divide them into pages
> 3. Go through all tuples on the buffer associated with the page, and divide
> them into buffers on the new pages. This is done by calling penalty function
> on each buffered tuple.
>
> I wonder if it would be better for index quality to pass the buffered
> tuples to picksplit in the 2nd step, so that they too can affect the split
> decision. Maybe it doesn't make much difference in practice..


I had this idea. But:
1) Buffer contain much more tuples than page plus new tuple.
2) Picksplit method can easily be quadratic for example.

Let's see the complexity of picksplit algorithms:
1) geometric datatypes (point, box etc) - O(n) (BTW, I have serious doubts
about it, i.e. O(n*log(n)) algorithm can be in times better in many cases)
2) pg_trgm and fts - O(n^2)
3) seg - O(n*log(n))
4) cube - O(n^2)

Thus, I believe such feature should be an optional. We can try it as add-on
patch.

------
With best regards,
Alexander Korotkov.

Reply via email to