On Thu, Aug 11, 2011 at 10:21 AM, Heikki Linnakangas < heikki.linnakan...@enterprisedb.com> wrote:
> Split of an internal node works like this: > > 1. Gather all the existing tuples on the page, plus the new tuple being > inserted. > 2. Call picksplit on the tuples, to divide them into pages > 3. Go through all tuples on the buffer associated with the page, and divide > them into buffers on the new pages. This is done by calling penalty function > on each buffered tuple. > > I wonder if it would be better for index quality to pass the buffered > tuples to picksplit in the 2nd step, so that they too can affect the split > decision. Maybe it doesn't make much difference in practice.. I had this idea. But: 1) Buffer contain much more tuples than page plus new tuple. 2) Picksplit method can easily be quadratic for example. Let's see the complexity of picksplit algorithms: 1) geometric datatypes (point, box etc) - O(n) (BTW, I have serious doubts about it, i.e. O(n*log(n)) algorithm can be in times better in many cases) 2) pg_trgm and fts - O(n^2) 3) seg - O(n*log(n)) 4) cube - O(n^2) Thus, I believe such feature should be an optional. We can try it as add-on patch. ------ With best regards, Alexander Korotkov.