On 27.06.2011 13:45, Alexander Korotkov wrote:
I've added information about testing on a real-life dataset to the wiki page.
This dataset has a peculiarity: the data in it is ordered. In this case the
tradeoff was the inverse of what we expected from the "fast build"
algorithm: the index build takes longer, but index quality is significantly better.
I think the high speed of the regular index build is because sequential inserts
go into nearby parts of the tree. That's why the number of actual page reads and
writes is low. The difference in tree quality I can't convincingly explain yet.
I've also run tests with a shuffled version of this dataset. In that case the
results were similar to randomly generated data.
Once again, interesting results.
The penalty function is called whenever a tuple is routed to the next
level down, and the final tree has the same depth with and without the
patch, so I would expect the number of penalty calls to be roughly the
same. But clearly there's something wrong with that logic; can you
explain in layman's terms why the patch adds so many gist penalty calls?
And how many calls does it actually add? Can you gather some numbers on
that? Any ideas on how to mitigate that, or do we just have to live with
it? Or maybe use some heuristic to use the existing insertion method
when the patch is not expected to be helpful?
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers