Hannu Krosing wrote: > ?hel kenal p?eval, K, 2007-03-21 kell 17:25, kirjutas Bruce Momjian: > > Added to TODO: > > > > o During index creation, pre-sort the tuples to improve build speed > > > > http://archives.postgresql.org/pgsql-hackers/2007-03/msg01199.php > > Maybe the TODO text should mention, that it is about HASH indexes ?
It is in the HASH section of the TODO list. --------------------------------------------------------------------------- > > > --------------------------------------------------------------------------- > > > > Tom Lane wrote: > > > I wrote: > > > > I'm not sure if this has been discussed before, but I suddenly realized > > > > while responding to the above message that the reason for the awful > > > > performance is pretty obvious: hashbuild starts with a minimum-size > > > > index (two buckets) and repeatedly splits buckets as insertions are > > > > done, exactly the same as ordinary dynamic growth of the index would do. > > > > This means that for an N-row table, approximately N/entries-per-bucket > > > > splits need to occur during index build, which results in roughly O(N^2) > > > > performance because we have to reprocess already-inserted entries over > > > > and over. > > > > > > Well, unfortunately this theory seems to be all wet. Given that the > > > bucket loading is reasonably even, the time to split a bucket is about > > > constant and so there's no O(N^2) effect. (The multiplier hidden inside > > > O(N) is pretty awful, but it doesn't change with N.) > > > > > > The real reason why performance falls off a cliff for building large > > > hash indexes seems to be much harder to fix: basically, once the size > > > of your index exceeds working memory, it's nap time. Given that the > > > incoming data has randomly distributed hash values, each bucket is about > > > as likely to be touched next as any other; there is no locality of > > > access and so the "working set" is the same size as the index. Once it > > > doesn't fit in RAM anymore you're into swap hell. > > > > > > The only way I can see to fix that is to try to impose some locality of > > > access during the index build. This is not impossible: for example, > > > given a choice for the number of buckets, we could sort all the index > > > tuples by hashed bucket number and then start inserting. btree does a > > > preliminary sort, and its index build times are way more reasonable > > > than hash's currently are, so the cost of the sort isn't outrageous. > > > (I note this is mainly because we know how to do sorting with locality > > > of access...) Before we start inserting we will know exactly how many > > > tuples there are, so we can pre-create the right number of buckets and > > > be sure that no on-the-fly splits will be needed for the rest of the > > > build. If we guessed wrong about the number of buckets there will be > > > some places in the process where we concurrently insert into several > > > buckets not just one, or perhaps come back to a bucket that we touched > > > earlier, but that's still maintaining plenty of locality of access. > > > > > > This is looking like more work than I want to do in the near future, > > > but I thought I'd put it into the archives for someone to tackle. > > > Bruce, would you add a TODO item linking to this: > > > > > > * Improve hash index build time by sorting > > > > > > regards, tom lane > > > > > > ---------------------------(end of broadcast)--------------------------- > > > TIP 9: In versions below 8.0, the planner will ignore your desire to > > > choose an index scan if your joining column's datatypes do not > > > match > > > -- > ---------------- > Hannu Krosing > Database Architect > Skype Technologies O? > Akadeemia tee 21 F, Tallinn, 12618, Estonia > > Skype me: callto:hkrosing > Get Skype for free: http://www.skype.com > > NOTICE: This communication contains privileged or other confidential > information. If you have received it in error, please advise the sender > by reply email and immediately delete the message and any attachments > without copying or disclosing the contents. > > > ---------------------------(end of broadcast)--------------------------- > TIP 7: You can help support the PostgreSQL project by donating at > > http://www.postgresql.org/about/donate -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---------------------------(end of broadcast)--------------------------- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate