On Wed, Jun 12, 2019 at 06:34:27PM -0700, Peter Geoghegan wrote:
> > I was wrong.  I was thinking of this commit:
> >
> >     commit d2086b08b0
> >     Author: Alexander Korotkov <akorot...@postgresql.org>
> >     Date:   Sat Jul 28 00:31:40 2018 +0300
> >
> >     Reduce path length for locking leaf B-tree pages during insertion
> 
> > > If you had to cut one thing from this list, then I would suggest that
> > > it be this item. It's nice, but it's also very obvious, which makes it
> > > hard to explain.
> 
> > Right.  The commit mentioned a 4.5x speedup in a rare benchmark, so I
> > added it lower on the list.
> 
> My remark about cutting an item referred to a lesser item that I
> worked on (the 'Add nbtree high key "continuescan" optimization'
> commit), not Alexander independent B-Tree work. I think that
> Alexander's optimization is also quite effective. Though FWIW the 4.5x
> improvement concerned a case involving lots of duplicates...cases with
> a lot of duplicates will be far far better in Postgres 12. (I never
> tested my patch without Alexander's commit, since it went in early in
> the v12 cycle.)

OK, good to know.

> > Attached is an updated patch.  I might have missed something, but I
> > think it might be close.
> 
> This looks great. I do have a few things:
> 
> * I would put "Improve performance and space utilization of btree
> indexes with many duplicates" first (before "Allow multi-column btree
> indexes to be smaller"). I think that this is far more common than we
> tend to assume, and is also where the biggest benefits are.

OK, done, I was wondering about that.

> * The wording of the "many duplicates" item itself is almost perfect,
> though the "...and inefficiency when VACUUM needed to find a row for
> removal" part seems a bit off -- this is really about the
> effectiveness of VACUUM, not the speed at which the VACUUM completes
> (it's a bit faster, but that's not that important). Perhaps that part
> should read: "...and often failed to efficiently recycle space made
> available by VACUUM". Something like that.

Ah, I see what you mean --- recycle entire pages.  I have updated the
patch.

> * The "Allow multi-column btree indexes to be smaller" item is about
> both suffix truncation, and about the "Split after new tuple"
> optimization. I think that that makes it more complicated than it
> needs to be. While the improvements that we saw with TPC-C on account
> of the "Split after new tuple" optimization were nice, I doubt that
> users will be looking out for it. I would be okay if you dropped any
> mention of the "Split after new tuple" optimization, in the interest
> of making the description more useful to users. We can just lose that.

OK, done.

> * Once you simplify the item by making it all about suffix truncation,
> it would make sense to change the single line summary to "Reduce the
> number of branch blocks needed for multi-column indexes". Then go on
> to talk about how we now only store those columns that are necessary
> to guide index scans in tuples stored in branch pages (we tend to call
> branch pages internal pages, but branch pages seems friendlier to me).
> Note that the user docs of other database systems reference these
> details, even in their introductory material on how B-Tree indexes
> work. The term "suffix truncation" isn't something users have heard
> of, and we shouldn't use it here, but the *idea* of suffix truncation
> is very well established. As I mentioned, it matters for things like
> covering indexes (indexes that are designed to be used by index-only
> scans, which are not necessarily INCLUDE indexes).

OK, I mentioned something about increased locality now.  Patch attached.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +
diff --git a/doc/src/sgml/release-12.sgml b/doc/src/sgml/release-12.sgml
new file mode 100644
index 4a6c989..fcc49ff
*** a/doc/src/sgml/release-12.sgml
--- b/doc/src/sgml/release-12.sgml
*************** Author: Tom Lane <t...@sss.pgh.pa.us>
*** 606,627 ****
  
        <listitem>
  <!--
! Author: Alexander Korotkov <akorot...@postgresql.org>
! 2018-07-28 [d2086b08b] Reduce path length for locking leaf B-tree pages during 
  Author: Peter Geoghegan <p...@bowt.ie>
  2019-03-25 [f21668f32] Add "split after new tuple" nbtree optimization.
  -->
  
         <para>
!         Improve speed of btree index insertions (Peter Geoghegan,
!         Alexander Korotkov)
         </para>
  
         <para>
!         The new code improves the space-efficiency of page splits,
!         reduces locking overhead, and gives better performance for
!         <command>UPDATE</command>s and <command>DELETE</command>s on
!         indexes with many duplicates.
         </para>
        </listitem>
  
--- 606,669 ----
  
        <listitem>
  <!--
! Author: Peter Geoghegan <p...@bowt.ie>
! 2019-03-20 [dd299df81] Make heap TID a tiebreaker nbtree index column.
! Author: Peter Geoghegan <p...@bowt.ie>
! 2019-03-20 [fab250243] Consider secondary factors during nbtree splits.
  Author: Peter Geoghegan <p...@bowt.ie>
  2019-03-25 [f21668f32] Add "split after new tuple" nbtree optimization.
  -->
  
         <para>
!         Improve performance and space utilization of btree indexes with
!         many duplicates (Peter Geoghegan, Heikki Linnakangas)
         </para>
  
         <para>
!         Previously, duplicate index entries were stored unordered within
!         their duplicate groups.  This caused overhead during index
!         inserts, wasted space due to excessive page splits, and reduced
!         <command>VACUUM</command>'s ability to recycle entire pages.
!         Duplicate index entries are now sorted in heap-storage order.
!        </para>
! 
!        <para>
!         Indexes <application>pg_upgraded</application> from previous
!         releases will not have these benefits.
!        </para>
!       </listitem>
! 
!       <listitem>
! <!--
! see commits above
! -->
! 
!        <para>
!         Allow multi-column btree indexes to be smaller (Peter Geoghegan,
!         Heikki Linnakangas)
!        </para>
! 
!        <para>
!         Internal pages and min/max leaf page indicators now only store
!         index keys until the change key, rather than all indexed keys.
!         This also improves the locality of index access.
!        </para>
! 
!        <para>
!         Indexes <application>pg_upgraded</application> from previous
!         releases will not have these benefits.
!        </para>
!       </listitem>
! 
!       <listitem>
! <!--
! Author: Alexander Korotkov <akorot...@postgresql.org>
! 2018-07-28 [d2086b08b] Reduce path length for locking leaf B-tree pages during 
! -->
! 
!        <para>
!         Improve speed of btree index insertions by reducing locking
!         overhead (Alexander Korotkov)
         </para>
        </listitem>
  
*************** Author: Tom Lane <t...@sss.pgh.pa.us>
*** 678,702 ****
         </para>
        </listitem>
  
-       <listitem>
- <!--
- Author: Peter Geoghegan <p...@bowt.ie>
- 2019-03-20 [dd299df81] Make heap TID a tiebreaker nbtree index column.
- Author: Peter Geoghegan <p...@bowt.ie>
- 2019-03-20 [fab250243] Consider secondary factors during nbtree splits.
- -->
- 
-        <para>
-         Have new btree indexes sort duplicate index entries in heap-storage
-         order (Peter Geoghegan, Heikki Linnakangas)
-        </para>
- 
-        <para>
-         Indexes <application>pg_upgraded</application> from previous
-         releases will not have this ordering.
-        </para>
-       </listitem>
- 
        <listitem>
  <!--
  Author: Heikki Linnakangas <heikki.linnakan...@iki.fi>
--- 720,725 ----

Reply via email to