On Fri, 1 Mar 2024 at 14:48, Matthias van de Meent <boekewurm+postg...@gmail.com> wrote: > Attached is version 15 of this patch, with the above issues fixed. > It's also rebased on top of 655dc310 of this morning, so that should > keep good for some time again.
Attached is version 16 now. Relevant changes from previous patch versions: - Move from 1-indexed AttributeNumber to 0-indexed ints for prefixes, and use "prefix" as naming scheme, rather than cmpcol. A lack of prefix, previously indicated with a cmpcol value of 1, is now a prefix value of 0. - Adjusted README - Improved the efficiency of the insertion path in some cases where we've previously compared the page's highkey. As always, why we need this: Currently, btrees are quite inefficient when they have many key attributes but low attribute cardinality in the prefix, e.g. an index on ("", "", "", uuid). This is not just inefficient use of disk space with the high repetition of duplicate prefix values in pages, but it is also a computational overhead when we're descending the tree in e.g. _bt_first() or btinsert(): The code we use to search a page currently compares the full key to the full searchkey, for a complexity of O(n_equal_attributes + 1) for every tuple on the page, for O(log(page_ntups) * (n_equal_attributes + 1)) attribute compares every page during descent. This patch fixes that part of the computational complexity by applying dynamic prefix compression, thus reducing the average computational complexity in random index lookups to O(3 * (n_equal_attributes) + log(page_ntups)) per page (assuming at least 4 non-hikey tuples on each page). In practice, this makes indexes with 3+ attributes and prefixes with low selectivity (such as the example above) much more viable computationally, as we have to spend much less effort on comparing known index attributes during descent. Note that this does _not_ reuse prefix bounds across pages - it re-establishes the left- and right prefixes every page during the binary search. See the README modified in the patch for specific implementation details and considerations. This patch synergizes with the highkey optimization used in [0]: When combined, the number of attribute compare function calls could be further reduced to O(2 * (n_equal_atts) + log(page_ntups)), a reduction by n_equal_atts every page, which in certain wide index types could be over 25% of all attribute compare function calls on the page after dynamic prefix truncation. However, both are separately useful and reduce the amount of work done on most pages. Kind regards, Matthias van de Meent Neon (https://neon.tech) [0] https://www.postgresql.org/message-id/flat/CAEze2WijWhCQy_nZVP4Ye5Hwj=YW=3rqv+hbmjgcohtryqm...@mail.gmail.com
v16-0001-btree-Implement-dynamic-prefix-truncation.patch
Description: Binary data