Currently, we restrict btree index tuples to a size that ensures three of them will fit on a page. The motivation for this is the following two considerations:
1. In a non-rightmost page, we need to include a "high key", or page boundary key, that isn't one of the useful data keys. 2. In a non-leaf page, there had better be at least two child pages (downlink entries), else we have failed to subdivide the page's key range at all, and thus there would be a nonterminating recursion. However: a non-leaf page actually has one more pointer than key, eg a page with three children needs only two data keys: ---------------- entire key range assigned to page ------------------ -- range 1 -- boundary key -- range 2 -- boundary key -- range 3 -- | | | v v v child page 1 child page 2 child page 3 We implement this by having the first data "tuple" on a non-leaf page contain only a downlink TID and no key data, ie it's just the header. So it appears to me that we could allow the maximum size of a btree entry to be just less than half a page, rather than just less than a third of a page --- the worst-case requirement for a non-leaf page is not three real tuples, but one tuple header and two real tuples. On a leaf page we might manage to fit only one real data item, but AFAICS that doesn't pose any correctness problems. Obviously a tree containing many such pages would be awfully inefficient to search, but I think a more common case is that there are a few wide entries in an index of mostly short entries, and so pushing the hard limit up a little would add some flexibility with little performance cost in real-world cases. Have I missed something? Is this worth changing? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly