On Mon, Feb 1, 2021 at 10:17 PM Peter Geoghegan <p...@bowt.ie> wrote:
> * No need to change MaxHeapTuplesPerPage for now, since that only
> really makes sense in cases that heavily involve bottom-up deletion,
> where we care about the *concentration* of LP_DEAD line pointers in
> heap pages (and not just the absolute number in the entire table),
> which is qualitative, not quantitative (somewhat like bottom-up
> deletion).
>
> The change to MaxHeapTuplesPerPage that Masahiko has proposed does
> make sense -- there are good reasons to increase it. Of course there
> are also good reasons to not do so. I'm concerned that we won't have
> time to think through all the possible consequences.
Yes, I agree that it's good to postpone this to a future release, and that thinking through the consequences is not so easy.

One possible consequence that I'm concerned about is sequential scan performance. For an index scan, you just jump to the line pointer you want and then go get the tuple, but a sequential scan has to loop over all the line pointers on the page, and skipping a lot of dead ones can't be completely free. A small increase in MaxHeapTuplesPerPage probably wouldn't matter, but the proposed increase of roughly 7x (291 -> 2042) is a bit scary.

It's also a little hard to believe that letting almost 50% of the total space on the page get chewed up by the line pointer array is going to be optimal. If that happens to every page while the amount of data stays the same, the table must almost double in size. That's got to be bad. The whole thing would be more appealing if there were some way to exert exponentially increasing back-pressure on the length of the line pointer array - that is, make it so that the longer the array already is, the less willing we are to extend it further. But I don't really see how to do that.

Also, at the risk of going on and on, line pointer array bloat is very hard to eliminate once it happens. We never even try to shrink the line pointer array, and if the last TID in the array is still in use, shrinking it wouldn't be possible anyway, assuming the table has at least one non-BRIN index. Index page splits are likewise irreversible, but creating a new index and dropping the old one is still less awful than having to rewrite the table.

Another thing to consider is that MaxHeapTuplesPerPage is used to size some stack-allocated arrays, notably the one in PruneState. I thought about this for a while and I can't really see why it would be a big problem, even with a large increase in MaxHeapTuplesPerPage, so I'm just mentioning it in case it makes somebody else think of something I've missed.
--
Robert Haas
EDB: http://www.enterprisedb.com