On Thu, Jan 16, 2020 at 3:05 PM Peter Geoghegan <p...@bowt.ie> wrote:
> The main reason that I am confident about unique indexes is that we
> only do a deduplication pass in a unique index when we observe that
> the incoming tuple (the one that might end up splitting the page) is a
> duplicate of some existing tuple. Checking that much is virtually
> free, since we already have the information close at hand today (we
> cache the _bt_check_unique() binary search bounds for reuse within
> _bt_findinsertloc() today). This seems to be an excellent heuristic,
> since we really only want to target unique index leaf pages where all
> or almost all insertions must be duplicates caused by non-HOT updates
> -- this category includes all the pgbench indexes, and includes all of
> the unique indexes in TPC-C. Whereas with non-unique indexes, we
> aren't specifically targeting version churn (though it will help with
> that too).
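For readers following along, the heuristic quoted above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not PostgreSQL's actual code: `LeafPage`, `page_contains_key()`, and `should_try_dedup()` are hypothetical names, keys are plain integers, and the sketch repeats the binary search itself, whereas the quoted text says the real code reuses the bounds cached by _bt_check_unique().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a B-tree leaf page:
 * a sorted array of integer keys plus a uniqueness flag. */
typedef struct LeafPage
{
    const int  *keys;            /* sorted ascending */
    size_t      nkeys;
    bool        is_unique_index;
} LeafPage;

/* Binary search: does the page already hold a tuple with this key? */
static bool
page_contains_key(const LeafPage *page, int key)
{
    size_t      lo = 0;
    size_t      hi = page->nkeys;

    while (lo < hi)
    {
        size_t      mid = lo + (hi - lo) / 2;

        if (page->keys[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo < page->nkeys && page->keys[lo] == key;
}

/* Decide whether to attempt a deduplication pass before splitting.
 * Assumption: this mirrors only the policy described in the quote. */
static bool
should_try_dedup(const LeafPage *page, int incoming_key, bool page_would_split)
{
    if (!page_would_split)
        return false;           /* nothing to avoid, so do nothing */

    if (page->is_unique_index)
    {
        /* Unique index: only worthwhile when the incoming tuple is a
         * duplicate, which suggests version churn from non-HOT updates. */
        return page_contains_key(page, incoming_key);
    }

    /* Non-unique index: try deduplication in lieu of any split. */
    return true;
}
```

In this sketch, a unique index skips the pass entirely when the incoming key is new to the page, while a non-unique index attempts it before every would-be split.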

This (and the rest of the explanation) doesn't really address my
concern. I understand that deduplicating in lieu of splitting a page
in a unique index is highly likely to be a win. What I don't
understand is why it shouldn't just be a win, period. Not splitting a
page seems like it has a big upside regardless of whether the index is
unique -- and in fact, the upside could be a lot bigger for a
non-unique index. If the coarse-grained LP_DEAD thing is the problem,
then I can grasp that issue, but you don't seem very worried about
that.

Generally, I think it's a bad idea to give the user an "emergency off
switch" and then sometimes ignore it. If the feature seems to be
generally beneficial, but you're worried that there might be
regressions in obscure cases, then turn it on by default, and give the
user the ability to forcibly turn it off. But don't give them the
opportunity to forcibly turn it off sometimes. Nobody's going to run
around setting a reloption just for fun -- they're going to do it
because they hit a problem.

I guess I'm also saying here that a reloption seems like a much better
idea than a GUC. I don't see much reason to believe that a system-wide
setting will be useful.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company