On Wed, Jan 29, 2020 at 1:15 PM Peter Geoghegan <p...@bowt.ie> wrote:
> The good news is that these extra cycles aren't very noticeable even
> with a workload where deduplication doesn't help at all (e.g. with
> several indexes on an append-only table, and few or no duplicates).
> The cycles are generally a fixed cost. Furthermore, it seems to be
> possible to virtually avoid the problem in the case of unique indexes
> by applying the incoming-item-is-duplicate heuristic. Maybe I am
> worrying over nothing.
Yeah, maybe. I'm tempted to advocate for dropping the GUC and keeping
the reloption. If the worst case is a 3% regression and you expect
that to be rare, I don't think a GUC is really worth it, especially
given that the proposed semantics seem somewhat confusing. The
reloption can be used in a pinch to protect against either bugs or
performance regressions, whichever may occur, and it doesn't seem like
you need a second mechanism.

> Again, maybe I'm making an excessively thin distinction. I really want
> to be able to enable the feature everywhere, while also not getting
> even one complaint about it. Perhaps that's just not a realistic or
> useful goal.

One thing that you could do is try to learn whether deduplication (I
really don't like that name, but here we are) seems to be working for
a given index, perhaps even in a given session. For instance, suppose
you keep track of what happened the last ten times the current session
attempted deduplication within a given index. Store the state in the
relcache. If all of the last ten tries were failures, then only try
1/4 of the time thereafter. If you have a success, go back to trying
every time. That's pretty crude, but it might be good enough to blunt
the downsides to the point where you can stop worrying.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company