> What I'm thinking of is the regular indexscan that's done internally
> by get_actual_variable_range, not whatever ends up getting chosen as
> the plan for the user query.  I had supposed that that would kill
> dead index entries as it went, but maybe that's not happening for
> some reason.
This really happens as you said: the index entries are marked as dead. But after that, backends spend CPU time skipping these killed entries in _bt_checkkeys():

    if (scan->ignore_killed_tuples && ItemIdIsDead(iid))
    {
        /* return immediately if there are more tuples on the page */
        if (ScanDirectionIsForward(dir))
        {
            if (offnum < PageGetMaxOffsetNumber(page))
                return NULL;
        }
        else
        {
            BTPageOpaque opaque = (BTPageOpaque) PageGetSpecialPointer(page);

            if (offnum > P_FIRSTDATAKEY(opaque))
                return NULL;
        }

This is confirmed by the perf records and the backtrace Vladimir reported earlier:

root@pgload01e ~ # perf report | grep -v '^#' | head
    56.67%  postgres  postgres  [.] _bt_checkkeys
    19.27%  postgres  postgres  [.] _bt_readpage
     2.09%  postgres  postgres  [.] pglz_decompress
     2.03%  postgres  postgres  [.] LWLockAttemptLock
     1.61%  postgres  postgres  [.] PinBuffer.isra.3
     1.14%  postgres  postgres  [.] hash_search_with_hash_value
     0.68%  postgres  postgres  [.] LWLockRelease
     0.42%  postgres  postgres  [.] AllocSetAlloc
     0.40%  postgres  postgres  [.] SearchCatCache
     0.40%  postgres  postgres  [.] ReadBuffer_common
root@pgload01e ~ #

So it seems that killing dead tuples does not solve this problem: the entries are only marked dead, not removed, and every subsequent scan still has to step over them one by one.