> What I'm thinking of is the regular indexscan that's done internally
> by get_actual_variable_range, not whatever ends up getting chosen as
> the plan for the user query.  I had supposed that that would kill
> dead index entries as it went, but maybe that's not happening for
> some reason.


Really, this happens as you said: the index entries do get marked as dead.
But after that, backends still spend CPU time skipping these killed entries
in _bt_checkkeys:

        if (scan->ignore_killed_tuples && ItemIdIsDead(iid))
        {
                /* return immediately if there are more tuples on the page */
                if (ScanDirectionIsForward(dir))
                {
                        if (offnum < PageGetMaxOffsetNumber(page))
                                return NULL;
                }
                else
                {
                        BTPageOpaque opaque = (BTPageOpaque) PageGetSpecialPointer(page);

                        if (offnum > P_FIRSTDATAKEY(opaque))
                                return NULL;
                }
                ...
        }

This is confirmed by the perf profile and the backtrace Vladimir reported earlier:
root@pgload01e ~ # perf report | grep -v '^#' | head
    56.67%  postgres   postgres                [.] _bt_checkkeys
    19.27%  postgres   postgres                [.] _bt_readpage
     2.09%  postgres   postgres                [.] pglz_decompress
     2.03%  postgres   postgres                [.] LWLockAttemptLock
     1.61%  postgres   postgres                [.] PinBuffer.isra.3
     1.14%  postgres   postgres                [.] hash_search_with_hash_value
     0.68%  postgres   postgres                [.] LWLockRelease
     0.42%  postgres   postgres                [.] AllocSetAlloc
     0.40%  postgres   postgres                [.] SearchCatCache
     0.40%  postgres   postgres                [.] ReadBuffer_common
root@pgload01e ~ #
So it seems that killing dead tuples does not, by itself, solve this problem.