On Mon, Mar 16, 2020 at 7:08 AM Michail Nikolaev
<michail.nikol...@gmail.com> wrote:
> I was sure I have broken something in btree and spent a lot of time
> trying to figure what.
> And later... I realized what it is bug in btree since a very old times...
> Because of much faster scans with LP_DEAD support on a standby it
> happens much more frequently in my case.

On second thought, I wonder how commit 558a9165 could possibly be
relevant here. nbtree VACUUM doesn't care about the LP_DEAD bit at all.
Sure, btree_xlog_delete_get_latestRemovedXid() is not going to have to
run on the standby on Postgres 12, but that only ever happened at the
point where we might have to split the page on the primary (i.e. when
_bt_delitems_delete() is called on the primary) anyway.
_bt_delitems_delete()/btree_xlog_delete_get_latestRemovedXid() are not
related to page deletion by VACUUM.

It's true that VACUUM will routinely kill tuples that happen to have
their LP_DEAD bit set, but it isn't actually influenced by the fact
that somebody set (or didn't set) any tuple's LP_DEAD bit. VACUUM has
its own strategy for generating recovery conflicts (it relies on
conflicts generated during the pruning phase of heap VACUUMing). VACUUM
is not willing to generate ad-hoc conflicts (in the style of
_bt_delitems_delete()) just to kill a few more tuples in relatively
uncommon cases -- cases where some LP_DEAD bits were set after a VACUUM
process started, but before the VACUUM process reached an affected
(LP_DEAD bits set) leaf page.

Again, I suspect that the problem is more likely to occur on Postgres
12 in practice because page deletion is more likely to occur on that
version. IOW, due to my B-Tree work for Postgres 12: commit dd299df8,
and related commits. That's probably all that there is to it.

-- 
Peter Geoghegan