So I've been looking at issues we used to have in production a while ago, which eventually led us to migrate away from partial indexes in some cases. In the end I'm surprised how easy this (or at least a similar case) was to reproduce. The attached program runs a stream of UPDATEs against a single row; roughly every third update removes the row from the partial index because it no longer matches indpred. In that case the row is immediately UPDATEd back to match the index WHERE clause again. This roughly emulates what some of our processes do in production.
Today, running the program for a few minutes (until its built-in 262144-iteration limit), I usually end up with a partial index through which fetching the only matching row takes milliseconds on a cold cache and over a millisecond on a hot one. Finding the same row through the primary key is still fast, because the bloat there gets cleaned up. As far as I can tell, once the index has gotten into this state there's no way to clean it up except VACUUMing the entire table or a REINDEX, and both of those are pretty bad options.

My working theory was that this has to do with the fact that HeapTupleSatisfiesMVCC doesn't set the HEAP_XMAX_COMMITTED bit here, but I'm not so sure anymore. Has anyone seen something like this?

If that really is what's happening here, then I can see why we wouldn't want to slow down SELECTs with expensive visibility checks. But that really leaves me wishing for something like VACUUM INDEX partial_idx. Otherwise your elephant just keeps getting slower and slower until you get called at 2 AM to run REINDEX.

(I've tested this on 9.6, v11 and v13. v13 seems to be a bit better here, but not "fixed", I think.)

.m
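P.S. In case anyone wants to poke at this themselves, something along these lines is one way to look at the index after the attached script finishes. (pgstatindex comes from the pgstattuple extension, which the script itself doesn't need, and REINDEX CONCURRENTLY requires v12 or newer; on older versions a plain REINDEX.)

    -- the index still covers exactly one row logically, but look at its physical size
    SELECT pg_size_pretty(pg_relation_size('i_love_partial_indexes'));

    -- optional: leaf density / fragmentation, needs the pgstattuple extension
    CREATE EXTENSION IF NOT EXISTS pgstattuple;
    SELECT avg_leaf_density, leaf_fragmentation FROM pgstatindex('i_love_partial_indexes');

    -- the slow lookup through the partial index
    EXPLAIN (ANALYZE, BUFFERS) SELECT id FROM t1 WHERE partial;

    -- the only remedies mentioned above:
    VACUUM t1;                                           -- scans the whole table
    REINDEX INDEX CONCURRENTLY i_love_partial_indexes;   -- v12+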
#!perl
use strict;
use warnings;

use DBI;
use DBD::Pg;

my @connect = ("dbi:Pg:", '', '',
               {pg_enable_utf8 => 1, RaiseError => 1, PrintError => 0, AutoCommit => 1});
my $dbh = DBI->connect(@connect) or die;

$dbh->do(q{
    DROP TABLE IF EXISTS t1;
    CREATE TABLE t1 (
        id serial PRIMARY KEY,
        partial boolean NOT NULL,
        data bigint NOT NULL
    );
    ALTER TABLE t1 SET (autovacuum_enabled = false);
    -- a million filler rows that never match the partial index predicate
    INSERT INTO t1 (partial, data) SELECT FALSE, -gs.i FROM generate_series(1, 1000000) gs(i);
    CREATE INDEX i_love_partial_indexes ON t1 (id) WHERE partial;
    CREATE INDEX pkish ON t1 ((id::text));
    -- the single row the loop below keeps updating
    INSERT INTO t1 (partial, data) VALUES (TRUE, 0);
});
$dbh->do(q{VACUUM ANALYZE t1});

my $rows = $dbh->selectall_arrayref(q{SELECT id FROM t1 WHERE partial});
die unless (scalar(@$rows) == 1 && scalar(@{$rows->[0]}) == 1);
my $partial_id = $rows->[0]->[0];

for (my $i = 0; $i < 256 * 1024; ++$i)
{
    # Roughly 30% of the time this flips "partial" to FALSE, which removes
    # the row from i_love_partial_indexes.
    my $rows = $dbh->selectall_arrayref(
        q{UPDATE t1 SET data = data + 1, partial = random() < 0.7 WHERE id = $1 RETURNING partial::int},
        undef,
        $partial_id,
    );
    die unless (scalar(@$rows) == 1 && scalar(@{$rows->[0]}) == 1);
    my $partial = $rows->[0]->[0];
    if ($partial == 0)
    {
        # The row no longer matches the index predicate; immediately update
        # it back so it matches the WHERE clause again.
        $dbh->do(
            q{UPDATE t1 SET partial = TRUE WHERE id = $1},
            undef,
            $partial_id,
        );
    }
}
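(A note on running it: the empty "dbi:Pg:" connection string means DBD::Pg/libpq fall back to the usual defaults and PGHOST/PGDATABASE/PGUSER environment variables. The script drops and recreates a table named t1 and disables autovacuum on it, so point it at a scratch database.)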