Robert Haas <robertmh...@gmail.com> writes:
> I have committed this version.

This failure says that the test case is not entirely stable:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2020-09-12%2005%3A13%3A12

diff -U3 /home/nm/farm/gcc64/HEAD/pgsql.build/contrib/pg_surgery/expected/heap_surgery.out /home/nm/farm/gcc64/HEAD/pgsql.build/contrib/pg_surgery/results/heap_surgery.out
--- /home/nm/farm/gcc64/HEAD/pgsql.build/contrib/pg_surgery/expected/heap_surgery.out	2020-09-11 06:31:36.000000000 +0000
+++ /home/nm/farm/gcc64/HEAD/pgsql.build/contrib/pg_surgery/results/heap_surgery.out	2020-09-12 11:40:26.000000000 +0000
@@ -116,7 +116,6 @@
 vacuum freeze htab2;
 -- unused TIDs should be skipped
 select heap_force_kill('htab2'::regclass, ARRAY['(0, 2)']::tid[]);
-NOTICE: skipping tid (0, 2) for relation "htab2" because it is marked unused
  heap_force_kill
 -----------------

sungazer's first run after pg_surgery went in was successful, so it's
not a hard failure.  I'm guessing that it's timing dependent.

The most obvious theory for the cause is that what VACUUM does with
a tuple depends on whether the tuple's xmin is below global xmin,
and a concurrent autovacuum could very easily be holding back global
xmin.  While I can't easily get autovac to run at just the right
time, I did verify that a concurrent regular session holding back
global xmin produces the symptom seen above.  (To replicate, insert
"select pg_sleep(...)" in heap_surgery.sql before "-- now create an
unused line pointer"; run make installcheck; and use the delay to
connect to the database manually, start a serializable transaction,
and do any query to acquire a snapshot.)

I suggest that the easiest way to make this test reliable is to
make the test tables be temp tables (which allows dropping the
autovacuum_enabled = off property, too).  In the wake of commit
a7212be8b, that should guarantee that vacuum has stable tuple-level
behavior regardless of what is happening concurrently.

			regards, tom lane