Hi Melanie,
On 8/31/23 02:59, Melanie Plageman wrote:
I created a large table and then updated a tuple on every page in the
relation and vacuumed it. I saw a consistent slight improvement in
vacuum execution time. I profiled a bit with perf stat as well. The
difference is relatively small for this kind of example, but I can
work on a more compelling, realistic example. I think eliminating the
retry loop is also useful, as I have heard that users have had to
cancel vacuums which were in this retry loop for too long.
Just to provide a specific test case, if you create a small table like this
create table foo (a int, b int, c int) with(autovacuum_enabled=false);
insert into foo select i, i, i from generate_series(1, 10000000);
And then vacuum it. I find that with my patch applied I see a
consistent ~9% speedup (averaged across multiple runs).
master: ~533ms
patch: ~487ms
And in the profile, with my patch applied, you notice less time spent
in HeapTupleSatisfiesVacuumHorizon()
master:
11.83% postgres postgres [.] heap_page_prune
11.59% postgres postgres [.] heap_prepare_freeze_tuple
8.77% postgres postgres [.] lazy_scan_prune
8.01% postgres postgres [.] HeapTupleSatisfiesVacuumHorizon
7.79% postgres postgres [.] heap_tuple_should_freeze
patch:
13.41% postgres postgres [.] heap_prepare_freeze_tuple
9.88% postgres postgres [.] heap_page_prune
8.61% postgres postgres [.] lazy_scan_prune
7.00% postgres postgres [.] heap_tuple_should_freeze
6.43% postgres postgres [.] HeapTupleSatisfiesVacuumHorizon
Thanks a lot for providing additional information and the test case.
I tried it on a release build and I also see a 10% speed-up. I reset the
visibility map between VACUUM runs, see:
CREATE EXTENSION pg_visibility; CREATE TABLE foo (a INT, b INT, c INT)
WITH(autovacuum_enabled=FALSE); INSERT INTO foo SELECT i, i, i from
generate_series(1, 10000000) i; VACUUM foo; SELECT
pg_truncate_visibility_map('foo'); VACUUM foo; SELECT
pg_truncate_visibility_map('foo'); VACUUM foo; ...
The first patch, which refactors the code so we can pass the result of
the visibility checks to the caller, looks good to me.
Regarding the 2nd patch (disclaimer: I'm not too familiar with that area
of the code): I don't completely understand why the retry loop is not
needed anymore and how you now detect/handle the possible race
condition? It can still happen that an aborting transaction changes the
state of a row after heap_page_prune() looked at that row. Would that
case now not be ignored?
--
David Geier
(ServiceNow)