My apologies for getting distracted from this work.  It might be a v20 item
at this point.  I haven't addressed any feedback since the v8 patch, but I
did some testing.

On Mon, Nov 24, 2025 at 02:59:33PM -0500, Robert Haas wrote:
> On Sun, Nov 23, 2025 at 4:55 AM David Rowley wrote:
>> One thing that seems to be getting forgotten again is the "/* Stop
>> applying cost limits from this point on */" added in 1e55e7d17 is only
>> going to be applied when the table *currently* being vacuumed is over
>> the failsafe limit. Without Nathan's patch, the worker might end up
>> idling along carefully obeying the cost limits on dozens of other
>> tables before it gets around to vacuuming the table that's over the
>> failsafe limit, then suddenly drop the cost delay code and rush to get
>> the table frozen, before Postgres stops accepting transactions. With
>> the patch, Nathan has added some aggressive score scaling, which
>> should mean any table over the failsafe limit has the highest score
>> and gets attended to first.
> 
> Right, so can we use that to construct a specific, concrete scenario
> where we can see that the patch ends up delivering better behavior
> than we have today? I think it would be really good to have at least
> one fully worked-out case where we can say "look, if you run this
> series of commands without the patch, X happens, and with the patch, Y
> happens, and look! Y is better."

I used the xid_wraparound module to artificially induce a situation that
looked like this:

     table_name |    age
    ------------+------------
     t1         | 1950000020
     t2         | 1560000016
     t3         | 1170000013
     t4         |  780000010
     t5         |  390000007
    (5 rows)

Each table had 1M updates, and all other tables on the cluster were frozen.
I created the tables in reverse order (t5 first, t1 last) so that t1 is
listed later in pg_class than t5.

Without the patch, autovacuum goes straight for t5, and then processes t4,
t3, etc.:

     table_name |    age
    ------------+------------
     t1         | 1950000021
     t2         | 1560000017
     t3         | 1170000014
     t4         |  780000011
     t5         |          1
    (5 rows)

With the patch, it processes t1 first:

     table_name |    age
    ------------+------------
     t2         | 1560000017
     t3         | 1170000014
     t4         |  780000011
     t5         |  390000008
     t1         |          1
    (5 rows)

I'll admit this is a pretty extreme/contrived example, but it at least
shows the intended behavior.  As alluded to elsewhere, this prioritization
work might be more useful once we are automatically adjusting the cost
limits in more cases.
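
To make the two orderings easy to compare without a cluster, here is a
rough Python sketch of the scenario above.  It is only a model and not the
patch's code: sorting by age is an assumed stand-in for the patch's scoring,
the function names are made up for illustration, and 1.6 billion is the
default vacuum_failsafe_age.

```python
# Ages copied from the first result table above (before autovacuum ran).
AGES = {
    "t1": 1_950_000_020,
    "t2": 1_560_000_016,
    "t3": 1_170_000_013,
    "t4": 780_000_010,
    "t5": 390_000_007,
}

# The tables were created in reverse, so t5 appears first in pg_class.
PG_CLASS_ORDER = ["t5", "t4", "t3", "t2", "t1"]

# Default value of vacuum_failsafe_age.
FAILSAFE_AGE = 1_600_000_000


def unpatched_order(pg_class_order):
    """Model of today's behavior: process tables in pg_class order."""
    return list(pg_class_order)


def patched_order(ages):
    """Assumed stand-in for the patch's scoring: oldest table first."""
    return sorted(ages, key=lambda name: ages[name], reverse=True)


def over_failsafe(ages, limit=FAILSAFE_AGE):
    """Tables whose age already exceeds the failsafe limit."""
    return [name for name, age in ages.items() if age > limit]


if __name__ == "__main__":
    print("unpatched:", unpatched_order(PG_CLASS_ORDER))
    print("patched:  ", patched_order(AGES))
    print("over failsafe:", over_failsafe(AGES))
```

In this model only t1 is past the failsafe limit, and it is the last table
reached in pg_class order but the first one reached when sorting by age.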

-- 
nathan

