My apologies for getting distracted from this work. It might be a v20 item at this point. I haven't addressed any feedback since the v8 patch, but I did some testing.

On Mon, Nov 24, 2025 at 02:59:33PM -0500, Robert Haas wrote:
> On Sun, Nov 23, 2025 at 4:55 AM David Rowley <[email protected]> wrote:
>> One thing that seems to be getting forgotten again is the "/* Stop
>> applying cost limits from this point on */" added in 1e55e7d17 is only
>> going to be applied when the table *currently* being vacuumed is over
>> the failsafe limit. Without Nathan's patch, the worker might end up
>> idling along carefully obeying the cost limits on dozens of other
>> tables before it gets around to vacuuming the table that's over the
>> failsafe limit, then suddenly drop the cost delay code and rush to get
>> the table frozen, before Postgres stops accepting transactions. With
>> the patch, Nathan has added some aggressive score scaling, which
>> should mean any table over the failsafe limit has the highest score
>> and gets attended to first.
>
> Right, so can we use that to construct a specific, concrete scenario
> where we can see that the patch ends up delivering better behavior
> than we have today? I think it would be really good to have at least
> one fully worked-out case where we can say "look, if you run this
> series of commands without the patch, X happens, and with the patch, Y
> happens, and look! Y is better."

I used the xid_wraparound module to artificially induce a situation that
looked like this:

 table_name |    age
------------+------------
 t1         | 1950000020
 t2         | 1560000016
 t3         | 1170000013
 t4         |  780000010
 t5         |  390000007
(5 rows)

Each table had 1M updates, and all other tables on the cluster were frozen.
I created the tables in reverse so that t1 is listed later in pg_class than
t5.
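
To make the setup concrete, here is a small Python sketch of the two
processing orders being compared. It is only an illustration under stated
assumptions: the ages come from the listing above, 1.6 billion is the default
vacuum_failsafe_age, visiting tables in pg_class order is assumed for the
unpatched case, and sorting by plain age is a simplified stand-in for the
patch's scoring rather than its actual formula. The function names are
hypothetical.

```python
# Default value of the vacuum_failsafe_age setting.
VACUUM_FAILSAFE_AGE = 1_600_000_000

# (table, age) pairs in creation order: t5 was created first, so it is
# listed before t1 in pg_class. Ages are taken from the listing above.
TABLES_IN_CATALOG_ORDER = [
    ("t5", 390_000_007),
    ("t4", 780_000_010),
    ("t3", 1_170_000_013),
    ("t2", 1_560_000_016),
    ("t1", 1_950_000_020),
]


def catalog_order(tables):
    """Assumed unpatched behavior: visit tables in the order they are listed."""
    return [name for name, _ in tables]


def oldest_first(tables):
    """Simplified stand-in for the patch: visit the oldest table first."""
    ranked = sorted(tables, key=lambda entry: entry[1], reverse=True)
    return [name for name, _ in ranked]


def over_failsafe(tables, limit=VACUUM_FAILSAFE_AGE):
    """Tables whose age already exceeds the failsafe limit."""
    return [name for name, age in tables if age > limit]


if __name__ == "__main__":
    print("over failsafe:", over_failsafe(TABLES_IN_CATALOG_ORDER))
    print("catalog order:", catalog_order(TABLES_IN_CATALOG_ORDER))
    print("oldest first: ", oldest_first(TABLES_IN_CATALOG_ORDER))
```

In this setup only t1 is past the default failsafe age, and it is the last
table reached in catalog order but the first one reached when sorting by age.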

Without the patch, autovacuum goes straight for t5, and then processes t4,
t3, etc.:

 table_name |    age
------------+------------
 t1         | 1950000021
 t2         | 1560000017
 t3         | 1170000014
 t4         |  780000011
 t5         |          1
(5 rows)

With the patch, it processes t1 first:

 table_name |    age
------------+------------
 t2         | 1560000017
 t3         | 1170000014
 t4         |  780000011
 t5         |  390000008
 t1         |          1
(5 rows)

I'll admit this is a pretty extreme/contrived example, but it at least shows
the intended behavior. As alluded to elsewhere, this prioritization work
might be more useful once we are automatically adjusting the cost limits in
more cases.

--
nathan
