On 2/19/18 10:00 AM, Tomas Vondra wrote:
So I don't think this is a very promising approach, unfortunately.

What I think might work is having a separate pool of autovac workers,
dedicated to these high-priority tables. That still would not guarantee
the high-priority tables are vacuumed immediately, but at least that
they are not stuck in the worker queue behind low-priority ones.

I wonder if we could detect tables with high update/delete activity and
promote them to high-priority automatically. The reasoning is that
delaying the cleanup for those tables would result in significantly more
bloat than for those with low update/delete activity.
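
To make the detection idea a bit more concrete, here is a rough sketch in Python (not autovacuum code). It assumes two snapshots of the cumulative n_tup_upd / n_tup_del counters, as exposed in pg_stat_user_tables, taken interval_s seconds apart. The TableStats type, the function names, and the threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    name: str
    n_tup_upd: int  # cumulative updates, as in pg_stat_user_tables
    n_tup_del: int  # cumulative deletes, as in pg_stat_user_tables


def churn_rates(prev, curr, interval_s):
    """Return (updates + deletes) per second for each table in curr."""
    before = {t.name: t.n_tup_upd + t.n_tup_del for t in prev}
    rates = {}
    for t in curr:
        delta = (t.n_tup_upd + t.n_tup_del) - before.get(t.name, 0)
        # Counters can go backwards after a stats reset; clamp to zero.
        rates[t.name] = max(delta, 0) / interval_s
    return rates


def high_priority(rates, threshold):
    """Names of tables whose churn rate meets a (hypothetical) threshold."""
    return sorted(name for name, rate in rates.items() if rate >= threshold)
```

Picking the threshold is the hard part, of course; a fixed rate is only the simplest thing that could work.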

I've looked at this stuff in the past, and I think the first step in improving autovacuum needs to be a much more granular means of controlling which tables workers select, and exposing that ability. There are simply too many different scenarios to account for with a single policy that will satisfy everyone. Just as a simple example, OLTP databases (especially with small queue tables) have very different vacuum needs than data warehouses.

One fairly simple option would be to replace the logic that currently builds a worker's table list with a query run via SPI. That would allow for prioritizing important tables. It could also reduce the problem of workers getting "stuck" on a ton of large tables by taking into consideration the total number of pages/tuples a list contains.
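
As a rough illustration of the kind of policy such a query could express, here is a sketch in Python rather than SQL/SPI. The Candidate type, the priority field, and page_budget are hypothetical; pages stands in for something like relpages from pg_class.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    priority: int  # hypothetical: higher means more important
    pages: int     # table size in pages, e.g. relpages from pg_class


def build_worker_list(candidates, page_budget):
    """Pick tables by priority while capping the list's total page count."""
    chosen, total = [], 0
    # Highest priority first; among equals, smaller tables first.
    for c in sorted(candidates, key=lambda c: (-c.priority, c.pages)):
        # Always take at least one table, even if it exceeds the budget,
        # so an oversized table cannot be starved forever.
        if chosen and total + c.pages > page_budget:
            continue
        chosen.append(c.name)
        total += c.pages
    return chosen
```

With a budget like this, one worker can't end up holding every large table while the small, hot ones wait behind them.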

A more fine-grained approach would be to have workers make a new selection after every vacuum they complete. That would provide the ultimate in control, since you'd be able to see exactly what all the other workers are doing.
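
A sketch of what per-vacuum selection might look like, again in Python and under stated assumptions: candidates is a list of (name, priority, pages) tuples, in_progress maps the tables other workers are currently vacuuming to their sizes, and big_pages / max_big are made-up knobs for limiting how many workers are tied up on large tables at once.

```python
def pick_next(candidates, in_progress, big_pages=100_000, max_big=1):
    """Choose the next table for a worker, given what other workers are doing.

    Returns the chosen table name, or None if nothing is eligible.
    """
    # How many large tables are other workers already busy with?
    busy_big = sum(1 for pages in in_progress.values() if pages >= big_pages)
    best = None
    for name, priority, pages in candidates:
        if name in in_progress:
            continue  # another worker already has this table
        if pages >= big_pages and busy_big >= max_big:
            continue  # keep this worker free for smaller tables
        if best is None or priority > best[1]:
            best = (name, priority, pages)
    return best[0] if best else None
```

The worker would call this after each vacuum completes, instead of walking a list built up front.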
--
Jim Nasby, Chief Data Architect, Austin TX
OpenSCG                 http://OpenSCG.com
