Hi,

On 2023-09-26 09:07:13 -0700, Peter Geoghegan wrote:
> On Tue, Sep 26, 2023 at 8:19 AM Andres Freund <and...@anarazel.de> wrote:
> > However, I'm not at all convinced doing this on a system wide level is
> > a good idea. Databases do often contain multiple types of workloads at
> > the same time. E.g., we want to freeze aggressively in a database that
> > has the bulk of its size in archival partitions but has lots of
> > unfrozen data in an active partition. And databases have often loads
> > of data that's going to change frequently / isn't long lived, and we
> > don't want to super aggressively freeze that, just because it's a
> > large portion of the data.
>
> I didn't say that we should always have most of the data in the
> database frozen, though. Just that we can reasonably be more lazy
> about freezing the remainder of pages if we observe that most pages
> are already frozen. How they got that way is another discussion.
>
> I also think that the absolute amount of debt (measured in physical
> units such as unfrozen pages) should be kept under control. But that
> isn't something that can ever be expected to work on the basis of a
> simple threshold -- if only because autovacuum scheduling just doesn't
> work that way, and can't really be adapted to work that way.
I don't think doing this on a system wide basis with a metric like
#unfrozen pages is a good idea. It's quite common to have short-lived
data in some tables while also having long-lived data in other tables.
Making opportunistic freezing more aggressive in that situation will
just hurt, without a benefit (potentially even slowing down the
freezing of older data!). And even within a single table, making
freezing more aggressive because there's a decent-sized part of the
table that is updated regularly, and thus not frozen, doesn't make
sense.

If we want to take global freeze debt into account, which I think is a
good idea, we'll need a smarter way to represent the debt than just
the number of unfrozen pages. I think we would need to track the age
of unfrozen pages in some way. If there are a lot of unfrozen pages
with a recent xid, that's fine, but if they are older and getting
older, it's a problem and we need to be more aggressive.

The problem I see is how to track the age of unfrozen data - it'd be
easy enough to track the mean(oldest-64bit-xid-on-page), but then we
again have the issue of rare outliers moving the mean too much...

Greetings,

Andres Freund
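PS: To make the outlier issue concrete, here's a toy sketch (Python, with
made-up numbers, not actual PostgreSQL code) of how a single page holding
a very old xid drags the mean page age far away from the bulk of the
table, while a rank-based statistic such as a percentile stays put:

```python
from statistics import mean

# Hypothetical per-page "xid ages": transactions elapsed since the
# oldest unfrozen xid on each page. 1000 pages with recently-modified
# tuples, plus a single stray page holding a very old xid.
ages = [1_000] * 1000 + [500_000_000]

def p95(values):
    """95th percentile via simple nearest-rank selection."""
    s = sorted(values)
    return s[min(len(s) - 1, int(0.95 * len(s)))]

avg = mean(ages)      # the one outlier page drags this to ~500,000
robust = p95(ages)    # still 1,000, i.e. reflects the bulk of the pages

print(f"mean age: {avg:.0f}, p95 age: {robust}")
```

A percentile is only one option, of course - a small histogram of page
ages would be similarly outlier-resistant and could additionally show
whether the old pages are actually getting older over time.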