Hi,

On 2023-01-26 23:11:41 -0800, Peter Geoghegan wrote:
> > Essentially the "any fpi" logic is a very coarse grained way of using
> > the page LSN as a measurement. As I said, I don't think "has a
> > checkpoint occurred since the last write" is a good metric to avoid
> > unnecessary freezing - it's too coarse. But I think using the LSN is
> > the right thought. What about something like
> >
> >   lsn_threshold = insert_lsn - (insert_lsn - lsn_of_last_vacuum) * 0.1
> >   if (/* other conds */ && PageGetLSN(page) <= lsn_threshold)
> >       FreezeMe();
> >
> > I probably got some details wrong, what I am going for with
> > lsn_threshold is that we'd freeze an already dirty page if it's not
> > been updated within 10% of the LSN distance to the last VACUUM.
>
> It seems to me that you're reinventing something akin to eager
> freezing strategy here. At least that's how I define it, since now
> you're bringing the high level context into it; what happens with the
> table, with VACUUM operations, and so on. Obviously this requires
> tracking the metadata that you suppose will be available in some way
> or other, in particular things like lsn_of_last_vacuum.
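To make the quoted heuristic concrete, here's a minimal standalone sketch of
what that gating condition might look like - this is not actual PostgreSQL
code; XLogRecPtr is typedef'd locally, and lsn_of_last_vacuum is a
hypothetical piece of per-table metadata that would have to be tracked
somewhere (e.g. in pgstats):

```c
/*
 * Standalone sketch of the quoted lsn_threshold heuristic. Not real
 * PostgreSQL code: XLogRecPtr stands in for the server's typedef, and
 * lsn_of_last_vacuum is hypothetical metadata we'd need to track.
 */
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

/*
 * Freeze an already-dirty page only if its LSN predates the last 10% of
 * the LSN distance between the previous VACUUM and the current insert
 * position, i.e. the page has not been modified "recently" in LSN terms.
 */
static bool
page_is_cold_enough_to_freeze(XLogRecPtr page_lsn,
                              XLogRecPtr insert_lsn,
                              XLogRecPtr lsn_of_last_vacuum)
{
    XLogRecPtr  lsn_threshold;

    /* integer arithmetic stand-in for the quoted "* 0.1" */
    lsn_threshold = insert_lsn - (insert_lsn - lsn_of_last_vacuum) / 10;

    return page_lsn <= lsn_threshold;
}
```

With insert_lsn = 1000 and lsn_of_last_vacuum = 0 the threshold is 900, so a
page last touched at LSN 850 would qualify for freezing while one touched at
950 would not.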
I agree with bringing high-level context into the decision about whether to
freeze aggressively - my problem with the eager freezing strategy patch isn't
that it did that too much, it's that it didn't do it enough.

But I also don't think what I describe above is really comparable to "table
level" eager freezing - the potential worst case overhead is a small fraction
of the WAL volume, and there's zero increase in data write volume. I suspect
the absolute worst case of "always freeze dirty pages" is when a single tuple
on the page gets updated immediately after every time we freeze the page - a
single tuple is where the freeze record is the least space efficient. The
smallest update is about the same size as the smallest freeze record. For
that to amount to a large WAL increase you'd need a crazy rate of such
updates interspersed with vacuums. In slightly more realistic cases (i.e. not
column-less tuples that constantly get updated, with freezing happening all
the time) you end up with a reasonably small WAL rate overhead.

That worst case of "freeze dirty" is bad enough to spend some brain and
compute cycles to prevent. But if we don't always get it right in some
workload, it's not *awful*.

The worst case of the "eager freeze strategy" is a lot larger - it's probably
something like updating one narrow tuple on every page, once per checkpoint,
so that each freeze generates an FPI. I think that results in a max overhead
of 2x for data writes, and about 150x for WAL volume (the ratio of an FPI to
one update record). Obviously that's a pointless workload, but I do think
that analyzing the "outer boundaries" of the regression something can cause
can be helpful.

I think one way forward with the eager strategy approach would be to have a
very narrow gating condition for now, and then incrementally expand it in
later releases.
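The ~150x figure can be sanity-checked with rough record sizes - both numbers
below are assumptions for illustration (an FPI of roughly one 8kB block, a
minimal heap update record on the order of a few dozen bytes), not measured
values:

```c
/*
 * Back-of-the-envelope check of the ~150x WAL volume worst case: each
 * freeze emits a full-page image instead of a small update-sized record.
 * The sizes are rough assumptions, not measurements.
 */
static int
wal_amplification_ratio(int fpi_bytes, int update_record_bytes)
{
    return fpi_bytes / update_record_bytes;
}
```

For example, wal_amplification_ratio(8192, 55) gives roughly 148, in the same
ballpark as the 150x estimate above.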
One use-case where the eager strategy is particularly useful is
[nearly-]append-only tables - and it's also the one workload that's
reasonably easy to detect using stats. Maybe something like
  (dead_tuples_since_last_vacuum / inserts_since_last_vacuum) < 0.05
or so.

That'll definitely leave out loads of workloads where eager freezing would be
useful - but are there semi-reasonable workloads where it'll hurt badly? I
don't *think* so.


> What about unlogged/temporary tables? The obvious thing to do there is
> what I did in the patch that was reverted (freeze whenever the page
> will thereby become all-frozen), and forget about LSNs. But you have
> already objected to that part, specifically.

My main concern about that is the data write amplification it could cause
when the page is clean when we start freezing. But I can't see a large
potential downside to always freezing unlogged/temp tables when the page is
already dirty.


> BTW, you still haven't changed the fact that you get rather different
> behavior with checksums/wal_log_hints. I think that that's good, but
> you didn't seem to.

I think that, if we had something like the recency test I was talking about,
we could afford to always freeze when the page is already dirty and not very
recently modified. I.e. not even insist on a WAL record having been generated
during pruning/HTSV. But I need to think through the dangers of that more.

Greetings,

Andres Freund