On Wed, Sep 21, 2022 at 1:14 PM Nathan Bossart <nathandboss...@gmail.com> wrote:
> This idea seems promising. I see that you called this patch a
> work-in-progress, so I'm curious what else you are planning to do with it.
I really just meant that the patch wasn't completely finished at that point. I hadn't yet convinced myself that I mostly had it right. I'm more confident now.

> As I'm reading this thread and the patch, I'm finding myself wondering if
> it's worth exploring using wal_compression for these records instead.

The term deduplication works better than compression here because we're not actually decompressing anything in the REDO routine. Rather, the REDO routine processes each freeze plan by processing all affected tuples in order. To me this seems like the natural way to structure things -- the WAL records are much smaller, but in a way that's kind of incidental. The approach taken by the patch just seems like the natural one, given the specifics of how freezing works at a high level.

> I think you've essentially created an efficient compression mechanism for
> this one type of record, but I'm assuming that lz4/zstd would also yield
> some rather substantial improvements for this kind of data.

I don't think of it that way. I've used the term "deduplication" to advertise the patch, but that's mostly just a description of what we're doing in the patch relative to what we do on HEAD today. There is nothing truly clever in the patch. We see a huge amount of redundancy among tuples from the same page in practically all cases, for reasons that have everything to do with what freezing is, and how it works at a high level.

The thought process that led to my writing this patch was more high level than appearances suggest. (I often write patches that combine high level and low level insights in some way or other, actually.)

Theoretically there might not be very much redundancy within each xl_heap_freeze_page record, with the right workload, but in practice a decrease of 4x or more is all but guaranteed once you have more than a few tuples to freeze on each page.
If there are other WAL records that are as space inefficient as xl_heap_freeze_page is, then I'd be surprised -- it is *unusually* space inefficient (like I said, I suspect that this may have something to do with the fact that it was originally designed under time pressure). So I don't expect that this patch tells us much about what we should do for any other WAL record. I certainly *hope* that it doesn't, at least.

> Presumably a
> generic WAL record compression mechanism could be reused for other large
> records, too. That could be much easier than devising a deduplication
> strategy for every record type.

It's quite possible that that's a good idea, but that should probably work as an additive thing. That's something that I think of as a "clever technique", whereas I'm focussed on just not being naive in how we represent this one specific WAL record type.

-- 
Peter Geoghegan