Hi Michael, Thanks for the suggestions. I agree that we should first solve the “last–free-bit” problem in varattrib_4b compression bits before layering on any features. Below is the approach I’ve prototyped to keep the header compact yet fully extensible, followed by a sketch of the plain-ZSTD(no dict) patch that sits cleanly on top of it.
1. Minimal but extensible header /* varatt_cmp_extended follows va_tcinfo when the upper two bits of * va_tcinfo are 11. Compressed data starts immediately after * ext_data. ext_hdr encodes both the compression algorithm and the * byte-length of the algorithm-specific metadata. */ typedef struct varatt_cmp_extended { uint32 ext_hdr; /* [ meta_size:24 | cmpr_id:8 ] */ char ext_data[FLEXIBLE_ARRAY_MEMBER]; /* optional metadata */ } varatt_cmp_extended; a. 24 bits for length → per-datum compression algorithm metadata is capped at 16 MB, which is far more than any realistic compression header. b. 8 bits for algorithm id → up to 256 algorithms. c. Zero-overhead when unused if an algorithm needs no per-datum metadata (e.g., ZSTD-nodict), 2. Algorithm registry /* * TOAST compression methods enumeration. * * Each entry defines: * - NAME : identifier for the compression algorithm * - VALUE : numeric enum value * - METADATA type: struct type holding extra info (void when none) * * The INVALID entry is a sentinel and must remain last. */ #define TOAST_COMPRESSION_LIST \ X(PGLZ, 0, void) /* existing */ \ X(LZ4, 1, void) /* existing */ \ X(ZSTD_NODICT, 2, void) /* new, no metadata */ \ X(ZSTD_DICT, 3, zstd_dict_meta) /* new, needs dict_id */ \ X(INVALID, 4, void) /* sentinel */ typedef enum ToastCompressionId { #define X(name,val,meta) TOAST_##name##_COMPRESSION_ID = val, TOAST_COMPRESSION_LIST #undef X } ToastCompressionId; /* Example of an algorithm-specific metadata block */ typedef struct { uint32 dict_id; /* dictionary Oid */ } zstd_dict_meta; 3. Resulting on-disk layouts for zstd ZSTD no dict: datum ondisk layout: +----------------------------------+ | va_header (uint32) | +----------------------------------+ | va_tcinfo (uint32) | (11 in top two bits specify extended) +----------------------------------+ | ext_hdr (uint32) | <-- [ meta size:24 bits | compression id:8 bits ] +----------------------------------+ | Compressed bytes … | <-- zstd (no dictionary) +----------------------------------+ ZSTD dict: datum ondisk layout +----------------------------------+ | va_header (uint32) | +----------------------------------+ | va_tcinfo (uint32) | +----------------------------------+ | ext_hdr (uint32) | <-- [ meta size:24 bits | compression id:8 bits ] +----------------------------------+ | dict_id (uint32) | <-- zstd_dict_meta +----------------------------------+ | Compressed bytes … | <-- zstd (dictionary) +----------------------------------+ 4. How does this fit? Flexibility: Each new algorithm that needs extra metadata simply defines its own struct and allocates varatt_cmp_extended in setup_compression_info. Storage: Everything in varatt_cmp_extended is copied to the datum, immediately followed by the compressed payload. Optional, pay-as-you-go metadata – only algorithms that need it pay for it. Future-proof – new compression algorithms, requires any kind of metadata like dictid or any other slot into the same ext_data mechanism. I’ve split the work into two patches for review: v19-0001-varattrib_4b-design-proposal-to-make-it-extended.patch: varattrib_4b extensibility – adds varatt_cmp_extended, enum plumbing, and macros; behaviour unchanged. v19-0002-zstd-nodict-support.patch: Plain ZSTD (non dict) support. Please share your thoughts—and I’d love to hear feedback on the design. Thanks! On Mon, Apr 21, 2025 at 12:02 AM Michael Paquier <mich...@paquier.xyz> wrote: > > On Fri, Apr 18, 2025 at 12:22:18PM -0400, Robert Haas wrote: > > I think we could add plain-old zstd compression without really > > tackling this issue, but if we are going to add dictionaries then I > > think we might need to revisit the idea of preventing things from > > leaking out of tables. What I can't quite remember at the moment is > > how much of the problem was that it was going to be slow to force the > > recompression, and how much of it was that we weren't sure we could > > even find all the places in the code that might need such handling. > > FWIW, this point resonates here. There is one thing that we have to > do anyway: we just have one bit left in the varlena headers as lz4 is > using the one before last. So we have to make it extensible, even if > it means that any compression method other than LZ4 and pglz would > consume one more byte in its header by default. And I think that this > has to happen at some point if we want flexibility in this area. > > + struct > + { > + uint32 va_header; > + uint32 va_tcinfo; > + uint32 va_cmp_alg; > + uint32 va_cmp_dictid; > + char va_data[FLEXIBLE_ARRAY_MEMBER]; > + } va_compressed_ext; > > Speaking of which, I am confused by this abstraction choice in > varatt.h in the first patch. Are we sure that we are always going to > have a dictionary attached to a compressed data set or even a > va_cmp_alg? It seems to me that this could lead to a waste of data in > some cases because these fields may not be required depending on the > compression method used, as some fields may not care about these > details. This kind of data should be made optional, on a per-field > basis. > > One thing that I've been wondering is how it would be possible to make > the area around varattrib_4b more readable while dealing with more > extensibility. It would be a good occasion to improve that, even if > I'm hand-waving here currently and that the majority of this code is > old enough to vote, with few modifications across the years. > > The second thing that I'd love to see on top of the addition of the > extensibility is adding plain compression support for zstd, with > nothing fancy, just the compression and decompression bits. I've done > quite a few benchmarks with the two, and results kind of point in the > direction that zstd is more efficient than lz4 overall. Don't take me > wrong: lz4 can be better in some workloads as it can consume less CPU > than zstd while compressing less. However, a comparison of ratios > like (compression rate / cpu used) has always led me to see zstd as > superior in a large number of cases. lz4 is still very good if you > are CPU-bound and don't care about the extra space required. Both are > three classes better than pglz. > > Once we have these three points incrementally built-in together (the > last bit extensibility, the potential varatt.h refactoring and the > zstd support), there may be a point in having support for more > advanced options with the compression methods in the shape of dicts or > more requirements linked to other compression methods, but I think the > topic is complex enough that we should make sure that these basics are > implemented in a way sane enough so as we'd be able to extend them > with all the use cases in mind. > -- > Michael -- Nikhil Veldanda -- Nikhil Veldanda
v19-0002-zstd-nodict-support.patch
Description: Binary data
v19-0001-varattrib_4b-design-proposal-to-make-it-extended.patch
Description: Binary data