On Sat, Mar 05, 2022 at 07:26:39PM +0900, Michael Paquier wrote:
> Repeatability and randomness of data counts, we could have for example
> one case with a set of 5~7 int attributes, a second with text values
> that include random data, up to 10~12 bytes each to count on the tuple
> header to be able to compress some data, and a third with more
> repeatable data, like one attribute with an int column populate
> with generate_series().  Just to give an idea.

And that's what I did as of the attached set of tests:
- Cluster on tmpfs.
- max_wal_size, min_wal_size at 2GB and shared_buffers at 1GB, aka
large enough to include the full data set in memory.
- Rather than using Justin's full patch set, I have just patched the
code in xloginsert.c to switch the level.
- One case with a table that uses one int attribute, with rather
repetitive data worth 484MB.
- One case with a table using (int, text), where the text data is made
of 10~11 bytes of random data, worth ~340MB.
- Use pg_prewarm to load the data into shared buffers.  With the
cluster mounted on a tmpfs that should not matter though.
- Both tables have a fillfactor at 50 to give room to the updates.

I have measured the CPU usage with a toy extension, also attached,
called pg_rusage(), which is a simple wrapper around upstream's
pg_rusage.c.  It provides two SQL functions, one to initialize a
rusage snapshot and one to print its data, which are called just
before and after the FPIs are generated (aka the UPDATE query that
rewrites the whole table in the script attached).

The quickly-hacked test script and the results are in test.tar.gz, for
reference.  The toy extension is pg_rusage.tar.gz.

Here are the results I compiled, as of results_format.sql in the
tarball attached:
             descr             | rel_size | fpi_size | time_s
-------------------------------+----------+----------+--------
 int column no compression     | 429 MB   | 727 MB   |  13.15
 int column zstd default level | 429 MB   | 523 MB   |  14.23
 int column zstd level 1       | 429 MB   | 524 MB   |  13.94
 int column zstd level 10      | 429 MB   | 523 MB   |  23.46
 int column zstd level 19      | 429 MB   | 523 MB   | 103.71
 int column lz4 default level  | 429 MB   | 575 MB   |  13.37
 int/text no compression       | 344 MB   | 558 MB   |  10.08
 int/text lz4 default level    | 344 MB   | 463 MB   |  10.29
 int/text zstd default level   | 344 MB   | 415 MB   |  11.48
 int/text zstd level 1         | 344 MB   | 418 MB   |  11.25
 int/text zstd level 10        | 344 MB   | 415 MB   |  20.59
 int/text zstd level 19        | 344 MB   | 413 MB   |  62.64
(12 rows)

I did not expect zstd to be this slow at a level of 10~ actually.  The
runtime (elapsed CPU time) got severely impacted at level 19, which I
ran just for fun to see how it would compare to a level of 10.  There
is a slight difference between the default level and a level of 1, and
the compression size does not change much, nor does the CPU usage
really change.

While on it, attached is an updated patch that I have tweaked before
running my own tests.

In the end, I'd still like to think that we'd better stick with the
default level for this parameter, which is also the suggestion of
upstream.  So I would like to move on with that for this patch.
--
Michael
test.tar.gz
Description: application/gzip
pg_rusage.tar.gz
Description: application/gzip
From 254ddbf4223c35a7990e301e53d6ddbffcf210c0 Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryz...@telsasoft.com>
Date: Fri, 18 Feb 2022 22:54:18 -0600
Subject: [PATCH v2] add wal_compression=zstd

---
 src/include/access/xlog.h                     |  3 +-
 src/include/access/xlogrecord.h               |  5 +++-
 src/backend/access/transam/xloginsert.c       | 30 ++++++++++++++++++-
 src/backend/access/transam/xlogreader.c       | 20 +++++++++++++
 src/backend/utils/misc/guc.c                  |  3 ++
 src/backend/utils/misc/postgresql.conf.sample |  2 +-
 src/bin/pg_waldump/pg_waldump.c               |  2 ++
 doc/src/sgml/config.sgml                      | 11 ++++---
 doc/src/sgml/installation.sgml                |  8 +++++
 9 files changed, 76 insertions(+), 8 deletions(-)

diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 4b45ac64db..09f6464331 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -75,7 +75,8 @@ typedef enum WalCompression
 {
 	WAL_COMPRESSION_NONE = 0,
 	WAL_COMPRESSION_PGLZ,
-	WAL_COMPRESSION_LZ4
+	WAL_COMPRESSION_LZ4,
+	WAL_COMPRESSION_ZSTD
 } WalCompression;
 
 /* Recovery states */
diff --git a/src/include/access/xlogrecord.h b/src/include/access/xlogrecord.h
index c1b1137aa7..052ac6817a 100644
--- a/src/include/access/xlogrecord.h
+++ b/src/include/access/xlogrecord.h
@@ -149,8 +149,11 @@ typedef struct XLogRecordBlockImageHeader
 /* compression methods supported */
 #define BKPIMAGE_COMPRESS_PGLZ	0x04
 #define BKPIMAGE_COMPRESS_LZ4	0x08
+#define BKPIMAGE_COMPRESS_ZSTD	0x10
+
 #define	BKPIMAGE_COMPRESSED(info) \
-	((info & (BKPIMAGE_COMPRESS_PGLZ | BKPIMAGE_COMPRESS_LZ4)) != 0)
+	((info & (BKPIMAGE_COMPRESS_PGLZ | BKPIMAGE_COMPRESS_LZ4 | \
+			  BKPIMAGE_COMPRESS_ZSTD)) != 0)
 
 /*
  * Extra header information used when page image has "hole" and
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index c260310c4c..b61c08e586 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -44,9 +44,17 @@
 #define LZ4_MAX_BLCKSZ		0
 #endif
 
+#ifdef USE_ZSTD
+#include <zstd.h>
+#define ZSTD_MAX_BLCKSZ		ZSTD_COMPRESSBOUND(BLCKSZ)
+#else
+#define ZSTD_MAX_BLCKSZ		0
+#endif
+
+/* Buffer size required to store a compressed version of backup block image */
 #define PGLZ_MAX_BLCKSZ		PGLZ_MAX_OUTPUT(BLCKSZ)
 
-#define COMPRESS_BUFSIZE	Max(PGLZ_MAX_BLCKSZ, LZ4_MAX_BLCKSZ)
+#define COMPRESS_BUFSIZE	Max(Max(PGLZ_MAX_BLCKSZ, LZ4_MAX_BLCKSZ), ZSTD_MAX_BLCKSZ)
 
 /*
  * For each block reference registered with XLogRegisterBuffer, we fill in
@@ -695,6 +703,14 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
 #endif
 						break;
 
+					case WAL_COMPRESSION_ZSTD:
+#ifdef USE_ZSTD
+						bimg.bimg_info |= BKPIMAGE_COMPRESS_ZSTD;
+#else
+						elog(ERROR, "ZSTD is not supported by this build");
+#endif
+						break;
+
 					case WAL_COMPRESSION_NONE:
 						Assert(false);	/* cannot happen */
 						break;
@@ -903,6 +919,18 @@ XLogCompressBackupBlock(char *page, uint16 hole_offset, uint16 hole_length,
 #endif
 			break;
 
+		case WAL_COMPRESSION_ZSTD:
+#ifdef USE_ZSTD
+			/* Uses the default level, ZSTD_CLEVEL_DEFAULT */
+			len = ZSTD_compress(dest, COMPRESS_BUFSIZE, source, orig_len,
+								ZSTD_CLEVEL_DEFAULT);
+			if (ZSTD_isError(len))
+				len = -1; /* failure */
+#else
+			elog(ERROR, "ZSTD is not supported by this build");
+#endif
+			break;
+
 		case WAL_COMPRESSION_NONE:
 			Assert(false);	/* cannot happen */
 			break;
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 35029cf97d..d60e4cbaf6 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -21,6 +21,9 @@
 #ifdef USE_LZ4
 #include <lz4.h>
 #endif
+#ifdef USE_ZSTD
+#include <zstd.h>
+#endif
 
 #include "access/transam.h"
 #include "access/xlog_internal.h"
@@ -1618,6 +1621,23 @@ RestoreBlockImage(XLogReaderState *record, uint8 block_id, char *page)
 								  "LZ4",
 								  block_id);
 			return false;
+#endif
+		}
+		else if ((bkpb->bimg_info & BKPIMAGE_COMPRESS_ZSTD) != 0)
+		{
+#ifdef USE_ZSTD
+			size_t decomp_result = ZSTD_decompress(tmp.data,
+												   BLCKSZ-bkpb->hole_length,
+												   ptr, bkpb->bimg_len);
+
+			if (ZSTD_isError(decomp_result))
+				decomp_success = false;
+#else
+			report_invalid_record(record, "image at %X/%X compressed with %s not supported by build, block %d",
+								  LSN_FORMAT_ARGS(record->ReadRecPtr),
+								  "zstd",
+								  block_id);
+			return false;
 #endif
 		}
 		else
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 6d11f9c71b..e7f0a380e6 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -550,6 +550,9 @@ static const struct config_enum_entry wal_compression_options[] = {
 	{"pglz", WAL_COMPRESSION_PGLZ, false},
 #ifdef USE_LZ4
 	{"lz4", WAL_COMPRESSION_LZ4, false},
+#endif
+#ifdef USE_ZSTD
+	{"zstd", WAL_COMPRESSION_ZSTD, false},
 #endif
 	{"on", WAL_COMPRESSION_PGLZ, false},
 	{"off", WAL_COMPRESSION_NONE, false},
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 4a094bb38b..4cf5b26a36 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -220,7 +220,7 @@
 #wal_log_hints = off			# also do full page writes of non-critical updates
 					# (change requires restart)
 #wal_compression = off			# enables compression of full-page writes;
-					# off, pglz, lz4, or on
+					# off, pglz, lz4, zstd, or on
 #wal_init_zero = on			# zero-fill new WAL files
 #wal_recycle = on			# recycle WAL files
 #wal_buffers = -1			# min 32kB, -1 sets based on shared_buffers
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 2340dc247b..f128050b4e 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -562,6 +562,8 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
 						method = "pglz";
 					else if ((bimg_info & BKPIMAGE_COMPRESS_LZ4) != 0)
 						method = "lz4";
+					else if ((bimg_info & BKPIMAGE_COMPRESS_ZSTD) != 0)
+						method = "zstd";
 					else
 						method = "unknown";
 
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 7ed8c82a9d..5612e80453 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3154,10 +3154,13 @@ include_dir 'conf.d'
         server compresses full page images written to WAL when
         <xref linkend="guc-full-page-writes"/> is on or during a base backup.
         A compressed page image will be decompressed during WAL replay.
-        The supported methods are <literal>pglz</literal> and
-        <literal>lz4</literal> (if <productname>PostgreSQL</productname> was
-        compiled with <option>--with-lz4</option>). The default value is
-        <literal>off</literal>. Only superusers can change this setting.
+        The supported methods are <literal>pglz</literal>,
+        <literal>lz4</literal> (if <productname>PostgreSQL</productname>
+        was compiled with <option>--with-lz4</option>) and
+        <literal>zstd</literal> (if <productname>PostgreSQL</productname>
+        was compiled with <option>--with-zstd</option>).
+        The default value is <literal>off</literal>.
+        Only superusers can change this setting.
        </para>
 
        <para>
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 0f74252590..bd09003d59 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -271,6 +271,14 @@ su - postgres
      </para>
     </listitem>
 
+    <listitem>
+     <para>
+      You need <productname>ZSTD</productname>, if you want to support
+      compression of data with this method; see
+      <xref linkend="guc-wal-compression"/>.
+     </para>
+    </listitem>
+
     <listitem>
      <para>
       To build the <productname>PostgreSQL</productname> documentation,
-- 
2.35.1