On Wed, Jun 16, 2021 at 09:39:57AM +0900, Michael Paquier wrote:
> What I'd like us to finish with here is one new algorithm method, able
> to cover a large range of cases as mentioned upthread, from
> low-CPU/low-compression to high-CPU/high-compression.  It does not
> seem like a good idea to be stuck with an algo that only specializes
> in one or the other, for example.

So, I have been playing with that.  The first thing I did, before
running any benchmark, was to check the logic of the patch, which I
ended up cleaning up heavily.  This is still WIP (see the various
XXX), and it still includes all the compression methods we are
discussing here, but it allows controlling the compression level and
it is in much better shape.  So that will help.

Attached are two patches.  The first is the WIP version I have
simplified and used for the benchmarks (there were many things I found
confusing, from the set of header dependencies added across the code
to unnecessary code, the set of patches in the series as mentioned
upthread, etc.).  The second patch is a tweak to grab getrusage()
stats for the lifetime of a backend.

The benchmark I have used is rather simple, as follows, with a value
of shared_buffers large enough to fit all the pages of the relation.
I then just mounted the instance on a tmpfs while adapting
wal_compression* for each test.  This gives a fixed amount of FPWs
generated, large enough to reduce any noise and still allow seeing any
difference:
#!/bin/bash
psql <<EOF
-- Change your conf here
SET wal_compression = zstd;
SET wal_compression_level = 20;
SELECT pg_backend_pid();
DROP TABLE IF EXISTS aa, results;
CREATE TABLE aa (a int);
CREATE TABLE results (phase text, position pg_lsn);
CREATE EXTENSION IF NOT EXISTS pg_prewarm;
ALTER TABLE aa SET (FILLFACTOR = 50);
INSERT INTO results VALUES ('pre-insert', pg_current_wal_lsn());
INSERT INTO aa VALUES (generate_series(1,7000000)); -- 484MB
SELECT pg_size_pretty(pg_relation_size('aa'::regclass));
SELECT pg_prewarm('aa'::regclass);
CHECKPOINT;
INSERT INTO results VALUES ('pre-update', pg_current_wal_lsn());
UPDATE aa SET a = 7000000 + a;
CHECKPOINT;
INSERT INTO results VALUES ('post-update', pg_current_wal_lsn());
SELECT * FROM results;
EOF

The set of results, with the various compression levels used, gives me
the following (see also compression_results.sql attached):
       wal_compression        | user_diff  | sys_diff | rel_size | fpi_size
------------------------------+------------+----------+----------+----------
 lz4 level=1                  |  24.219464 | 0.427996 | 429 MB   | 574 MB
 lz4 level=65535 (speed mode) |  24.154747 | 0.524067 | 429 MB   | 727 MB
 off                          |  24.069323 | 0.635622 | 429 MB   | 727 MB
 pglz                         |  36.123642 | 0.451949 | 429 MB   | 566 MB
 zlib level=1 (default)       |  27.454397 |  2.25989 | 429 MB   | 527 MB
 zlib level=9                 |  31.962234 | 2.160444 | 429 MB   | 527 MB
 zstd level=0                 |  24.766077 |  0.67174 | 429 MB   | 523 MB
 zstd level=20                | 114.429589 | 0.495938 | 429 MB   | 520 MB
 zstd level=3 (default)       |  25.218323 | 0.475974 | 429 MB   | 523 MB
(9 rows)

There are a couple of things that stand out here:
- zlib has a much higher user CPU time than zstd and lz4, so we could
just let this one go.
- Everything is better than pglz, which does not come as a surprise.
- The level does not really influence the compression reached:
-- lz4 aims at being fast, so its default is actually the best
compression it can do.  Using a much higher acceleration level reduces
the effects of compression to zero.
-- zstd has a high CPU consumption at high level (level > 20 is
classified as ultra, I have not tested that), without helping much
with the amount of data compressed.

It seems to me that this would leave LZ4 or zstd as obvious choices,
and that we don't really need to care about the compression level, so
let's just stick with the defaults without any extra GUCs.  Among the
remaining two I would be tempted to choose LZ4.  That's consistent
with what toast can use now.  And, even if it is a bit worse than pglz
in terms of compression in this case, it shows a CPU usage close to
the "off" case, which is nice (sys_diff for lz4 with level=1 is a bit
suspicious, by the way).  zstd has merits as well at its default
level.

In the end I am not surprised by this result: LZ4 is designed to be
faster, while zstd compresses more and eats more CPU.  Modern
compression algos are nice.
--
Michael
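
As a side note, the comparison made above between fpi_size and
user_diff can be sketched in a few lines of C.  This is only an
illustration and not part of the attached patches: it assumes that the
"off" row is the uncompressed baseline, and the helper names
fpi_savings_pct(), user_cpu_overhead() and print_summary() are made up
for the example.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: percentage of full-page-image volume saved
 * compared to an uncompressed baseline.  Both sizes must be in the
 * same unit (MB in the table above).
 */
double
fpi_savings_pct(double baseline_mb, double compressed_mb)
{
	assert(baseline_mb > 0.0);
	return 100.0 * (baseline_mb - compressed_mb) / baseline_mb;
}

/*
 * Hypothetical helper: extra user CPU seconds spent compared to the
 * run with wal_compression = off.
 */
double
user_cpu_overhead(double baseline_user_s, double user_s)
{
	return user_s - baseline_user_s;
}

/* Print one summary line for a method, relative to the "off" run. */
void
print_summary(const char *method, double user_s, double fpi_mb,
			  double off_user_s, double off_fpi_mb)
{
	printf("%-8s saved %.1f%% of FPI volume for %+.2f s of user CPU\n",
		   method,
		   fpi_savings_pct(off_fpi_mb, fpi_mb),
		   user_cpu_overhead(off_user_s, user_s));
}
```

For instance, print_summary("lz4", 24.219464, 574.0, 24.069323, 727.0)
would summarize the "lz4 level=1" row relative to "off".  Reading the
numbers this way is what makes the trade-off visible: savings per
extra CPU second, rather than raw sizes.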
From d20de46ee91b753a82378960cc51b1f9e527507f Mon Sep 17 00:00:00 2001 From: Michael Paquier <mich...@paquier.xyz> Date: Wed, 16 Jun 2021 14:19:16 +0900 Subject: [PATCH v10 1/2] Add more options for wal_compression --- src/include/access/xlog.h | 28 ++- src/include/access/xlogrecord.h | 17 +- src/include/pg_config.h.in | 9 + src/backend/Makefile | 2 +- src/backend/access/transam/xlog.c | 3 +- src/backend/access/transam/xloginsert.c | 114 ++++++++- src/backend/access/transam/xlogreader.c | 98 +++++++- src/backend/utils/misc/guc.c | 52 ++++- src/backend/utils/misc/postgresql.conf.sample | 8 +- src/bin/pg_waldump/pg_waldump.c | 20 +- doc/src/sgml/config.sgml | 37 ++- doc/src/sgml/install-windows.sgml | 2 +- doc/src/sgml/installation.sgml | 28 ++- configure | 217 ++++++++++++++++++ configure.ac | 33 +++ src/tools/msvc/Solution.pm | 2 + src/tools/msvc/config_default.pl | 1 + 17 files changed, 622 insertions(+), 49 deletions(-) diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h index 77187c12be..e7953572d0 100644 --- a/src/include/access/xlog.h +++ b/src/include/access/xlog.h @@ -116,7 +116,8 @@ extern char *XLogArchiveCommand; extern bool EnableHotStandby; extern bool fullPageWrites; extern bool wal_log_hints; -extern bool wal_compression; +extern int wal_compression; +extern int wal_compression_level; extern bool wal_init_zero; extern bool wal_recycle; extern bool *wal_consistency_checking; @@ -167,6 +168,31 @@ typedef enum WalLevel WAL_LEVEL_LOGICAL } WalLevel; +/* WAL compressions */ +typedef enum WalCompression +{ + WAL_COMPRESSION_NONE = 0, + WAL_COMPRESSION_PGLZ, + WAL_COMPRESSION_ZLIB, + WAL_COMPRESSION_LZ4, + WAL_COMPRESSION_ZSTD, +} WalCompression; + +/*--------- + * Compression methods supported for wal_compression have different + * ideas of what a compression level is, though they all use an integer + * to measure it: + * - ZLIB ranks from 0 to 9, with a default at 1. + * - LZ4 ranks from 1 to 65537, with a default at 1. 
+ * - ZSTD ranks from -50 to 10, with a default at ZSTD_CLEVEL_DEFAULT. + * + * This default value is chosen so as it does not map with supported + * ranges for any of the supported methods listed above. When set to + * this value, the default compression level is used for each method. + *--------- + */ +#define WAL_COMPRESSION_LEVEL_DEFAULT -10000 + /* Recovery states */ typedef enum RecoveryState { diff --git a/src/include/access/xlogrecord.h b/src/include/access/xlogrecord.h index 80c92a2498..cba42c10e7 100644 --- a/src/include/access/xlogrecord.h +++ b/src/include/access/xlogrecord.h @@ -114,8 +114,8 @@ typedef struct XLogRecordBlockHeader * present is (BLCKSZ - <length of "hole" bytes>). * * Additionally, when wal_compression is enabled, we will try to compress full - * page images using the PGLZ compression algorithm, after removing the "hole". - * This can reduce the WAL volume, but at some extra cost of CPU spent + * page images using one of the supported algorithms, after removing the + * "hole". This can reduce the WAL volume, but at some extra cost of CPU spent * on the compression during WAL logging. In this case, since the "hole" * length cannot be calculated by subtracting the number of page image bytes * from BLCKSZ, basically it needs to be stored as an extra information. 
@@ -144,9 +144,16 @@ typedef struct XLogRecordBlockImageHeader /* Information stored in bimg_info */ #define BKPIMAGE_HAS_HOLE 0x01 /* page image has "hole" */ -#define BKPIMAGE_IS_COMPRESSED 0x02 /* page image is compressed */ -#define BKPIMAGE_APPLY 0x04 /* page image should be restored during - * replay */ +#define BKPIMAGE_APPLY 0x02 /* page image should be restored + * during replay */ +/* compression methods supported */ +#define BKPIMAGE_COMPRESS_PGLZ 0x04 +#define BKPIMAGE_COMPRESS_ZLIB 0x08 +#define BKPIMAGE_COMPRESS_LZ4 0x10 +#define BKPIMAGE_COMPRESS_ZSTD 0x20 +#define BKPIMAGE_IS_COMPRESSED(info) \ + ((info & (BKPIMAGE_COMPRESS_PGLZ | BKPIMAGE_COMPRESS_ZLIB | \ + BKPIMAGE_COMPRESS_LZ4 | BKPIMAGE_COMPRESS_ZSTD)) != 0) /* * Extra header information used when page image has "hole" and diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in index 783b8fc1ba..1951d88ac9 100644 --- a/src/include/pg_config.h.in +++ b/src/include/pg_config.h.in @@ -355,6 +355,9 @@ /* Define to 1 if you have the `z' library (-lz). */ #undef HAVE_LIBZ +/* Define to 1 if you have the `zstd' library (-lzstd). */ +#undef HAVE_LIBZSTD + /* Define to 1 if you have the `link' function. */ #undef HAVE_LINK @@ -722,6 +725,9 @@ /* Define to 1 if the assembler supports X86_64's POPCNTQ instruction. */ #undef HAVE_X86_64_POPCNTQ +/* Define to 1 if you have the <zstd.h> header file. */ +#undef HAVE_ZSTD_H + /* Define to 1 if the system has the type `_Bool'. */ #undef HAVE__BOOL @@ -953,6 +959,9 @@ /* Define to select Win32-style shared memory. */ #undef USE_WIN32_SHARED_MEMORY +/* Define to 1 to build with zstd support. (--with-zstd) */ +#undef USE_ZSTD + /* Define to 1 if `wcstombs_l' requires <xlocale.h>. 
*/ #undef WCSTOMBS_L_IN_XLOCALE diff --git a/src/backend/Makefile b/src/backend/Makefile index 0da848b1fd..3af216ddfc 100644 --- a/src/backend/Makefile +++ b/src/backend/Makefile @@ -48,7 +48,7 @@ OBJS = \ LIBS := $(filter-out -lpgport -lpgcommon, $(LIBS)) $(LDAP_LIBS_BE) $(ICU_LIBS) # The backend doesn't need everything that's in LIBS, however -LIBS := $(filter-out -lz -lreadline -ledit -ltermcap -lncurses -lcurses, $(LIBS)) +LIBS := $(filter-out -lreadline -ledit -ltermcap -lncurses -lcurses, $(LIBS)) ifeq ($(with_systemd),yes) LIBS += -lsystemd diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 17eeff0720..21349e011c 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -98,7 +98,8 @@ char *XLogArchiveCommand = NULL; bool EnableHotStandby = false; bool fullPageWrites = true; bool wal_log_hints = false; -bool wal_compression = false; +int wal_compression = WAL_COMPRESSION_NONE; +int wal_compression_level = -1; char *wal_consistency_checking_string = NULL; bool *wal_consistency_checking = NULL; bool wal_init_zero = true; diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c index 32b4cc84e7..b2ca1acc0d 100644 --- a/src/backend/access/transam/xloginsert.c +++ b/src/backend/access/transam/xloginsert.c @@ -33,8 +33,33 @@ #include "storage/proc.h" #include "utils/memutils.h" +/* XXX: this needs to be heavily cleaned up */ +#ifdef HAVE_LIBZ +#include <zlib.h> +/* zlib compressBound is not a macro */ +#define ZLIB_MAX_BLCKSZ BLCKSZ + (BLCKSZ>>12) + (BLCKSZ>>14) + (BLCKSZ>>25) + 13 +#else +#define ZLIB_MAX_BLCKSZ 0 +#endif + +#ifdef USE_LZ4 +#include <lz4.h> +#define LZ4_MAX_BLCKSZ LZ4_COMPRESSBOUND(BLCKSZ) +#else +#define LZ4_MAX_BLCKSZ 0 +#endif + +#ifdef USE_ZSTD +#include <zstd.h> +#define ZSTD_MAX_BLCKSZ ZSTD_COMPRESSBOUND(BLCKSZ) +#else +#define ZSTD_MAX_BLCKSZ 0 +#endif + /* Buffer size required to store a compressed version of backup block image */ 
-#define PGLZ_MAX_BLCKSZ PGLZ_MAX_OUTPUT(BLCKSZ) +#define PGLZ_MAX_BLCKSZ PGLZ_MAX_OUTPUT(BLCKSZ) + +#define COMPRESS_BUFSIZE Max(Max(Max(PGLZ_MAX_BLCKSZ, ZLIB_MAX_BLCKSZ), LZ4_MAX_BLCKSZ), ZSTD_MAX_BLCKSZ) /* * For each block reference registered with XLogRegisterBuffer, we fill in @@ -58,7 +83,7 @@ typedef struct * backup block data in XLogRecordAssemble() */ /* buffer to store a compressed version of backup block image */ - char compressed_page[PGLZ_MAX_BLCKSZ]; + char compressed_page[COMPRESS_BUFSIZE]; } registered_buffer; static registered_buffer *registered_buffers; @@ -628,7 +653,7 @@ XLogRecordAssemble(RmgrId rmid, uint8 info, /* * Try to compress a block image if wal_compression is enabled */ - if (wal_compression) + if (wal_compression != WAL_COMPRESSION_NONE) { is_compressed = XLogCompressBackupBlock(page, bimg.hole_offset, @@ -665,8 +690,39 @@ XLogRecordAssemble(RmgrId rmid, uint8 info, if (is_compressed) { + /* The current compression is stored in the WAL record */ bimg.length = compressed_len; - bimg.bimg_info |= BKPIMAGE_IS_COMPRESSED; + + /* Append the compression method used */ + switch (wal_compression) + { + case WAL_COMPRESSION_PGLZ: + bimg.bimg_info |= BKPIMAGE_COMPRESS_PGLZ; + break; + case WAL_COMPRESSION_ZLIB: +#ifdef HAVE_LIBZ + bimg.bimg_info |= BKPIMAGE_COMPRESS_ZLIB; +#else + elog(ERROR, "ZLIB is not supported by this build"); +#endif + break; + case WAL_COMPRESSION_LZ4: +#ifdef USE_LZ4 + bimg.bimg_info |= BKPIMAGE_COMPRESS_LZ4; +#else + elog(ERROR, "LZ4 is not supported by this build"); +#endif + break; + case WAL_COMPRESSION_ZSTD: +#ifdef USE_ZSTD + bimg.bimg_info |= BKPIMAGE_COMPRESS_ZSTD; +#else + elog(ERROR, "ZSTD is not supported by this build"); +#endif + break; + default: + elog(ERROR, "unsupported WAL compression method specified"); + } rdt_datas_last->data = regbuf->compressed_page; rdt_datas_last->len = compressed_len; @@ -853,12 +909,58 @@ XLogCompressBackupBlock(char *page, uint16 hole_offset, uint16 hole_length, else source 
= page; + switch (wal_compression) + { + case WAL_COMPRESSION_PGLZ: + len = pglz_compress(source, orig_len, dest, PGLZ_strategy_default); + break; + +#ifdef HAVE_LIBZ + case WAL_COMPRESSION_ZLIB: + { + unsigned long len_l = COMPRESS_BUFSIZE; + int ret; + + ret = compress2((Bytef *) dest, &len_l, (Bytef *) + source, orig_len, + wal_compression_level == WAL_COMPRESSION_LEVEL_DEFAULT ? + 1 : wal_compression_level); + if (ret != Z_OK) + len_l = -1; + len = len_l; + break; + } +#endif + +#ifdef USE_LZ4 + case WAL_COMPRESSION_LZ4: + len = LZ4_compress_fast(source, dest, orig_len, COMPRESS_BUFSIZE, + wal_compression_level == WAL_COMPRESSION_LEVEL_DEFAULT ? + 1 : wal_compression_level); + if (len == 0) + len = -1; + break; +#endif + +#ifdef USE_ZSTD + case WAL_COMPRESSION_ZSTD: + len = ZSTD_compress(dest, COMPRESS_BUFSIZE, source, orig_len, + wal_compression_level == WAL_COMPRESSION_LEVEL_DEFAULT ? + ZSTD_CLEVEL_DEFAULT : wal_compression_level); + if (ZSTD_isError(len)) + len = -1; + break; +#endif + + default: + elog(ERROR, "unsupported WAL compression method specified"); + } + /* - * We recheck the actual size even if pglz_compress() reports success and + * We recheck the actual size even if compression reports success and * see if the number of bytes saved by compression is larger than the * length of extra data needed for the compressed version of block image. 
*/ - len = pglz_compress(source, orig_len, dest, PGLZ_strategy_default); if (len >= 0 && len + extra_bytes < orig_len) { diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c index 42738eb940..f9dc0ee910 100644 --- a/src/backend/access/transam/xlogreader.c +++ b/src/backend/access/transam/xlogreader.c @@ -18,6 +18,15 @@ #include "postgres.h" #include <unistd.h> +#ifdef HAVE_LIBZ +#include <zlib.h> +#endif +#ifdef USE_LZ4 +#include <lz4.h> +#endif +#ifdef USE_ZSTD +#include <zstd.h> +#endif #include "access/transam.h" #include "access/xlog_internal.h" @@ -1290,7 +1299,7 @@ DecodeXLogRecord(XLogReaderState *state, XLogRecord *record, char **errormsg) blk->apply_image = ((blk->bimg_info & BKPIMAGE_APPLY) != 0); - if (blk->bimg_info & BKPIMAGE_IS_COMPRESSED) + if (BKPIMAGE_IS_COMPRESSED(blk->bimg_info)) { if (blk->bimg_info & BKPIMAGE_HAS_HOLE) COPY_HEADER_FIELD(&blk->hole_length, sizeof(uint16)); @@ -1335,29 +1344,28 @@ DecodeXLogRecord(XLogReaderState *state, XLogRecord *record, char **errormsg) } /* - * cross-check that bimg_len < BLCKSZ if the IS_COMPRESSED - * flag is set. + * cross-check that bimg_len < BLCKSZ if it's compressed */ - if ((blk->bimg_info & BKPIMAGE_IS_COMPRESSED) && + if (BKPIMAGE_IS_COMPRESSED(blk->bimg_info) && blk->bimg_len == BLCKSZ) { report_invalid_record(state, - "BKPIMAGE_IS_COMPRESSED set, but block image length %u at %X/%X", + "BKPIMAGE_IS_COMPRESSED, but block image length %u at %X/%X", (unsigned int) blk->bimg_len, LSN_FORMAT_ARGS(state->ReadRecPtr)); goto err; } /* - * cross-check that bimg_len = BLCKSZ if neither HAS_HOLE nor - * IS_COMPRESSED flag is set. + * cross-check that bimg_len = BLCKSZ if neither HAS_HOLE is + * set nor IS_COMPRESSED(). 
*/ if (!(blk->bimg_info & BKPIMAGE_HAS_HOLE) && - !(blk->bimg_info & BKPIMAGE_IS_COMPRESSED) && + !BKPIMAGE_IS_COMPRESSED(blk->bimg_info) && blk->bimg_len != BLCKSZ) { report_invalid_record(state, - "neither BKPIMAGE_HAS_HOLE nor BKPIMAGE_IS_COMPRESSED set, but block image length is %u at %X/%X", + "neither BKPIMAGE_HAS_HOLE nor BKPIMAGE_IS_COMPRESSED, but block image length is %u at %X/%X", (unsigned int) blk->data_len, LSN_FORMAT_ARGS(state->ReadRecPtr)); goto err; @@ -1555,17 +1563,83 @@ RestoreBlockImage(XLogReaderState *record, uint8 block_id, char *page) bkpb = &record->blocks[block_id]; ptr = bkpb->bkp_image; - if (bkpb->bimg_info & BKPIMAGE_IS_COMPRESSED) + if (BKPIMAGE_IS_COMPRESSED(bkpb->bimg_info)) { /* If a backup block image is compressed, decompress it */ - if (pglz_decompress(ptr, bkpb->bimg_len, tmp.data, - BLCKSZ - bkpb->hole_length, true) < 0) + int32 decomp_result = -1; + + if ((bkpb->bimg_info & BKPIMAGE_COMPRESS_PGLZ) != 0) + { + decomp_result = pglz_decompress(ptr, bkpb->bimg_len, tmp.data, + BLCKSZ - bkpb->hole_length, true); + } + else if ((bkpb->bimg_info & BKPIMAGE_COMPRESS_ZLIB) != 0) + { +#ifdef HAVE_LIBZ + unsigned long decomp_result_l; + + decomp_result_l = BLCKSZ - bkpb->hole_length; + if (uncompress((Bytef*) tmp.data, &decomp_result_l, + (Bytef*) ptr, bkpb->bimg_len) == Z_OK) + decomp_result = decomp_result_l; + else + decomp_result = -1; +#else + report_invalid_record(record, "image at %X/%X compressed with %s not supported, block %d", + (uint32) (record->ReadRecPtr >> 32), + (uint32) record->ReadRecPtr, + "zlib", + block_id); +#endif + } + else if ((bkpb->bimg_info & BKPIMAGE_COMPRESS_LZ4) != 0) + { +#ifdef USE_LZ4 + decomp_result = LZ4_decompress_safe(ptr, tmp.data, + bkpb->bimg_len, BLCKSZ-bkpb->hole_length); +#else + report_invalid_record(record, "image at %X/%X compressed with %s not supported, block %d", + (uint32) (record->ReadRecPtr >> 32), + (uint32) record->ReadRecPtr, + "lz4", + block_id); +#endif + } + else if 
((bkpb->bimg_info & BKPIMAGE_COMPRESS_ZSTD) != 0) + { +#ifdef USE_ZSTD + decomp_result = ZSTD_decompress(tmp.data, BLCKSZ-bkpb->hole_length, + ptr, bkpb->bimg_len); + /* + * XXX: ZSTD_getErrorName (I need to study those APIs more, + * perhaps). + */ + if (ZSTD_isError(decomp_result)) + decomp_result = -1; +#else + report_invalid_record(record, "image at %X/%X compressed with %s not supported, block %d", + (uint32) (record->ReadRecPtr >> 32), + (uint32) record->ReadRecPtr, + "zstd", + block_id); +#endif + } + else + { + report_invalid_record(record, "image at %X/%X is compressed with unknown method, block %d", + (uint32) (record->ReadRecPtr >> 32), + (uint32) record->ReadRecPtr, + block_id); + } + + if (decomp_result < 0) { report_invalid_record(record, "invalid compressed image at %X/%X, block %d", LSN_FORMAT_ARGS(record->ReadRecPtr), block_id); return false; } + ptr = tmp.data; } diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 68b62d523d..1ea332a27f 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -540,6 +540,28 @@ static struct config_enum_entry default_toast_compression_options[] = { {NULL, 0, false} }; +static const struct config_enum_entry wal_compression_options[] = { + {"pglz", WAL_COMPRESSION_PGLZ, false}, +#ifdef HAVE_LIBZ + {"zlib", WAL_COMPRESSION_ZLIB, false}, +#endif +#ifdef USE_LZ4 + {"lz4", WAL_COMPRESSION_LZ4, false}, +#endif +#ifdef USE_ZSTD + {"zstd", WAL_COMPRESSION_ZSTD, false}, +#endif + {"on", WAL_COMPRESSION_PGLZ, false}, + {"off", WAL_COMPRESSION_NONE, false}, + {"true", WAL_COMPRESSION_PGLZ, true}, + {"false", WAL_COMPRESSION_NONE, true}, + {"yes", WAL_COMPRESSION_PGLZ, true}, + {"no", WAL_COMPRESSION_NONE, true}, + {"1", WAL_COMPRESSION_PGLZ, true}, + {"0", WAL_COMPRESSION_NONE, true}, + {NULL, 0, false} +}; + /* * Options for enum values stored in other modules */ @@ -1304,16 +1326,6 @@ static struct config_bool ConfigureNamesBool[] = NULL, NULL, NULL }, - { - 
{"wal_compression", PGC_SUSET, WAL_SETTINGS, - gettext_noop("Compresses full-page writes written in WAL file."), - NULL - }, - &wal_compression, - false, - NULL, NULL, NULL - }, - { {"wal_init_zero", PGC_SUSET, WAL_SETTINGS, gettext_noop("Writes zeroes to new WAL files before first use."), @@ -2261,6 +2273,16 @@ static struct config_int ConfigureNamesInt[] = NULL, NULL, NULL }, + { + {"wal_compression_level", PGC_SUSET, CONN_AUTH_SETTINGS, + gettext_noop("Sets the compression level used with wal_compression."), + NULL + }, + &wal_compression_level, + WAL_COMPRESSION_LEVEL_DEFAULT, INT_MIN, INT_MAX, + NULL, NULL, NULL + }, + { {"wal_receiver_status_interval", PGC_SIGHUP, REPLICATION_STANDBY, gettext_noop("Sets the maximum interval between WAL receiver status reports to the sending server."), @@ -4815,6 +4837,16 @@ static struct config_enum ConfigureNamesEnum[] = NULL, NULL, NULL }, + { + {"wal_compression", PGC_SUSET, WAL_SETTINGS, + gettext_noop("Set the method used to compress full page images in the WAL."), + NULL + }, + &wal_compression, + WAL_COMPRESSION_NONE, wal_compression_options, + NULL, NULL, NULL + }, + { {"wal_level", PGC_POSTMASTER, WAL_SETTINGS, gettext_noop("Sets the level of information written to the WAL."), diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index ddbb6dc2be..50c7c90fab 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -218,7 +218,13 @@ #full_page_writes = on # recover from partial page writes #wal_log_hints = off # also do full page writes of non-critical updates # (change requires restart) -#wal_compression = off # enable compression of full-page writes +#wal_compression = off # enables compression of full-page writes; + # off, pglz, zlib, lz4, zstd, or on +#wal_compression_level = -10000 # compression level applied with + # wal_compression. Depends on the + # compression algorithm used. 
-10000 + # implies to use the default of each + # algorithm. #wal_init_zero = on # zero-fill new WAL files #wal_recycle = on # recycle WAL files #wal_buffers = -1 # min 32kB, -1 sets based on shared_buffers diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c index f8b8afe4a7..7e94cf31e8 100644 --- a/src/bin/pg_waldump/pg_waldump.c +++ b/src/bin/pg_waldump/pg_waldump.c @@ -537,18 +537,30 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record) blk); if (XLogRecHasBlockImage(record, block_id)) { - if (record->blocks[block_id].bimg_info & - BKPIMAGE_IS_COMPRESSED) + uint8 bimg_info = record->blocks[block_id].bimg_info; + + if (BKPIMAGE_IS_COMPRESSED(bimg_info)) { + const char *method = "???"; + if ((bimg_info & BKPIMAGE_COMPRESS_PGLZ) != 0) + method = "pglz"; + else if ((bimg_info & BKPIMAGE_COMPRESS_ZLIB) != 0) + method = "zlib"; + else if ((bimg_info & BKPIMAGE_COMPRESS_LZ4) != 0) + method = "lz4"; + else if ((bimg_info & BKPIMAGE_COMPRESS_ZSTD) != 0) + method = "zstd"; + printf(" (FPW%s); hole: offset: %u, length: %u, " - "compression saved: %u", + "compression saved: %u, method: %s", XLogRecBlockImageApply(record, block_id) ? 
"" : " for WAL verification", record->blocks[block_id].hole_offset, record->blocks[block_id].hole_length, BLCKSZ - record->blocks[block_id].hole_length - - record->blocks[block_id].bimg_len); + record->blocks[block_id].bimg_len, + method); } else { diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index aa3e178240..f6322ed425 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -3127,23 +3127,32 @@ include_dir 'conf.d' </varlistentry> <varlistentry id="guc-wal-compression" xreflabel="wal_compression"> - <term><varname>wal_compression</varname> (<type>boolean</type>) + <term><varname>wal_compression</varname> (<type>enum</type>) <indexterm> <primary><varname>wal_compression</varname> configuration parameter</primary> </indexterm> </term> <listitem> <para> - When this parameter is <literal>on</literal>, the <productname>PostgreSQL</productname> + This parameter enables compression of WAL using the specified + compression method. + When enabled, the <productname>PostgreSQL</productname> server compresses full page images written to WAL when <xref linkend="guc-full-page-writes"/> is on or during a base backup. A compressed page image will be decompressed during WAL replay. + The supported methods are pglz, <literal>zlib</literal> + (if <productname>PostgreSQL</productname> was not built with + <option>--without-zlib</option>), <literal>lz4</literal> + (if <productname>PostgreSQL</productname> was compiled with + <option>--with-lz4</option>) and <literal>zstd</literal> + (if <productname>PostgreSQL</productname> was compiled with + <option>--with-zstd</option>). The default value is <literal>off</literal>. Only superusers can change this setting. 
</para> <para> - Turning this parameter on can reduce the WAL volume without + Enabling compression can reduce the WAL volume without increasing the risk of unrecoverable data corruption, but at the cost of some extra CPU spent on the compression during WAL logging and on the decompression during WAL replay. @@ -3151,6 +3160,28 @@ include_dir 'conf.d' </listitem> </varlistentry> + <varlistentry id="guc-wal-compression-level" xreflabel="wal_compression_level"> + <term><varname>wal_compression_level</varname> (<type>enum</type>) + <indexterm> + <primary><varname>wal_compression_level</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + If set, controls the level of compression used with + <varname>wal_compression</varname>. The range of levels supported + depends on the compression method used. The default is + <literal>-10000</literal>, to use the default level compression + recommended by a given method. There should be no need to change + this parameter, except for users experienced with compression + algorithms. + </para> + <para> + Only superusers can change this setting. + </para> + </listitem> + </varlistentry> + <varlistentry id="guc-wal-init-zero" xreflabel="wal_init_zero"> <term><varname>wal_init_zero</varname> (<type>boolean</type>) <indexterm> diff --git a/doc/src/sgml/install-windows.sgml b/doc/src/sgml/install-windows.sgml index 312edc6f7a..ba794b8c93 100644 --- a/doc/src/sgml/install-windows.sgml +++ b/doc/src/sgml/install-windows.sgml @@ -299,7 +299,7 @@ $ENV{MSBFLAGS}="/m"; <term><productname>LZ4</productname></term> <listitem><para> Required for supporting <productname>LZ4</productname> compression - method for compressing the table data. Binaries and source can be + method for compressing table or WAL data. Binaries and source can be downloaded from <ulink url="https://github.com/lz4/lz4/releases"></ulink>. 
</para></listitem> diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml index 3c0aa118c7..3e985bbd05 100644 --- a/doc/src/sgml/installation.sgml +++ b/doc/src/sgml/installation.sgml @@ -147,7 +147,7 @@ su - postgres specify the <option>--without-zlib</option> option to <filename>configure</filename>. Using this option disables support for compressed archives in <application>pg_dump</application> and - <application>pg_restore</application>. + <application>pg_restore</application>, and compressed WAL. </para> </listitem> </itemizedlist> @@ -270,7 +270,16 @@ su - postgres <para> You need <productname>LZ4</productname>, if you want to support compression of data with this method; see - <xref linkend="guc-default-toast-compression"/>. + <xref linkend="guc-default-toast-compression"/> and + <xref linkend="guc-wal-compression"/>. + </para> + </listitem> + + <listitem> + <para> + The <productname>ZSTD</productname> library can be used to enable + compression using that method; see + <xref linkend="guc-wal-compression"/>. </para> </listitem> @@ -980,7 +989,18 @@ build-postgresql: <para> Build with <productname>LZ4</productname> compression support. This allows the use of <productname>LZ4</productname> for - compression of table data. + compression of table and WAL data. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><option>--with-zstd</option></term> + <listitem> + <para> + Build with <productname>ZSTD</productname> compression support. + This enables use of <productname>ZSTD</productname> for + compression of WAL data. </para> </listitem> </varlistentry> @@ -1236,7 +1256,7 @@ build-postgresql: Prevents use of the <application>Zlib</application> library. This disables support for compressed archives in <application>pg_dump</application> - and <application>pg_restore</application>. + and <application>pg_restore</application> and compressed WAL. 
</para> </listitem> </varlistentry> diff --git a/configure b/configure index e9b98f442f..5317911100 100755 --- a/configure +++ b/configure @@ -699,6 +699,9 @@ with_gnu_ld LD LDFLAGS_SL LDFLAGS_EX +ZSTD_LIBS +ZSTD_CFLAGS +with_zstd LZ4_LIBS LZ4_CFLAGS with_lz4 @@ -868,6 +871,7 @@ with_libxslt with_system_tzdata with_zlib with_lz4 +with_zstd with_gnu_ld with_ssl with_openssl @@ -897,6 +901,8 @@ XML2_CFLAGS XML2_LIBS LZ4_CFLAGS LZ4_LIBS +ZSTD_CFLAGS +ZSTD_LIBS LDFLAGS_EX LDFLAGS_SL PERL @@ -1576,6 +1582,7 @@ Optional Packages: use system time zone data in DIR --without-zlib do not use Zlib --with-lz4 build with LZ4 support + --with-zstd build without Zstd compression library --with-gnu-ld assume the C compiler uses GNU ld [default=no] --with-ssl=LIB use LIB for SSL/TLS support (openssl) --with-openssl obsolete spelling of --with-ssl=openssl @@ -1605,6 +1612,8 @@ Some influential environment variables: XML2_LIBS linker flags for XML2, overriding pkg-config LZ4_CFLAGS C compiler flags for LZ4, overriding pkg-config LZ4_LIBS linker flags for LZ4, overriding pkg-config + ZSTD_CFLAGS C compiler flags for ZSTD, overriding pkg-config + ZSTD_LIBS linker flags for ZSTD, overriding pkg-config LDFLAGS_EX extra linker flags for linking executables only LDFLAGS_SL extra linker flags for linking shared libraries only PERL Perl program @@ -8713,6 +8722,147 @@ fi done fi +# +# ZSTD +# +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with zstd support" >&5 +$as_echo_n "checking whether to build with zstd support... " >&6; } + + + +# Check whether --with-zstd was given. +if test "${with_zstd+set}" = set; then : + withval=$with_zstd; + case $withval in + yes) + +$as_echo "#define USE_ZSTD 1" >>confdefs.h + + ;; + no) + : + ;; + *) + as_fn_error $? 
"no argument expected for --with-zstd option" "$LINENO" 5 + ;; + esac + +else + with_zstd=no + +fi + + +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_zstd" >&5 +$as_echo "$with_zstd" >&6; } + + +if test "$with_zstd" = yes; then + +pkg_failed=no +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for libzstd" >&5 +$as_echo_n "checking for libzstd... " >&6; } + +if test -n "$ZSTD_CFLAGS"; then + pkg_cv_ZSTD_CFLAGS="$ZSTD_CFLAGS" + elif test -n "$PKG_CONFIG"; then + if test -n "$PKG_CONFIG" && \ + { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"libzstd\""; } >&5 + ($PKG_CONFIG --exists --print-errors "libzstd") 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + pkg_cv_ZSTD_CFLAGS=`$PKG_CONFIG --cflags "libzstd" 2>/dev/null` + test "x$?" != "x0" && pkg_failed=yes +else + pkg_failed=yes +fi + else + pkg_failed=untried +fi +if test -n "$ZSTD_LIBS"; then + pkg_cv_ZSTD_LIBS="$ZSTD_LIBS" + elif test -n "$PKG_CONFIG"; then + if test -n "$PKG_CONFIG" && \ + { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"libzstd\""; } >&5 + ($PKG_CONFIG --exists --print-errors "libzstd") 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + pkg_cv_ZSTD_LIBS=`$PKG_CONFIG --libs "libzstd" 2>/dev/null` + test "x$?" 
!= "x0" && pkg_failed=yes +else + pkg_failed=yes +fi + else + pkg_failed=untried +fi + + + +if test $pkg_failed = yes; then + { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 +$as_echo "no" >&6; } + +if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then + _pkg_short_errors_supported=yes +else + _pkg_short_errors_supported=no +fi + if test $_pkg_short_errors_supported = yes; then + ZSTD_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "libzstd" 2>&1` + else + ZSTD_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "libzstd" 2>&1` + fi + # Put the nasty error message in config.log where it belongs + echo "$ZSTD_PKG_ERRORS" >&5 + + as_fn_error $? "Package requirements (libzstd) were not met: + +$ZSTD_PKG_ERRORS + +Consider adjusting the PKG_CONFIG_PATH environment variable if you +installed software in a non-standard prefix. + +Alternatively, you may set the environment variables ZSTD_CFLAGS +and ZSTD_LIBS to avoid the need to call pkg-config. +See the pkg-config man page for more details." "$LINENO" 5 +elif test $pkg_failed = untried; then + { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 +$as_echo "no" >&6; } + { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 +$as_echo "$as_me: error: in \`$ac_pwd':" >&2;} +as_fn_error $? "The pkg-config script could not be found or is too old. Make sure it +is in your PATH or set the PKG_CONFIG environment variable to the full +path to pkg-config. + +Alternatively, you may set the environment variables ZSTD_CFLAGS +and ZSTD_LIBS to avoid the need to call pkg-config. +See the pkg-config man page for more details. + +To get pkg-config, see <http://pkg-config.freedesktop.org/>. 
+See \`config.log' for more details" "$LINENO" 5; } +else + ZSTD_CFLAGS=$pkg_cv_ZSTD_CFLAGS + ZSTD_LIBS=$pkg_cv_ZSTD_LIBS + { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 +$as_echo "yes" >&6; } + +fi + # We only care about -I, -D, and -L switches; + # note that -lzstd will be added by AC_CHECK_LIB below. + for pgac_option in $ZSTD_CFLAGS; do + case $pgac_option in + -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";; + esac + done + for pgac_option in $ZSTD_LIBS; do + case $pgac_option in + -L*) LDFLAGS="$LDFLAGS $pgac_option";; + esac + done +fi + # # Assignments # @@ -12876,6 +13026,56 @@ fi fi +if test "$with_zstd" = yes ; then + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for ZSTD_compress in -lzstd" >&5 +$as_echo_n "checking for ZSTD_compress in -lzstd... " >&6; } +if ${ac_cv_lib_zstd_ZSTD_compress+:} false; then : + $as_echo_n "(cached) " >&6 +else + ac_check_lib_save_LIBS=$LIBS +LIBS="-lzstd $LIBS" +cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + +/* Override any GCC internal prototype to avoid an error. + Use char because int might match the return type of a GCC + builtin and then its argument prototype would still apply. */ +#ifdef __cplusplus +extern "C" +#endif +char ZSTD_compress (); +int +main () +{ +return ZSTD_compress (); + ; + return 0; +} +_ACEOF +if ac_fn_c_try_link "$LINENO"; then : + ac_cv_lib_zstd_ZSTD_compress=yes +else + ac_cv_lib_zstd_ZSTD_compress=no +fi +rm -f core conftest.err conftest.$ac_objext \ + conftest$ac_exeext conftest.$ac_ext +LIBS=$ac_check_lib_save_LIBS +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_zstd_ZSTD_compress" >&5 +$as_echo "$ac_cv_lib_zstd_ZSTD_compress" >&6; } +if test "x$ac_cv_lib_zstd_ZSTD_compress" = xyes; then : + cat >>confdefs.h <<_ACEOF +#define HAVE_LIBZSTD 1 +_ACEOF + + LIBS="-lzstd $LIBS" + +else + as_fn_error $? 
"library 'zstd' is required for ZSTD support" "$LINENO" 5 +fi + +fi + # Note: We can test for libldap_r only after we know PTHREAD_LIBS if test "$with_ldap" = yes ; then _LIBS="$LIBS" @@ -13598,6 +13798,23 @@ done fi +if test "$with_zstd" = yes; then + for ac_header in zstd.h +do : + ac_fn_c_check_header_mongrel "$LINENO" "zstd.h" "ac_cv_header_zstd_h" "$ac_includes_default" +if test "x$ac_cv_header_zstd_h" = xyes; then : + cat >>confdefs.h <<_ACEOF +#define HAVE_ZSTD_H 1 +_ACEOF + +else + as_fn_error $? "zstd.h header file is required for zstd" "$LINENO" 5 +fi + +done + +fi + if test "$with_gssapi" = yes ; then for ac_header in gssapi/gssapi.h do : diff --git a/configure.ac b/configure.ac index 3b42d8bdc9..56aa15b8e1 100644 --- a/configure.ac +++ b/configure.ac @@ -1011,6 +1011,31 @@ if test "$with_lz4" = yes; then done fi +# +# ZSTD +# +AC_MSG_CHECKING([whether to build with zstd support]) +PGAC_ARG_BOOL(with, zstd, no, [build with Zstd compression library], + [AC_DEFINE([USE_ZSTD], 1, [Define to 1 to build with zstd support. (--with-zstd)])]) +AC_MSG_RESULT([$with_zstd]) +AC_SUBST(with_zstd) + +if test "$with_zstd" = yes; then + PKG_CHECK_MODULES(ZSTD, libzstd) + # We only care about -I, -D, and -L switches; + # note that -lzstd will be added by AC_CHECK_LIB below. 
+ for pgac_option in $ZSTD_CFLAGS; do + case $pgac_option in + -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";; + esac + done + for pgac_option in $ZSTD_LIBS; do + case $pgac_option in + -L*) LDFLAGS="$LDFLAGS $pgac_option";; + esac + done +fi + # # Assignments # @@ -1285,6 +1310,10 @@ if test "$with_lz4" = yes ; then AC_CHECK_LIB(lz4, LZ4_compress_default, [], [AC_MSG_ERROR([library 'lz4' is required for LZ4 support])]) fi +if test "$with_zstd" = yes ; then + AC_CHECK_LIB(zstd, ZSTD_compress, [], [AC_MSG_ERROR([library 'zstd' is required for ZSTD support])]) +fi + # Note: We can test for libldap_r only after we know PTHREAD_LIBS if test "$with_ldap" = yes ; then _LIBS="$LIBS" @@ -1443,6 +1472,10 @@ if test "$with_lz4" = yes; then AC_CHECK_HEADERS(lz4.h, [], [AC_MSG_ERROR([lz4.h header file is required for LZ4])]) fi +if test "$with_zstd" = yes; then + AC_CHECK_HEADERS(zstd.h, [], [AC_MSG_ERROR([zstd.h header file is required for zstd])]) +fi + if test "$with_gssapi" = yes ; then AC_CHECK_HEADERS(gssapi/gssapi.h, [], [AC_CHECK_HEADERS(gssapi.h, [], [AC_MSG_ERROR([gssapi.h header file is required for GSSAPI])])]) diff --git a/src/tools/msvc/Solution.pm b/src/tools/msvc/Solution.pm index a7b8f720b5..133de6fba6 100644 --- a/src/tools/msvc/Solution.pm +++ b/src/tools/msvc/Solution.pm @@ -494,6 +494,8 @@ sub GenerateFiles USE_LIBXML => undef, USE_LIBXSLT => undef, USE_LZ4 => undef, + # XXX; support for zstd is still required here. + USE_ZSTD => $self->{options}->{zstd} ? 1 : undef, USE_LDAP => $self->{options}->{ldap} ? 
1 : undef, USE_LLVM => undef, USE_NAMED_POSIX_SEMAPHORES => undef, diff --git a/src/tools/msvc/config_default.pl b/src/tools/msvc/config_default.pl index 460c0375d4..b8a1aac3c2 100644 --- a/src/tools/msvc/config_default.pl +++ b/src/tools/msvc/config_default.pl @@ -26,6 +26,7 @@ our $config = { xslt => undef, # --with-libxslt=<path> iconv => undef, # (not in configure, path to iconv) - zlib => undef # --with-zlib=<path> + zlib => undef, # --with-zlib=<path> + zstd => undef # --with-zstd=<path> }; 1; -- 2.32.0
From 51a7b458db74c1c285688ab9991fe590c9c97357 Mon Sep 17 00:00:00 2001 From: Michael Paquier <mich...@paquier.xyz> Date: Wed, 16 Jun 2021 14:49:22 +0900 Subject: [PATCH v10 2/2] Add tweak to test CPU usage within a session, for wal_compression --- src/backend/tcop/postgres.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c index 8cea10c901..2d7affe115 100644 --- a/src/backend/tcop/postgres.c +++ b/src/backend/tcop/postgres.c @@ -178,6 +178,10 @@ static bool RecoveryConflictPending = false; static bool RecoveryConflictRetryable = true; static ProcSignalReason RecoveryConflictReason; +/* Amount of user and system time used, tracked at start */ +static struct timeval user_time; +static struct timeval system_time; + /* reused buffer to pass to SendRowDescriptionMessage() */ static MemoryContext row_description_context = NULL; static StringInfoData row_description_buf; @@ -3937,6 +3941,12 @@ PostgresMain(int argc, char *argv[], volatile bool send_ready_for_query = true; bool idle_in_transaction_timeout_enabled = false; bool idle_session_timeout_enabled = false; + struct rusage r; + + /* Get start usage for reference point */ + getrusage(RUSAGE_SELF, &r); + memcpy((char *) &user_time, (char *) &r.ru_utime, sizeof(user_time)); + memcpy((char *) &system_time, (char *) &r.ru_stime, sizeof(system_time)); /* Initialize startup process environment if necessary. 
*/ if (!IsUnderPostmaster) @@ -4683,6 +4693,14 @@ PostgresMain(int argc, char *argv[], case 'X': + /* Get stop status of process and log comparison with start */ + getrusage(RUSAGE_SELF, &r); + elog(LOG,"user diff: %ld.%06ld, system diff: %ld.%06ld", + (long) (r.ru_utime.tv_sec - user_time.tv_sec), + (long) (r.ru_utime.tv_usec - user_time.tv_usec), + (long) (r.ru_stime.tv_sec - system_time.tv_sec), + (long) (r.ru_stime.tv_usec - system_time.tv_usec)); + /* * Reset whereToSendOutput to prevent ereport from attempting * to send any more messages to client. -- 2.32.0
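A note on reading the numbers logged by the tweak above: it subtracts tv_sec and tv_usec separately, so the microsecond part can come out negative whenever the end value's tv_usec is smaller than the start value's, and the logged "sec.usec" pair then needs to be corrected by hand. Below is a minimal sketch of a normalized difference, assuming plain POSIX getrusage(). The helpers timeval_diff() and report_cpu_usage() are hypothetical names used for illustration only; they are not part of the patch or of PostgreSQL, and printf() stands in for elog().

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>

/*
 * Hypothetical helper (not part of the patch): return end - start as a
 * normalized timeval, borrowing one second when the microsecond field
 * would otherwise go negative.  Assumes end >= start.
 */
static struct timeval
timeval_diff(struct timeval start, struct timeval end)
{
	struct timeval diff;

	diff.tv_sec = end.tv_sec - start.tv_sec;
	diff.tv_usec = end.tv_usec - start.tv_usec;
	if (diff.tv_usec < 0)
	{
		diff.tv_sec -= 1;
		diff.tv_usec += 1000000;
	}
	/* After the borrow, the microsecond part is within [0, 1000000). */
	assert(diff.tv_usec >= 0 && diff.tv_usec < 1000000);
	return diff;
}

/*
 * Hypothetical helper: print the user and system CPU time consumed by
 * this process since the snapshot in "start".
 */
static void
report_cpu_usage(const struct rusage *start)
{
	struct rusage now;
	struct timeval u;
	struct timeval s;

	getrusage(RUSAGE_SELF, &now);
	u = timeval_diff(start->ru_utime, now.ru_utime);
	s = timeval_diff(start->ru_stime, now.ru_stime);
	printf("user diff: %ld.%06ld, system diff: %ld.%06ld\n",
		   (long) u.tv_sec, (long) u.tv_usec,
		   (long) s.tv_sec, (long) s.tv_usec);
}
```

With this approach, one would take the getrusage() snapshot at backend start as the tweak already does, then pass the normalized fields to the log call on 'X' instead of the raw field-by-field differences.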
compression_results.sql
Description: application/sql