The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, passed

Hi,

I've compared the different libpq compression approaches in the streaming 
physical replication scenario.

Test setup
Three hosts: the first runs pg_restore, the second is the primary (master), and 
the third is the standby replica.
In each test run, I restored the IMDB database 
(https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/2QYZBT) 
with pg_restore and measured the traffic received on the standby replica.

Also, I've enlarged the ZPQ_BUFFER_SIZE buffer in all versions, because the 
original buffer size (8192 bytes) led to more socket read/write system calls 
and to poor compression in the chunked-reset scenario.
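
For context, the change is essentially bumping one constant; the snippet below 
is only an illustration, and the value shown is arbitrary, not the exact size 
used in the tests:

/*
 * Illustration only.  ZPQ_BUFFER_SIZE is the size of the buffer the
 * compressed stream is staged in before it reaches the socket, so with the
 * original 8192 bytes every ~8 KB of compressed data costs a separate
 * read/write syscall, and in the chunked-reset scenario each such chunk also
 * starts from a freshly reset compression context, which hurts the ratio.
 * The enlarged value here is arbitrary.
 */
#define ZPQ_BUFFER_SIZE (256 * 1024)    /* was 8192 */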

Scenarios:

chunked
Use streaming compression, wrap the compressed data in CompressedData messages, 
and preserve the compression context across multiple CompressedData messages.
https://github.com/usernamedt/libpq_compression/tree/chunked-compression

chunked-reset
Use streaming compression, wrap the compressed data in CompressedData messages, 
and reset the compression context on each CompressedData message (see the 
sketch after the scenario list).
https://github.com/usernamedt/libpq_compression/tree/chunked-reset

permanent
Use streaming compression and send the raw compressed stream without any 
wrapping.
https://github.com/usernamedt/libpq_compression/tree/permanent-w-enlarged-buffer
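
To make the chunked vs. chunked-reset difference concrete, here is a minimal 
sketch in terms of the plain ZSTD streaming API. It is not code from the patch: 
the compress_message() helper, the buffer size and the reset_between_messages 
flag are invented for illustration, and error handling is omitted.

#include <stdbool.h>
#include <string.h>

#include <zstd.h>

#define OUT_CAP (256 * 1024)    /* illustrative; assumed large enough below */

/*
 * Compress one protocol message payload into out_buf and return the number
 * of compressed bytes (what would be wrapped into a CompressedData message).
 *
 * reset_between_messages == true  -> "chunked-reset": restart the session for
 * every message, so no history is shared between messages.
 * reset_between_messages == false -> "chunked": keep one long-lived session,
 * so later messages benefit from what the compressor has already seen.
 */
static size_t
compress_message(ZSTD_CCtx *cctx, bool reset_between_messages,
                 const void *msg, size_t msg_len,
                 void *out_buf, size_t out_cap)
{
    ZSTD_inBuffer in = {msg, msg_len, 0};
    ZSTD_outBuffer out = {out_buf, out_cap, 0};
    /* end the frame when resetting, otherwise just flush and keep history */
    ZSTD_EndDirective mode = reset_between_messages ? ZSTD_e_end : ZSTD_e_flush;
    size_t      remaining;

    if (reset_between_messages)
        ZSTD_CCtx_reset(cctx, ZSTD_reset_session_only);

    do
    {
        /* assumes out_cap is large enough; real code would drain the buffer */
        remaining = ZSTD_compressStream2(cctx, &out, &in, mode);
    } while (remaining != 0);

    return out.pos;
}

int
main(void)
{
    static char out_buf[OUT_CAP];
    const char *msgs[] = {"XLogData message 1", "XLogData message 2"};
    ZSTD_CCtx  *cctx = ZSTD_createCCtx();

    ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, 1);

    /* "chunked": one session shared by all messages */
    for (int i = 0; i < 2; i++)
        compress_message(cctx, false, msgs[i], strlen(msgs[i]), out_buf, OUT_CAP);

    /* "chunked-reset": a fresh session for every message */
    for (int i = 0; i < 2; i++)
        compress_message(cctx, true, msgs[i], strlen(msgs[i]), out_buf, OUT_CAP);

    ZSTD_freeCCtx(cctx);
    return 0;
}

The permanent scenario is the same preserved-context stream, only without the 
CompressedData framing.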

Tested compression levels
ZSTD, level 1
ZSTD, level 5
ZSTD, level 9

Uncompressed (baseline)
Scenario                Replica rx, mean, MB
uncompressed            6683.6

ZSTD, level 1
Scenario                Replica rx, mean, MB
chunked-reset           2726
chunked                 2694
permanent               2694.3

ZSTD, level 5
Scenario                Replica rx, mean, MB
chunked-reset           2234.3
chunked                 2123
permanent               2115.3

ZSTD, level 9
Scenario                Replica rx, mean, MB
chunked-reset           2153.6
chunked                 1943
permanent               1941.6

A full report with additional data and resource usage graphs is available here:
https://docs.google.com/document/d/1a5bj0jhtFMWRKQqwu9ag1PgDF5fLo7Ayrw3Uh53VEbs

Based on these results, I suggest sticking with the chunked compression 
approach: it is more flexible and adds almost no overhead compared to 
permanent compression.
Also, we may later introduce a setting that controls whether the compression 
context is reset for each message, without breaking backward compatibility.

--
Daniil Zakhlystov

The new status of this patch is: Ready for Committer
