Hi!

I made some noticeable changes to the patch and fixed the previously mentioned 
issues.

On Fri, Mar 19, 2021 at 16:28 AM Justin Pryzby <pry...@telsasoft.com> wrote:
> Previously, I suggested that the server should have a "policy" GUC defining
> which compression methods are allowed.  Possibly including compression 
> "level".
> For example, the default might be to allow zstd, but only up to level 9.

Now libpq_compression GUC server setting controls the available client-server 
traffic compression methods.
It allows specifying the exact list of the allowed compression algorithms.

For example, to allow only and zlib, set the setting to "zstd,zlib". Also, 
maximal allowed compression level can be specified for each 
method, e.g. "zstd:1,zlib:2" setting will set the maximal compression level for 
zstd to 1 and zlib to 2. 
If a client requests the compression with a higher compression level, it will 
be set to the maximal allowed one. 
The default (and recommended) maximal compression level for each algorithm is 1.

On Fri, Mar 19, 2021 at 16:28 AM Justin Pryzby <pry...@telsasoft.com> wrote:
> There's commit messages claiming that the compression can be "asymmetric"
> between client-server vs server-client, but the commit seems to be unrelated,
> and the protocol documentation doesn't describe how this works.

On Dec 10, 2020, at 1:39 AM, Robert Haas <robertmh...@gmail.com> wrote:
> Another idea is that you could have a new message type that says "hey,
> the payload of this is 1 or more compressed messages." It uses the
> most-recently set compression method. This would make switching
> compression methods easier since the SetCompressionMethod message
> itself could always be sent uncompressed and/or not take effect until
> the next compressed message


I rewrote the compression initialization logic. Now it is asymmetric and the 
compression method is changeable on the fly.

Now compression initialization consists only of two phases:
1. Client sends startup packet with _pq_.compression containing the list of 
compression algorithms specified by the client with an optional
specification of compression level (e.g. “zstd:2,zlib:1”)
2. The server intersects the requested compression algorithms with the allowed 
ones (controlled via the libpq_compression server config setting).
If the intersection is not empty, the server responds with CompressionAck 
containing the final list of the compression algorithms that can be used for 
the compression of libpq messages between the client and server.
If the intersection is empty (server does not accept any of the requested 
algorithms), then it replies with CompressionAck containing the empty list.

After sending the CompressionAck message, the server can send the 
SetCompressionMethod message to set the current compression algorithm for 
server-to-client traffic compression.
After receiving the CompressionAck message, the client can send the 
SetCompressionMethod message to set the current compression algorithm for 
client-to-server traffic compression.

SetCompressionMethod message contains the index of the compression algorithm in 
the final list of the compression algorithms which is generated during 
compression initialization (described above).

Compressed data is wrapped into CompressedData messages.

Rules for compressing the protocol messages are defined in zpq_stream.c. For 
each protocol message, the preferred compression algorithm can be chosen.

On Wed, Apr 21, 2021 at 15:35 AM Ian Zagorskikh <izagorsk...@cloudlinux.com> 
wrote:
> Let me drop my humble 2 cents in this thread. At this moment I checked only 
> 0001-Add-zlib-and-zstd-streaming-compression patch. With no order:
> 
> * No checks for failures. For example, value from malloc() is not checked for 
> NULL and used as-is right after the call. Also as a result possibly NULL 
> values passed into ZSTD functions which are explicitly not NULL-tolerant and 
> so dereference pointers without checks.
> 
> * Used memory management does not follow the schema used in the common 
> module. Direct calls to std C malloc/free are hard coded. AFAIU this is not 
> backend friendly. Looking at the code around I believe they should be wrapped 
> to ALLOC/FREE local macroses that depend on FRONTEND define. As it's done for 
> example. in src/common/hmac.c.
> 
> * If we're going to fix our code to be palloc/pfree friendly there's also a 
> way to pass our custom allocator callbacks inside ZSTD functions. By default 
> ZSTD uses malloc/free but it can be overwritten by the caller in e.g. 
> ZSTD_createDStream_advanced(ZSTD_customMem customMem) versions of API . IMHO 
> that would be good. If a 3rd party component allows us to inject  a custom 
> memory allocator and we do have it why not use this feature?
> 
> * Most zs_foo() functions do not check ptr arguments, though some like 
> zs_free() do. If we're speaking about a "common API" that can be used by a 
> wide range of different modules around the project, such a relaxed way to 
> treat input arguments IMHO can be an evil. Or at least add Assert(ptr) 
> assertions so they can be catched up in debug mode.
> 
> * For zs_create() function which must be called first to create context for 
> further operations, a compression method is passed as integer. This method is 
> used inside zs_create() as index inside an array of compress implementation 
> descriptors. There are several problems:
> 1) Method ID value is not checked for validity. By passing an invalid method 
> ID we can easily get out of array bounds.
> 2) Content of array depends on on configuration options e.g. HAVE_LIBZSTD, 
> HAVE_LIBZ etc. So index inside this array is not persistent. For some config 
> options combination method ID with value 0 may refer to ZSTD and for others 
> to zlib.
> 3) There's no defines/enums/etc in public z_stream.h header that would define 
> which value should be passed to create a specific compressor. Users have to 
> guess it or check the code.

Fixed almost all of these issues, except the malloc/free-related stuff (will 
fix this later).

On Fri, Mar 19, 2021 at 11:02 AM Justin Pryzby <pry...@telsasoft.com> wrote:
> Also, I'm not sure, but I think you may find that the zstd configure.ac should
> use pkgconfig.  This allowed the CIs to compile these patches.  Without
> pkg-config, the macos CI couldn't find (at least) LZ4.k
> https://commitfest.postgresql.org/32/2813/
> https://commitfest.postgresql.org/32/3015/


Now --with-zstd uses pkg-config to link the ZSTD library and works correctly on 
macos.

I would appreciate hearing your thoughts on the new version of the patch,

Daniil Zakhlystov



Reply via email to