Re: libpq compression

Konstantin Knizhnik Thu, 21 Jun 2018 00:13:17 -0700



On 20.06.2018 23:34, Robbie Harwood wrote:

Konstantin Knizhnik <[email protected]> writes:


My idea was the following: client want to use compression. But server
may reject this attempt (for any reasons: it doesn't support it, has
no proper compression library, do not want to spend CPU for
decompression,...) Right now compression algorithm is hardcoded. But
in future client and server may negotiate to choose proper compression
protocol.  This is why I prefer to perform negotiation between client
and server to enable compression.
Well, for negotiation you could put the name of the algorithm you want
in the startup.  It doesn't have to be a boolean for compression, and
then you don't need an additional round-trip.


Sorry, I can only repeat arguments I already mentioned:
- in future it may be possible to specify compression algorithm

- even with boolean compression option server may have some reasons toreject client's request to use compression

Extra flexibility is always good thing if it doesn't cost too much. Andextra round of negotiation in case of enabling compression seems to menot to be a high price for it.


Well, that's a design decision you've made.  You could put lengths on
chunks that are sent out - then you'd know exactly how much is needed.
(For instance, 4 bytes of network-order length followed by a complete
payload.)  Then you'd absolutely know whether you have enough to
decompress or not.


Do you really suggest to send extra header for each chunk of data?

Please notice that chunk can be as small as one message: dozen of bytesbecause libpq is used for client-server communication with request-replypattern.


Frankly speaking I do not completely understand the source of your concern.

My primary idea was to preseve behavior of libpq function as much aspossible, so there is no need to rewrite all places in Postgres codewhen them are used.It seems to me that I succeed in reaching this goal. Incase of enabledcompression zpq_stream functions (zpq-read/write) are used instead of(pq)secure_read/write and in turn are using them to fetch/send moredata. I do not see any bad flaws, encapsulation violation or some otherproblems in such solution.

So before discussing some alternative ways of embedding compression inlibpq, I will want to understand what's wrong with this approach.

So loop should be something like this:

   decompress(ptr, n)
          ZSTD_inBuffer in = {0}
          ZSTD_outBuffer out = {0}

          in.src = ptr
          in.size = n

          while true
              ret = ZSTD_decompressStream(out, in)
              if ZSTD_isError(ret):
                  give_up()
            if out.pos != 0
                // if we deomcpress soemthing
                return out.pos;
            read_input(in);

The last line is what I'm taking issue with.  The interface we have
already in postgres's network code has a notion of partial reads, or
that reads might not return data.  (Same with writing and partial
writes.)  So you'd buffer what compressed data you have and return -
this is the preferred idiom so that we don't block or spin on a
nonblocking socket.

If socket is in non-blocking mode and there is available data, thensecure_read function will also immediately return 0.The pseudocode above is not quite correct. Let me show the realimplementation of zpq_read:


ssize_t
zpq_read(ZpqStream *zs, void *buf, size_t size, size_t *processed)
{
    ssize_t rc;
    ZSTD_outBuffer out;
    out.dst = buf;
    out.pos = 0;
    out.size = size;

    while (1)
    {
        rc = ZSTD_decompressStream(zs->rx_stream, &out, &zs->rx);
        if (ZSTD_isError(rc))
        {
            zs->rx_error = ZSTD_getErrorName(rc);
            return ZPQ_DECOMPRESS_ERROR;
        }

/* Return result if we fill requested amount of bytes or readoperation was performed */

        if (out.pos != 0)
        {
            zs->rx_total_raw += out.pos;
            return out.pos;
        }
        if (zs->rx.pos == zs->rx.size)
        {
            zs->rx.pos = zs->rx.size = 0; /* Reset rx buffer */
        }

rc = zs->rx_func(zs->arg, (char*)zs->rx.src + zs->rx.size,ZPQ_BUFFER_SIZE - zs->rx.size);

        if (rc > 0) /* read fetches some data */
        {
            zs->rx.size += rc;
            zs->rx_total += rc;
        }
        else /* read failed */
        {
            *processed = out.pos;
            zs->rx_total_raw += out.pos;
            return rc;
        }
    }
}

Sorry, but I have spent quite enough time trying to provide the samebehavior of zpq_read/write as of secure_read/write both for blocking andnon-blocking mode.And I hope that now it is preserved. And frankly speaking I do not seemuch differences of this approach with supporting TLS.Current implementation allows to combine compression with TLS and insome cases it may be really useful.


This is how the TLS code works already.  Look at, for instance,
pgtls_read().  If we get back SSL_ERROR_WANT_READ (i.e., not enough data
to decrypt), we return no data and wait until the socket becomes
readable again.

Thanks,
--Robbie


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: libpq compression

Reply via email to