On 21.06.2018 20:14, Robbie Harwood wrote:
Konstantin Knizhnik <k.knizh...@postgrespro.ru> writes:
On 21.06.2018 17:56, Robbie Harwood wrote:
Konstantin Knizhnik <k.knizh...@postgrespro.ru> writes:
On 20.06.2018 23:34, Robbie Harwood wrote:
Konstantin Knizhnik <k.knizh...@postgrespro.ru> writes:
Well, that's a design decision you've made. You could put lengths
on chunks that are sent out - then you'd know exactly how much is
needed. (For instance, 4 bytes of network-order length followed by
a complete payload.) Then you'd absolutely know whether you have
enough to decompress or not.
Do you really suggest to send extra header for each chunk of data?
Please notice that chunk can be as small as one message: dozen of
bytes because libpq is used for client-server communication with
request-reply pattern.
I want you to think critically about your design. I *really* don't
want to design it for you - I have enough stuff to be doing. But
again, the design I gave you doesn't necessarily need that - you just
need to properly buffer incomplete data.
Right now secure_read may return any number of available bytes. But in
case of using streaming compression, it can happen that available
number of bytes is not enough to perform decompression. This is why we
may need to try to fetch additional portion of data. This is how
zpq_stream is working now.
No, you need to buffer and wait until you're called again. Which is to
say: decompress() shouldn't call secure_read(). secure_read() should
call decompress().
I this case I will have to implement this code twice: both for backend
and frontend, i.e. for secure_read/secure_write and
pqsecure_read/pqsecure_write.
Frankly speaking i was very upset by design of libpq communication layer
in Postgres: there are two different implementations of almost the same
stuff for cbackend and frontend.
I do not see any meaningful argument for it except "historical reasons".
The better decision was to encapsulate socket communication layer (and
some other system dependent stuff) in SAL (system abstraction layer) and
use it both in backend and frontend.
By passing secure_read/pqsecure_read functions to zpq_stream I managed
to avoid such code duplication at least for compression.
I do not understand how it is possible to implement in different way
and what is wrong with current implementation.
The most succinct thing I can say is: absolutely don't modify
pq_recvbuf(). I gave you pseudocode for how to do that. All of your
logic should be *below* the secure_read/secure_write functions.
I cannot approve code that modifies pq_recvbuf() in the manner you
currently do.
Well. I understand you arguments.
But please also consider my argument above (about avoiding code
duplication).
In any case, secure_read function is called only from pq_recvbuf() as
well as pqsecure_read is called only from pqReadData.
So I do not think principle difference in handling compression in
secure_read or pq_recvbuf functions and do not understand why it is
"destroying the existing model".
Frankly speaking, I will really like to destroy existed model, moving
all system dependent stuff in Postgres to SAL and avoid this awful mix
of code
sharing and duplication between backend and frontend. But it is a
another story and I do not want to discuss it here.
If we are speaking about the "right design", then neither your
suggestion, neither my implementation are good and I do not see
principle differences between them.
The right approach is using "decorator pattern": this is how streams are
designed in .Net/Java. You can construct pipe of "secure", "compressed"
and whatever else streams.
Yes, it is first of all pattern for object-oriented approach and
Postgres is implemented in C. But it is actually possible to use OO
approach in pure C (X-Windows!).
But once again, this discussion may lead other too far away from the
topic of libpq compression.
As far as I already wrote, the mainĀ points of my design were:
1. Minimize changes in Postgres code
2. Avoid code duplication
3. Provide abstract (compression stream) which can be used somewhere
else except libpq itself.
My idea was the following: client want to use compression. But
server may reject this attempt (for any reasons: it doesn't support
it, has no proper compression library, do not want to spend CPU for
decompression,...) Right now compression algorithm is
hardcoded. But in future client and server may negotiate to choose
proper compression protocol. This is why I prefer to perform
negotiation between client and server to enable compression. Well,
for negotiation you could put the name of the algorithm you want in
the startup. It doesn't have to be a boolean for compression, and
then you don't need an additional round-trip.
Sorry, I can only repeat arguments I already mentioned:
- in future it may be possible to specify compression algorithm
- even with boolean compression option server may have some reasons to
reject client's request to use compression
Extra flexibility is always good thing if it doesn't cost too
much. And extra round of negotiation in case of enabling compression
seems to me not to be a high price for it.
You already have this flexibility even without negotiation. I don't
want you to lose your flexibility. Protocol looks like this:
- Client sends connection option "compression" with list of
algorithms it wants to use (comma-separated, or something).
- First packet that the server can compress one of those algorithms
(or none, if it doesn't want to turn on compression).
No additional round-trips needed.
This is exactly how it works now... Client includes compression
option in connection string and server replies with special message
('Z') if it accepts request to compress traffic between this client
and server.
No, it's not. You don't need this message. If the server receives a
compression request, it should just turn compression on (or not), and
then have the client figure out whether it got compression back.
How it will managed to do it. It receives some reply and first of all it
should know whether it has to be decompressed or not.
This
is of course made harder by your refusal to use packet framing, but
still shouldn't be particularly difficult.
But how?
Thanks,
--Robbie
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company