Re: libpq copy error handling busted

Tom Lane Sat, 06 Jun 2020 09:16:20 -0700

Andres Freund <and...@anarazel.de> writes:
> On 2020-06-03 18:41:28 -0400, Tom Lane wrote:
>> * pqSendSome() is responsible not only for pushing out data, but for
>> calling pqReadData in any situation where it can't get rid of the data
>> promptly.  1f39a1c06 overlooked that requirement, and the upshot is
>> that we don't necessarily notice that the connection is broken (it's
>> pqReadData's job to detect that).  Putting a pqReadData call into
>> the early-exit path helps, but doesn't fix things completely.


> Is that fully necessary? Couldn't we handle at least the case I had by
> looking at write_failed in additional places?

No doubt there's more than one way to do it, but I like fixing this in
pqSendSome; that's adding symmetry not warts.  It's already the case
that pqSendSome must absorb input when it's transiently unable to send
(that has to be true to avoid livelock when TCP buffers are full in both
directions).  So making it absorb input when it's permanently unable
to send seems consistent with that.  Also, fixing this at outer levels
would make it likely that we're not covering as many cases; which was
essentially the point of 1f39a1c06.

>> I think that the idea was to let the client dump all its copy data and
>> then report the error message when PQendCopy is called, but as you
>> say, that's none too friendly when there's gigabytes of data involved.
>> I'm not sure we can do anything about this without effectively
>> changing the client API for copy errors, though.

> Hm. Why would it *really* be an API change?

It'd still conform to the letter of the documentation, sure, but it'd
nonetheless be a user-visible behavioral change.

It strikes me that we could instead have the COPY code path "peek"
to see if an 'E' message is waiting in the inBuffer, without actually
consuming it, and start failing PQputCopyData calls if so.  That would
be less of a behavioral change than consuming the message, in the sense
that the error would still be available to be reported when PQendcopy is
called.

On the other hand, that approach assumes that the application will
indeed call PQendcopy to see what's up, rather than just printing some
unhelpful "copy failed" message and going belly up.  pgbench is, um, a
counterexample.  If we suppose that pgbench is representative of the
state of the art in applications, then we'd be better off consuming the
error message and reporting it via the notice mechanism.  Which would
effectively mean that the user gets to see it and the application
doesn't.  On the whole I don't like that, but if we do it the first way
then there might be a lot of apps that need upgrades to handle COPY
failures nicely.  (And on the third hand, those apps *already* need
upgrades to handle COPY failures nicely, plus right now you have to
wait till the end of the COPY.  So anything would be an improvement.)

> But given that this
> hasn't been complained about much in however many years...

Yeah, it's kind of hard to summon the will to break things when
there aren't user complaints.  You can bet that somebody will
complain if we change this, in either direction.

                        regards, tom lane

Re: libpq copy error handling busted

Reply via email to