Hi,

Thanks for the detailed explanations, I've definitely misinterpreted
how interrupts and errors were handled.

On Fri, Jan 17, 2025 at 7:03 PM Andres Freund <and...@anarazel.de> wrote:
>
> Might be worth using it it in src/test/postmaster/t/002_start_stop.pl? That
> has e.g. code to send a startup message.

I've changed pgproto to provide the necessary helpers to be used in
001_connection_limits.pl and 002_start_stop.pl.

> > Playing around with this it's unfortunately worse - we very commonly get to
> > ProcessClientWriteInterrupt() with InterruptHoldoffCount > 0. Including just
> > after processing recovery conflict interrupts. The reason for that is that 
> > if
> > we do throw an error the first thing we do is to hold interrupts
> > (c.f. sigsetjmp() in PostgresMain()) and then we call EmitErrorReport().
> >
> > Unfortunately I suspect that this means any protection we'd get from this
> > version of the patch is rather incomplete - if the recovery conflict is
> > triggered while not blocked, we'll ERROR out and report the message with
> > interrupts held. That cycle of recovery conflict signalling wouldn't ever be
> > resolved in such a case, I think. Of course we do re-signal recovery 
> > conflict
> > interrupts rather aggressively - but I'm not sure that's going to help
> > reliably.
>
> This unfortunately indeed is true.  If I, instead of the generate_series(),
> add the following to the pgproto query:
>                         DO \$\$BEGIN RAISE 'endless scream: %', repeat('a', 
> 1000000); END;\$\$;
>
> the conflict doesn't get handled.

A possible approach to handle this case could be to stop retrying and
return a write error if there's a recovery conflict on a blocked
write, with ProcessClientWriteInterrupt deciding whether a write is
retryable or not. WIthin ProcessClientWriteInterrupt, if we conflict
with the recovery on a blocked write and interrupts can be processed,
then we can call ProcessRecoveryConflictInterrupts with
ExitOnAnyError. Otherwise, if the interrupts can't be processed, then
we make the write not retryable, leading to a write error which will
close the socket with a "connection to client lost".

This requires a new CheckBlockedWriteConflictInterrupts function that
validates whether the process conflicts or not without processing the
conflict as we don't want to abort a writing if the conflict could be
ignored.

The code is a bit rough as there's duplicated logic checks between
CheckBlockedWriteConflictInterrupts and
ProcessRecoveryConflictInterrupt that could very probably be
simplified, but I wanted to validate the approach first. It would also
be probably worth adding more tests to cover all conflicts while
there's a blocked write, though I did add the RAISE case in the tests.

Regards,
Anthonin Bonnefoy

Attachment: v04-0002-Accept-recovery-conflict-interrupt-on-blocked-wr.patch
Description: Binary data

Attachment: v04-0001-Add-PgProto-test-module-to-send-message-on-a-raw.patch
Description: Binary data

Reply via email to