Hi, Thanks for the detailed explanations, I've definitely misinterpreted how interrupts and errors were handled.
On Fri, Jan 17, 2025 at 7:03 PM Andres Freund <and...@anarazel.de> wrote: > > Might be worth using it it in src/test/postmaster/t/002_start_stop.pl? That > has e.g. code to send a startup message. I've changed pgproto to provide the necessary helpers to be used in 001_connection_limits.pl and 002_start_stop.pl. > > Playing around with this it's unfortunately worse - we very commonly get to > > ProcessClientWriteInterrupt() with InterruptHoldoffCount > 0. Including just > > after processing recovery conflict interrupts. The reason for that is that > > if > > we do throw an error the first thing we do is to hold interrupts > > (c.f. sigsetjmp() in PostgresMain()) and then we call EmitErrorReport(). > > > > Unfortunately I suspect that this means any protection we'd get from this > > version of the patch is rather incomplete - if the recovery conflict is > > triggered while not blocked, we'll ERROR out and report the message with > > interrupts held. That cycle of recovery conflict signalling wouldn't ever be > > resolved in such a case, I think. Of course we do re-signal recovery > > conflict > > interrupts rather aggressively - but I'm not sure that's going to help > > reliably. > > This unfortunately indeed is true. If I, instead of the generate_series(), > add the following to the pgproto query: > DO \$\$BEGIN RAISE 'endless scream: %', repeat('a', > 1000000); END;\$\$; > > the conflict doesn't get handled. A possible approach to handle this case could be to stop retrying and return a write error if there's a recovery conflict on a blocked write, with ProcessClientWriteInterrupt deciding whether a write is retryable or not. WIthin ProcessClientWriteInterrupt, if we conflict with the recovery on a blocked write and interrupts can be processed, then we can call ProcessRecoveryConflictInterrupts with ExitOnAnyError. Otherwise, if the interrupts can't be processed, then we make the write not retryable, leading to a write error which will close the socket with a "connection to client lost". This requires a new CheckBlockedWriteConflictInterrupts function that validates whether the process conflicts or not without processing the conflict as we don't want to abort a writing if the conflict could be ignored. The code is a bit rough as there's duplicated logic checks between CheckBlockedWriteConflictInterrupts and ProcessRecoveryConflictInterrupt that could very probably be simplified, but I wanted to validate the approach first. It would also be probably worth adding more tests to cover all conflicts while there's a blocked write, though I did add the RAISE case in the tests. Regards, Anthonin Bonnefoy
v04-0002-Accept-recovery-conflict-interrupt-on-blocked-wr.patch
Description: Binary data
v04-0001-Add-PgProto-test-module-to-send-message-on-a-raw.patch
Description: Binary data