On 2017-06-02 10:05:21 +0900, Michael Paquier wrote: > On Fri, Jun 2, 2017 at 9:29 AM, Andres Freund <and...@anarazel.de> wrote: > > On 2017-06-02 08:38:51 +0900, Michael Paquier wrote: > >> On Fri, Jun 2, 2017 at 7:05 AM, Andres Freund <and...@anarazel.de> wrote: > >> > I'm a unhappy how this is reusing SIGINT for WalSndLastCycleHandler. > >> > Normally INT is used cancel interrupts, and since walsender is now also > >> > working as a normal backend, this overlap is bad. Even for plain > >> > walsender backend this seems bad, because now non-superusers replication > >> > users will terminate replication connections when they do > >> > pg_cancel_backend(). For replication=dbname users it's especially bad > >> > because there can be completely legitimate uses of pg_cancel_backend(). > >> > >> Signals for WAL senders are set in WalSndSignals() which uses SIG_IGN > >> for SIGINT now in ~9.6, and StatementCancelHandler does not get set up > >> for a non-am_walsender backend. Am I missing something? > > > > Yes, but nothing in those observeration actually addresses my point? > > I am still confused by your previous email, which, at least it seems > to me, implies that logical WAL senders have been working correctly > with query cancellations. Now SIGINT is just ignored, which means that > pg_cancel_backend() has never worked for WAL senders until now, and > this behavior has not changed with 086221c. So there is no new > breakage introduced by this commit. I get your point to reuse SIGINT > for query cancellations though, but that's a new feature.
The issue is that the commit made a non-existant feature (pg_cancel_backend() to walsenders) into a broken one (pg_cancel_backend terminates walsenders). Additionally v10 added something new (walsenders executing SQL), and that will need at least some signal handling fixes - hard to do if e.g. SIGINT is reused for something else. > > a) upon shutdown checkpointer (so we can use procsignal), not > > postmaster, sends PROCSIG_WALSND_INIT_STOPPING to all walsenders (so > > we don't have to use up a normal signal handler) > > You'll need a second one that wakes up the latch of the WAL senders to > send more WAL records. Don't think so, procsignal_sigusr1_handler serves quite well for that. There's nearby discussion that we need to do so anyway, to fix recovery conflict interrupts, parallelism interrupts and such. > > b) Upon reception walsenders immediately exit if !replication_active, > > otherwise sets got_STOPPING > > Okay, that's what happens now anyway, any new replication command > received results in an error. I actually prefer the way of doing in > HEAD, which at least reports an error. Err, no. What happens right now is that plainly nothing is done if a connection is idle or busy executing things. Only if a new command is sent we error out - that makes very little sense. > > c) Logical decoding will finish if got_STOPPING is sent, *WITHOUT*, as > > currently done, moving to WALSNDSTATE_STOPPING. Not yet quite sure > > how to best handle the WALSNDSTATE_STOPPING transition in WalSndLoop(). > > Wouldn't it make sense to have the logical receivers be able to > receive WAL up to the end of checkpoint record? Yea, that's what I'm doing. For that we really only need to change the check in WalSndWaitForWal() check of got_SIGINT to got_STOPPING, and add a XLogSendLogical() check in the WalSndCaughtUp if() that sets got_SIGUSR2 *without* setting WALSNDSTATE_STOPPING (otherwise we'd possibly continue to trigger wal records until the send buffer is emptied). - Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers