On 2017-04-17 18:28:16 +0200, Petr Jelinek wrote:
> On 17/04/17 18:02, Andres Freund wrote:
> > On 2017-04-15 02:33:59 +0900, Fujii Masao wrote:
> >> On Fri, Apr 14, 2017 at 10:33 PM, Petr Jelinek
> >> <petr.jeli...@2ndquadrant.com> wrote:
> >>> On 12/04/17 15:55, Fujii Masao wrote:
> >>>> Hi,
> >>>>
> >>>> When I shut down the publisher while I repeated creating and dropping
> >>>> the subscription in the subscriber, the publisher emitted the following
> >>>> PANIC error during shutdown checkpoint.
> >>>>
> >>>> PANIC: concurrent transaction log activity while database system is
> >>>> shutting down
> >>>>
> >>>> The cause of this problem is that walsender for logical replication can
> >>>> generate WAL records even during shutdown checkpoint.
> >>>>
> >>>> Firstly walsender keeps running until shutdown checkpoint finishes
> >>>> so that all the WAL including shutdown checkpoint record can be
> >>>> replicated to the standby. This was safe because previously walsender
> >>>> could not generate WAL records. However this assumption became
> >>>> invalid because of logical replication. That is, currenty walsender for
> >>>> logical replication can generate WAL records, for example, by executing
> >>>> CREATE_REPLICATION_SLOT command. This is an oversight in
> >>>> logical replication patch, I think.
> >>>
> >>> Hmm, but CREATE_REPLICATION_SLOT should not generate WAL afaik. I agree
> >>> that the issue with walsender still exist (since we now allow normal SQL
> >>> to run there) but I think it's important to identify what exactly causes
> >>> the WAL activity in your case
> >>
> >> At least in my case, the following CREATE_REPLICATION_SLOT command
> >> generated WAL record.
> >>
> >> BEGIN READ ONLY ISOLATION LEVEL REPEATABLE READ;
> >> CREATE_REPLICATION_SLOT testslot TEMPORARY LOGICAL pgoutput
> >> USE_SNAPSHOT;
> >>
> >> Here is the pg_waldump output of the WAL record that
> >> CREATE_REPLICATION_SLOT
> >> generated.
> >>
> >> rmgr: Standby len (rec/tot): 24/ 50, tx: 0,
> >> lsn: 0/01601438, prev 0/01601400, desc: RUNNING_XACTS nextXid 692
> >> latestCompletedXid 691 oldestRunningXid 692
> >>
> >> So I guess that CREATE_REPLICATION_SLOT code calls LogStandbySnapshot()
> >> and which generates WAL record about snapshot of running transactions.
> >
> > Erroring out in these cases sounds easy enough. Wonder if there's not a
> > bigger problem with WAL records generated e.g. by HOT pruning or such,
> > during decoding. Not super likely, but would probably hit exactly the
> > same, no?
> >
>
> Sounds possible, yes. Sounds like that's going to be nontrivial to fix
> though.
>
> Another problem is that queries can run on walsender now. But that
> should be possible to detect and shutdown just like backend.

This sounds like a case for s/PANIC/ERROR|FATAL/ to me...

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)