Hi, Am Dienstag, den 12.09.2017, 08:53 -0400 schrieb Peter Eisentraut: > On 9/11/17 03:11, Michael Banck wrote: > > > Is there a race condition here? The slot is created after the checkpoint > > > is completed. But you have to start streaming from the LSN where the > > > checkpoint started, so shouldn't the slot be created before the checkpoint > > > is started? > > > > So my patch only moves the slot creation slightly further forward, > > AFAICT. > > > > AIUI, wal streaming always begins at last checkpoint and from my tests > > the restart_lsn of the created replication slot is also before that > > checkpoint's lsn. However, I hope somebody more familiar with the > > WAL/replication slot code could comment on that. What I dropped in the > > refactoring is the RESERVE_WAL that used to be there when the temporary > > slot gets created, I have readded that now. > > Maybe there is an argument to be made here about whether this is correct > or not, but why bother and risk the fragility? Why not create the > replication slot first thing. I would put it after the server version > checks and before we write recovery.conf.
The replication slots are created via the replication protocol through the second background connection that is used for WAL streaming in StartLogStreamer(). By their nature temporary replication slots must be created by that WAL streamer using them and cannot be created by the main connection which initiates the snapshot (or else you get a "replication slot "pg_basebackup_XXX" is active for PID XXX" error in the WAL streamer). So ISTM we cannot rip out CreateReplicationSlot() (or the manual CREATE_REPLICATION_SLOT that is currently in master) from StartLogStreamer() at least for temporary slots. We could split up the logic here and create the optional physical replication slot in the main connection and the temporary one in the WAL streamer connection, but this would keep any fragility around for (likely more frequently used) temporary replication slots. It would make the patch much smaller though if I revert touching temporary slots at all. The alternative would be to call StartLogStreamer() earlier, but it requires xlogstart as argument so we cannot move its call before the checkpoint is taken and xlogstart is determined, the earliest I managed was when starttli is determined, which is just few instructions earlier than now. Thoughts? Michael -- Michael Banck Projektleiter / Senior Berater Tel.: +49 2166 9901-171 Fax: +49 2166 9901-100 Email: michael.ba...@credativ.de credativ GmbH, HRB Mönchengladbach 12080 USt-ID-Nummer: DE204566209 Trompeterallee 108, 41189 Mönchengladbach Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers