On Thu, Dec 13, 2018 at 12:17 AM Michael Paquier <mich...@paquier.xyz> wrote:
> On Wed, Dec 12, 2018 at 07:54:05AM -0500, David Steele wrote:
> > The LSN switch point is often the same even when servers are going to
> > different timelines. If the LSN is different enough then the problem
> > solves itself since the .partial will be on an entirely different
> > segment.
>
> That would mean that WAL forked exactly at the same record. You have
> likely seen more cases where that can happen in real life than I do.

Suppose that the original master fails during an idle period, and we
promote a slave. But we accidentally promote a slave that can't serve
as the new master, say because it's in a datacenter with an unreliable
network connection or one which is about to be engulfed in lava. So,
we go to promote a different slave, and because we never got around to
reconfiguring the standbys to follow the previous promotion, kaboom.

Or, suppose we do PITR to recover from some user error, but then
somebody screws up the contents of the recovered cluster and we have
to do it over again. Naturally we'll recover to the same point.

The new TLI is the only thing that is guaranteed to be unique with
each new promotion, and I would guess that it is therefore the right
thing to use to disambiguate them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
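
To make the collision concrete, here is a rough sketch of how the .partial name falls out of the old TLI and the switch LSN. It assumes the default 16MB WAL segment size and the usual 24-hex-digit segment naming. The helper names, the example LSNs, and the TLI values are made up for illustration. Pairing the name with the new TLI is only one way to show that the new TLI tells the two promotions apart, not a proposed file name format.

```python
# Sketch only: assumes the default 16MB WAL segment size.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024
SEGMENTS_PER_XLOGID = 0x100000000 // WAL_SEGMENT_SIZE  # 256 with 16MB segments


def parse_lsn(text):
    """Turn an LSN written as 'hi/lo' in hex into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def partial_name(old_tli, switch_lsn):
    """Name of the partial segment on the OLD timeline (hypothetical helper)."""
    segno = switch_lsn // WAL_SEGMENT_SIZE
    return "%08X%08X%08X.partial" % (
        old_tli,
        segno // SEGMENTS_PER_XLOGID,
        segno % SEGMENTS_PER_XLOGID,
    )


# Two promotions from TLI 1 at the same switch point (made-up LSN).
same_point = parse_lsn("0/3000060")
first = partial_name(1, same_point)
second = partial_name(1, same_point)
print(first == second)  # the archived names collide

# Illustrative only: the new TLI (2 vs. 3 here) is what differs.
print((first, 2) != (second, 3))

# A switch point far enough away lands on a different segment anyway.
print(partial_name(1, parse_lsn("0/5000000")) != first)
```

With the same old TLI and the same switch LSN, both promotions produce the same .partial name, and only the new TLI separates them.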