On Tue, Dec 3, 2013 at 7:20 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:

> Magnus Hagander <mag...@hagander.net> writes:
> > On Tue, Dec 3, 2013 at 7:11 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> >> Maybe we should just bite the bullet and change the WAL format for
> >> heap_freeze (inventing an all-new record type, not repurposing the old
> >> one, and allowing WAL replay to continue to accept the old one). The
> >> implication for users would be that they'd have to update slave servers
> >> before the master when installing the update; which is unpleasant, but
> >> better than living with a known data corruption case.
>
> > Agreed. It may suck, but it sucks less.
>
> > How badly will it break if they do the upgrade in the wrong order though.
> > Will the slaves just stop (I assume this?) or is there a risk of a
> > wrong-order upgrade causing extra breakage?
>
> I assume what would happen is the slave would PANIC upon seeing a WAL
> record code it didn't recognize. Installing the updated version should
> allow it to resume functioning. Would be good to test this, but if it
> doesn't work like that, that'd be another bug to fix IMO. We've always
> foreseen the possible need to do something like this, so it ought to
> work reasonably cleanly.
>

I wonder if, for the future, we should have walreceiver send its version to
the master as part of the START_REPLICATION command (or IDENTIFY_SYSTEM,
which would probably make more sense, or even a new command like
IDENTIFY_CLIENT; the point is, something in the replication protocol). That
way the walsender could identify a walreceiver that's too old and disconnect
it right away, with a much nicer error message than a PANIC.

Right now, walreceiver knows the version of the walsender (through
PQserverVersion()), but AFAICT there is no way for the walsender to know
which version of the receiver is connected.
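To make the idea a bit more concrete, here is a rough sketch of the check the walsender could run once it has the receiver's version. Everything in it is an assumption for illustration: the function name receiver_version_ok, the idea that the version arrives as an integer in the same form PQserverVersion() returns (e.g. 90301 for 9.3.1), and the minimum version passed in by the caller. None of this exists in the replication protocol today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper, not an existing PostgreSQL function.
 *
 * receiver_version and min_version use the integer encoding that
 * PQserverVersion() returns for these releases:
 *   major * 10000 + minor * 100 + revision   (9.3.1 -> 90301)
 *
 * Returns true if the receiver is new enough to stream from this sender.
 * On rejection, errbuf receives a message the walsender could report
 * before closing the connection.
 */
static bool
receiver_version_ok(int receiver_version, int min_version,
                    char *errbuf, size_t errlen)
{
    assert(errbuf != NULL && errlen > 0);
    errbuf[0] = '\0';

    if (receiver_version >= min_version)
        return true;

    /* Decode both versions so the message is readable by a human. */
    snprintf(errbuf, errlen,
             "walreceiver version %d.%d.%d is too old, "
             "need at least %d.%d.%d",
             receiver_version / 10000,
             (receiver_version / 100) % 100,
             receiver_version % 100,
             min_version / 10000,
             (min_version / 100) % 100,
             min_version % 100);
    return false;
}
```

The walsender would call this right after receiving the version and, on a false return, send the text in errbuf as an ordinary error and disconnect, so the slave never gets as far as replaying a record it cannot decode.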
--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/