On Tue, Jun 15, 2010 at 3:57 PM, Josh Berkus <j...@agliodbs.com> wrote:
>> I wonder if it would be possible to jigger things so that we send the
>> WAL to the standby as soon as it is generated, but somehow arrange
>> things so that the standby knows the last location that the master has
>> fsync'd and never applies beyond that point.
>
> I can't think of any way which would not require major engineering. And
> you'd be slowing down replication *in general* to deal with a fairly
> unlikely corner case.
>
> I think the panic is the way to go.

I have yet to convince myself of how likely this is to occur. I tried
to reproduce this issue by crashing the database, but I think in 9.0
you need an actual operating system crash to cause this problem, and I
haven't yet set up an environment in which I can repeatedly crash the
OS. I believe, though, that in 9.1, we're going to want to stream from
WAL buffers as proposed in the patch that started out this thread, and
then I think this issue can be triggered with just a database crash.

In 9.0, I think we can fix this problem by (1) only streaming WAL that
has been fsync'd and (2) PANIC-ing if the problem occurs anyway. But
in 9.1, with sync rep and the performance demands that entails, I
think that we're going to need to rethink it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
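
To make the two-part 9.0 proposal above concrete, here is a rough toy model in Python. It is only a sketch, not PostgreSQL code: WAL locations are plain integers, and every name in it (MasterWal, send_limit, check_standby, StandbyAheadOfMaster) is made up for illustration. It also assumes that an OS crash loses exactly the WAL that was written but not yet fsync'd, and it does not try to say where in the real system the check would run.

```python
from dataclasses import dataclass


class StandbyAheadOfMaster(RuntimeError):
    """Stands in for a PANIC: the standby holds WAL the master never made durable."""


@dataclass
class MasterWal:
    # Hypothetical model: WAL locations are plain integers (byte offsets).
    write_lsn: int = 0  # WAL generated so far, possibly not yet durable
    flush_lsn: int = 0  # WAL known to have been fsync'd

    def append(self, nbytes: int) -> None:
        """Generate nbytes of new WAL without making it durable."""
        self.write_lsn += nbytes

    def fsync(self) -> None:
        """Make everything generated so far durable."""
        self.flush_lsn = self.write_lsn

    def send_limit(self) -> int:
        """Rule (1): never stream WAL beyond the fsync'd location."""
        return min(self.write_lsn, self.flush_lsn)

    def crash(self) -> None:
        """Assumption: an OS crash discards all WAL that was not fsync'd."""
        self.write_lsn = self.flush_lsn


def check_standby(master_end_lsn: int, standby_received_lsn: int) -> None:
    """Rule (2): refuse to continue if the standby is ahead of the master."""
    if standby_received_lsn > master_end_lsn:
        raise StandbyAheadOfMaster(
            f"standby received up to {standby_received_lsn}, "
            f"but master WAL ends at {master_end_lsn}"
        )
```

With rule (1) in place the standby should never receive more than send_limit(), so check_standby() is only a backstop for the case where that somehow happens anyway. Streaming from WAL buffers, as discussed for 9.1, would amount to sending up to write_lsn instead, which is exactly the situation this model cannot make safe with rule (1) alone.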