Re: [HACKERS] Synchronization levels in SR

Robert Haas Wed, 26 May 2010 10:42:20 -0700

On Wed, May 26, 2010 at 1:24 PM, Heikki Linnakangas
<[email protected]> wrote:
> On 26/05/10 20:10, Kevin Grittner wrote:
>>
>> Heikki Linnakangas <[email protected]> wrote:
>>
>>> One way to do that would be to refrain from flushing the commit
>>> record to disk on the master until the standby has acknowledged
>>> it.
>>
>> I'm not clear on the benefit of doing that, versus flushing the
>> commit record and then waiting for responses.  Either way some
>> databases will commit before others -- what is the benefit of having
>> the master lag?
>
> Hmm, I was going to answer that that way no other transactions can see the
> transaction as committed before it has been safely replicated, but I now
> realize that you could also flush, but refrain from releasing the entry from
> procarray until the standby acknowledges the commit, so the transaction
> would look in-progress to other transactions on the master until then.
>
> Although, if the master crashes at that point, and quickly recovers, you
> could see the last transactions committed on the master before they're
> replicated to the standby.


No matter what you do, there are going to be corner cases where one node
thinks the transaction committed and the other node doesn't know about it.
At any given time, we're either in a state where a crash and restart on
the master will replay the commit record, or we're not.  Somewhat
independently, we're also either in a state where a crash on the standby
will replay the commit record, or we're not.  Each of these depends on a
disk write, and there's no way to guarantee that both of those disk
writes succeed or both of them fail.
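
To make the four states concrete, here is a toy enumeration in Python.
It is only an illustration: the function name and the boolean flags are
made up for this sketch and are not anything in the server.

```python
from itertools import product

def crash_outcomes():
    """Enumerate the master/standby flush states for one commit record.

    Each flag says whether the commit record has reached disk on that
    node, i.e. whether a crash and restart there would replay it.  The
    flags are set by two independent disk writes, so every combination
    can exist at some instant.
    """
    outcomes = []
    for master_flushed, standby_flushed in product([False, True], repeat=2):
        agree = master_flushed == standby_flushed
        outcomes.append((master_flushed, standby_flushed, agree))
    return outcomes

for master_flushed, standby_flushed, agree in crash_outcomes():
    verdict = "consistent" if agree else "nodes disagree"
    print(f"master={master_flushed!s:5} standby={standby_flushed!s:5} {verdict}")
```

Two of the four rows are the corner cases: one node would replay the
commit after a crash and the other would not.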

Now, in theory, maybe you could have a system where we don't have a
fixed definition of who the master is.  If either server crashes or if
they lose communication, both crash.  If they both come back up, they
agree on who has the higher LSN on disk and both roll forward to that
point, then designate one server to be the master.  If one comes back
up and can't reach the other, it appeals to the clusterware for help.
The clusterware is then responsible for shooting one node in the head
and telling the other node to carry on as the sole survivor.  When,
eventually, the dead node is resurrected, it *discards* any WAL
written after the point from which the new master restarted.
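
A rough Python sketch of the two recovery rules in that scheme, with
LSNs modelled as plain integers.  The Node class, the function names
and the way the master is picked are all hypothetical; the scheme above
doesn't say which node should be designated.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    flushed_lsn: int  # highest WAL position safely on disk (an int here)

def rejoin_both(a: Node, b: Node) -> Node:
    """Both nodes are back up: roll both forward to the higher LSN."""
    target = max(a.flushed_lsn, b.flushed_lsn)
    a.flushed_lsn = target
    b.flushed_lsn = target
    # Which node becomes master is left open in the text; picking the
    # lexically smaller name is an arbitrary choice for this sketch.
    return min(a, b, key=lambda n: n.name)

def resurrect(dead: Node, survivor_restart_lsn: int) -> None:
    """A fenced node returns: discard WAL past the survivor's restart point."""
    dead.flushed_lsn = min(dead.flushed_lsn, survivor_restart_lsn)
```

The case where only one node comes back is not modelled, since that
decision belongs to the clusterware rather than to either server.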

Short of that, I don't think "abort the transaction" is a workable
recovery mechanism for when we can't get hold of a standby.  We're going
to have to commit locally first, and then we can decide how long to wait
for an ACK that a standby has also committed the same transaction
remotely.  We can wait not at all, forever, or for a while and then
declare the other guy dead.
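
In code form, the three choices might look something like the Python
sketch below.  The commit() function, its wait parameter and the
returned strings are invented for illustration; flush_local stands in
for whatever writes the commit record on the master.

```python
import threading
from typing import Callable, Optional

def commit(flush_local: Callable[[], None],
           ack: threading.Event,
           wait: Optional[float]) -> str:
    """Commit locally, then wait for the standby's ACK.

    wait = 0     -> do not wait at all
    wait = None  -> wait forever
    wait = N > 0 -> wait N seconds, then declare the standby dead
    """
    flush_local()                  # the local commit always happens first
    if wait == 0:
        return "committed, not waiting"
    if ack.wait(timeout=wait):     # timeout=None blocks until the ACK arrives
        return "committed on both"
    return "committed locally, standby declared dead"
```

Note that no branch returns "aborted": by the time we start waiting,
the local commit is already on disk.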

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
