Re: [HACKERS] Issues with Quorum Commit

Bruce Momjian Wed, 20 Oct 2010 17:49:49 -0700

Tom Lane wrote:
> Greg Smith <g...@2ndquadrant.com> writes:
> > I don't see this as needing any implementation any more complicated than 
> > the usual way such timeouts are handled.  Note how long you've been 
> > trying to reach the standby.  Default to -1 for forever.  And if you hit 
> > the timeout, mark the standby as degraded and force them to do a proper 
> > resync when they disconnect.  Once that's done, then they can re-enter 
> > sync rep mode again, via the same process a new node would have done so.
> 
> Well, actually, that's *considerably* more complicated than just a
> timeout.  How are you going to "mark the standby as degraded"?  The
> standby can't keep that information, because it's not even connected
> when the master makes the decision.  ISTM that this requires
> 
> 1. a unique identifier for each standby (not just role names that
> multiple standbys might share);
> 
> 2. state on the master associated with each possible standby -- not just
> the ones currently connected.
> 
> Both of those are perhaps possible, but the sense I have of the
> discussion is that people want to avoid them.
> 
> Actually, #2 seems rather difficult even if you want it.  Presumably
> you'd like to keep that state in reliable storage, so it survives master
> crashes.  But how you gonna commit a change to that state, if you just
> lost every standby (suppose master's ethernet cable got unplugged)?
> Looks to me like it has to be reliable non-replicated storage.  Leaving
> aside the question of how reliable it can really be if not replicated,
> it's still the case that we have noplace to put such information given
> the WAL-is-across-the-whole-cluster design.


I assumed we would have a parameter called "sync_rep_failure" that would
take a command, and that command would be run when communication with the
slave was lost.  If you restart, it tries again and might run the command
again.
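
A minimal sketch of that idea, in Python for brevity.  Only the name
sync_rep_failure comes from the proposal above; the StandbyMonitor class,
timeout_s, and the clock and run hooks are hypothetical names invented for
illustration, and none of this is an existing PostgreSQL interface.  The
state is kept in memory only, so nothing is remembered across a restart.

```python
import subprocess
import time


class StandbyMonitor:
    """Runs a failure command once a standby has been unreachable too long.

    Illustrative only: sync_rep_failure is the proposed parameter name;
    timeout_s, clock, and run are assumptions made for this sketch.
    """

    def __init__(self, sync_rep_failure, timeout_s=-1,
                 clock=time.monotonic, run=subprocess.run):
        self.sync_rep_failure = sync_rep_failure  # argv list of the command
        self.timeout_s = timeout_s                # -1 means wait forever
        self.clock = clock
        self.run = run
        self.last_contact = clock()
        self.degraded = False                     # in-memory, lost on restart

    def heard_from_standby(self):
        """Record that the standby responded just now."""
        self.last_contact = self.clock()

    def check(self):
        """Run the command if the timeout expired; return True if it ran."""
        if self.timeout_s < 0 or self.degraded:
            return False
        if self.clock() - self.last_contact < self.timeout_s:
            return False
        self.run(self.sync_rep_failure, check=False)
        self.degraded = True
        return True
```

Because the degraded flag is not persisted anywhere, a restarted master
starts a fresh timer and can run the command a second time for the same
outage, which is the behavior described above.  How the standby is resynced
afterwards is left out of the sketch.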

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
