[HACKERS] Issues with two-server Synch Rep

Josh Berkus Thu, 07 Oct 2010 11:06:29 -0700

Simon, Fujii,

What follows are what I see as the major issues with making two-server
synch replication work well.  I would like each of you to answer them,
explaining how your patch and your design address each issue.  I
believe this will go a long way towards helping the majority of the
community understand the options we have from your code, as well as
where help is still needed.


Adding a Synch Standby
-----------------------
What is the procedure for adding a new synchronous standby in your
implementation?  That is, how do we go from having a standby server with
an empty PGDATA to having a working synchronous standby?

Snapshot Publication
---------------------
During 9.0 development discussion, one of the things we realized we
needed for synch standby was publication of snapshots back to the master
in order to prevent query cancel on the standby.  Without this, the
synch standby is useless for running read queries.  Does your patch
implement this?  Please describe.

Management
-----------
One of the serious flaws currently in HS/SR is complexity of
administration.  Setting up and configuring even a single master and
single standby requires editing up to 6 configuration files in Postgres,
as well as dealing with file permissions.  As such, any Synch Rep patch
must work together with attempts to simplify administration.  How does
your design do this?

Monitoring
-----------
Synch rep imposes severe penalties on availability if a synch standby
gets behind or goes down.  What replication-specific monitoring tools
and hooks are available to allow administrators to take action before
the database becomes unavailable?

Degradation
------------
In the event that the synch rep standby falls too far behind, becomes
unavailable, or is deliberately taken offline, what process do you
envision for the DBA to resolve the situation?  Is there any ability to
commit "stuck" transactions?

Client Consistency
---------------------
With a standby in "apply" mode, and a master failure at the wrong time,
there is the possibility that the standby will apply a transaction at
the same time that the master crashes, causing the client to never
receive a commit message.  Once the client reconnects to the standby,
how will it know whether its transaction was committed or not?

As a lesser case, a standby in "apply" mode will show the results of
committed transactions *before* they are visible on the master.  Is
there any need to handle this?  If so, how?

Performance
------------
As with XA, synch rep has the potential to be so slow as to be unusable.
What optimizations do you make in your approach to synch rep to make it
faster than two-phase commit?  What other performance optimizations have
you added?

-- 
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)