i don't think i've explained things very clearly. the implied contradiction is 
that i'd be using asynchronous replication to catch up a slave after a slave 
failure and thus i'm losing the transactional consistency that i suggest i 
need.  if a slave fails and is brought back on line i am indeed proposing that 
it catch up with the master asynchronously; however,  the slave wouldn't be 
promoted to a hot standby until it is completely caught up and could be 
reestablished as a synchronous replica (at least that is what i'd like to do in 
theory). so i'm proposing that a slave would never be a candidate for an HA 
failover unless it is completely in sync with a master: if there is no slave 
that is in sync with the master at the time the master fails, then the master 
would have to be recovered from the filesystem via traditional recovery. the 
fact that i envision 'catching up' a slave to a master using asynchronous 
replication is not particularly relevant to the transactional guarantees of the 
system as a whole if the slave is effectively unavailable while catching up.

similarly, any slave that isn't caught up to its master would also not be 
eligible for queries.
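
to make the rule concrete, here is a minimal sketch in python. the Replica 
record, its field names, and the use of plain integers for wal positions are all 
assumptions for illustration, not anything postgres itself provides:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    # hypothetical bookkeeping record, not a postgres structure
    name: str
    synchronous: bool   # currently attached as a synchronous replica
    replayed_pos: int   # last wal position the slave has applied


def is_in_sync(replica: Replica, master_pos: int) -> bool:
    # a slave counts as in sync only if it is a synchronous replica
    # and has applied everything the master has committed
    return replica.synchronous and replica.replayed_pos >= master_pos


def failover_candidate(replicas: List[Replica], master_pos: int) -> Optional[Replica]:
    # return the first in-sync slave; None means the master has to be
    # recovered from the filesystem via traditional recovery
    for replica in replicas:
        if is_in_sync(replica, master_pos):
            return replica
    return None
```

the same is_in_sync check would gate read queries, so a slave that is still 
catching up is neither a failover target nor a query target.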

i can understand why the master might hang when there is no reachable replica 
during synchronous commit; this is exactly the right thing to do if you want to 
guarantee that you have at least 2 distinct spheres of durability. but i'd 
prefer to sacrifice the extra durability guarantee in favor of availability in 
this case, given that recovery from the file system is still an option should 
the master subsequently fail. my availability issue is that, without a strong 
guarantee about how long it might take to bring a replica back up, the master 
could be hung/unavailable for an unbounded amount of time, which is not 
acceptable in my case.

if the master hangs commits because there is no active slave, i believe that an 
administrator would have to
        1. detect that there are no active slaves
        2. shut the master down
        3. disable synchronous replication
        4. bring the master back up
or, alternatively:
        1. detect that there are no active slaves
        2. interrupt any connections that are blocking on commit
        3. set synchronous_commit = local or off on all connections, 
           effectively disabling synchronous replication going forward

but i'd prefer a more automated approach that wouldn't be perceived as an outage 
to the client.
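
as a starting point for automating step 1, a monitoring script could poll the 
pg_stat_replication view on the master. the sketch below only covers the 
decision logic; the function name and the assumption that rows arrive as dicts 
keyed by column name are mine, and the state/sync_state values are as i 
understand that view:

```python
from typing import Dict, Iterable

# query a monitoring script might run on the master; how it is executed
# (driver, connection handling) is deliberately left out of this sketch
ACTIVE_SLAVES_QUERY = (
    "SELECT application_name, state, sync_state FROM pg_stat_replication"
)


def has_active_sync_slave(rows: Iterable[Dict[str, str]]) -> bool:
    # step 1 of either procedure: is any slave currently streaming
    # as a synchronous replica?
    return any(
        row.get("state") == "streaming" and row.get("sync_state") == "sync"
        for row in rows
    )
```

if this returns False, the script would go on to the remaining steps instead 
of waiting for an administrator to notice the blocked commits.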


i envision some kind of time out after which the slave is removed from the 
master's synchronous replica set. and of course i'd need to work out the 
mechanics of bringing the slave back up to sync with the master and adding it 
back to the replica set, which would clearly require some additional machinery.
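
roughly, the bookkeeping i have in mind looks like the sketch below. this is 
not an existing postgres feature: the SyncReplicaSet class, its method names, 
and the idea of tracking acknowledgement times are all hypothetical, and the 
actual catch-up mechanics are left out:

```python
import time
from typing import Callable, Dict, Set


class SyncReplicaSet:
    """hypothetical tracker for which slaves the master waits on at commit."""

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_ack: Dict[str, float] = {}  # slave name -> time of last ack
        self.catching_up: Set[str] = set()    # removed slaves replaying asynchronously

    def add(self, name: str) -> None:
        # (re)admit a slave as a synchronous replica
        self.catching_up.discard(name)
        self.last_ack[name] = self.clock()

    def ack(self, name: str) -> None:
        # record that a synchronous slave acknowledged a commit
        if name in self.last_ack:
            self.last_ack[name] = self.clock()

    def expire(self) -> None:
        # drop any slave that has been silent for longer than the timeout
        now = self.clock()
        for name, seen in list(self.last_ack.items()):
            if now - seen > self.timeout_s:
                del self.last_ack[name]
                self.catching_up.add(name)

    def caught_up(self, name: str) -> None:
        # a removed slave has fully caught up, so it rejoins the set
        if name in self.catching_up:
            self.add(name)

    def must_wait_for(self) -> Set[str]:
        # slaves a commit still has to wait on; empty means commit locally
        self.expire()
        return set(self.last_ack)
```

with a single slave, must_wait_for() returns an empty set once the timeout 
passes, so commits on the master proceed, and the slave only counts again after 
caught_up() is called for it.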

i hope that clears it up.

thanks.


________________________________
 From: Adrian Klaver <adrian.kla...@gmail.com>
To: Jameison Martin <jameis...@yahoo.com> 
Cc: "pgsql-general@postgresql.org" <pgsql-general@postgresql.org> 
Sent: Tuesday, February 28, 2012 7:32 AM
Subject: Re: [GENERAL] synchronous replication: blocking commit on the master
 
On Monday, February 27, 2012 10:21:24 pm Jameison Martin wrote:
> I have specific needs for wanting synchronous replication instead of
> asynchronous replication, notwithstanding my desire to continue processing
> work on the master if there are no active slaves. I would like to use
> replication for both HA and for query scaling. I'd like replication to be
> synchronous to ensure that any slaves are up to date, and I cannot afford
> even the small data potential loss implied by asynchronous replication.
> However, should there be a situation where no slaves are alive (e.g.
> there is a single slave and it fails for whatever reason), I do not want
> to compromise the availability of the master while the slave is being
> restored. Instead, I'd like to be able to continue processing transactions
> on the master unimpeded until a slave can be brought back online. Once a
> slave is caught back up to the master I'd like to switch back to
> synchronous replication and again be able to use the slave to scale reads
> and as a failover target should the master fail.
> 
> Does that make sense?

No not really:)

The two statements below seem to be at odds with each other:

"I'd like replication to be synchronous to ensure that any slaves are up to 
date, and I cannot afford even the small data potential loss implied by 
asynchronous replication."

"Instead, I'd like to be able to continue processing transactions on the master 
unimpeded until a slave can be brought back online."

It seems you want async sync replication and, under the observation that a chain 
is only as strong as its weakest link, you are really getting async replication.  
That being said, it is your set up and you have the options to have it run the 
way you want.



-- 
Adrian Klaver
adrian.kla...@gmail.com
