(changed subject again.)

On 17/09/10 10:06, Simon Riggs wrote:
I don't think we can determine how far to implement without considering
both approaches in detail. With regard to your points below, I don't
think any of those points could be committed first.

Yeah, I think we need to decide on the desired feature set first, before we dig deeper into the patches. The design and implementation will fall out of that.

That said, there are a few small things that can be progressed regardless of the details of synchronous replication. There are the changes to trigger failover with a signal, and it seems that we'll need some libpq changes to allow acknowledgments to be sent back to the master regardless of the rest of the design. We can discuss those in separate threads in parallel.

So the big question is what the user interface looks like. How does one configure synchronous replication, and what options are available? Here's a list of features that have been discussed. We don't necessarily need all of them in the first phase, but let's avoid painting ourselves into a corner.

* Support multiple standbys with various synchronization levels.

* What happens if a synchronous standby isn't connected at the moment? Return immediately vs. wait forever.

* Per-transaction control. Some transactions are important, others are not.

* Quorum commit. Wait until n standbys acknowledge. n=1 and n=all servers can be seen as important special cases of this.

* async, recv, fsync and replay levels of synchronization.
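To make the quorum commit item concrete, here is a minimal sketch of the check the master would have to make before releasing a commit. Everything in it is an assumption for illustration: the function name, the use of plain integers to stand in for WAL positions, and the standby names are all hypothetical, not anything from the patches.

```python
def quorum_reached(acked_positions, commit_position, n):
    """Return True once at least n standbys have acknowledged the commit.

    acked_positions: dict mapping a (hypothetical) standby name to the
        highest WAL position it has acknowledged, as a plain integer.
    commit_position: WAL position of the commit record being waited on.
    n: number of acknowledgments required.
    """
    acks = sum(1 for pos in acked_positions.values() if pos >= commit_position)
    return acks >= n


# Hypothetical state: three standbys at different positions.
acked = {"reporting": 120, "uk-server": 90, "backup": 100}

any_one = quorum_reached(acked, 100, 1)                # the n=1 special case
all_servers = quorum_reached(acked, 100, len(acked))   # the n=all special case
```

With these numbers the n=1 case is satisfied and the n=all case is not, because 'uk-server' has not yet reached position 100.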

So what should the user interface be like? Given the first and second requirements, we need standby registration. If some standbys are important and others are not, the master needs to distinguish between them to be able to determine that a transaction is safely delivered to the important standbys.

For per-transaction control, ISTM it would be enough to have a simple user-settable GUC like synchronous_commit. Let's call it "synchronous_replication_commit" for now. For non-critical transactions, you can turn it off. That's very simple for developers to understand and use. I don't think we need more fine-grained control than that at transaction level: in all the use cases I can think of, you have a stream of important transactions mixed with non-important ones, like log messages, that you want to finish fast in a best-effort fashion. I'm actually tempted to tie that to the existing synchronous_commit GUC; the use case seems exactly the same.

OTOH, if we do want fine-grained per-transaction control, a simple boolean or even an enum GUC doesn't really cut it. For truly fine-grained control you want to be able to specify exceptions like "wait until this is replayed in slave named 'reporting'" or "don't wait for acknowledgment from slave named 'uk-server'". With standby registration, we can invent a syntax for specifying overriding rules in the transaction. Something like SET replication_exceptions = 'reporting=replay, uk-server=async'.
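As a rough illustration of how such a setting could be interpreted, here is a small sketch that turns a value like the one above into per-standby rules. The syntax is only the one floated in this mail, and the function name, the validation rules and the error handling are my assumptions, not an agreed design.

```python
# The four levels named earlier in this mail.
LEVELS = ("async", "recv", "fsync", "replay")


def parse_replication_exceptions(value):
    """Parse 'name=level, name=level' into a dict of standby name -> level.

    Hypothetical helper: rejects entries without '=', with an empty
    standby name, or with a level outside LEVELS.
    """
    rules = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty input and stray commas
        name, sep, level = item.partition("=")
        name, level = name.strip(), level.strip()
        if not sep or not name or level not in LEVELS:
            raise ValueError(f"invalid replication exception: {item!r}")
        rules[name] = level
    return rules


rules = parse_replication_exceptions("reporting=replay, uk-server=async")
```

Note that this only works if the standby names are known to the master, which is another way of saying it depends on standby registration.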

For the control between async/recv/fsync/replay, I like to think in terms of
a) asynchronous vs synchronous
b) if it's synchronous, how synchronous is it? recv, fsync or replay?

I think it makes most sense to set sync vs. async in the master, and the level of synchronicity in the slave. Although I have sympathy for the argument that it's simpler if you configure it all from the master side as well.
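A sketch of that split, under the assumption that the master contributes only a sync/async switch and each standby contributes its own level. The type and function names are hypothetical and only meant to show how the two settings would combine.

```python
from enum import Enum


class SyncLevel(Enum):
    """The levels discussed in this mail."""
    ASYNC = "async"
    RECV = "recv"
    FSYNC = "fsync"
    REPLAY = "replay"


def effective_level(master_synchronous, standby_level):
    """Combine the two settings into what the master actually waits for.

    a) The master decides whether to wait at all.
    b) If it does wait, the standby's own setting decides how far.
    """
    if not master_synchronous:
        return SyncLevel.ASYNC
    return standby_level


waits_for = effective_level(True, SyncLevel.FSYNC)
does_not_wait = effective_level(False, SyncLevel.REPLAY)
```

Configuring everything from the master side would collapse the two arguments into one per-standby setting held on the master.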

Putting all of that together, I think Fujii-san's standby.conf is pretty close. What it needs is the additional GUC for transaction-level control.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
