Hi,

Simon Riggs wrote:
> On Sat, 2008-12-13 at 14:07 +0100, Markus Wanner wrote:
>> Speaking of a "synchronous commit"
>> is utterly misleading, because the commit itself is exactly the thing
>> that's *not* synchronous.
>
> Not really sure where you're going here.

I'm pointing to a potential misunderstanding, trying to help prevent you from running into the same issues and discussions as I did.

I've learned the hard way that the Postgres-R algorithm is not fully synchronous (in the strict sense). This caused confusion for people who take the word "synchronous" in its original meaning. The algorithm proposed here seems similar enough to potentially cause the same confusion.

As I see it now, I think it's well worth pointing out the difference, from both the technical and the marketing perspective. The former for better understanding, the latter to prevent users from thinking it must be slow by definition. Arguing that your approach is not fully synchronous definitely helps in countering that concern.

However, I'm just now realizing that the difference is only relevant as soon as you begin to allow read-only access on the slave. AFAIK that's among the goals of this effort, no?

> "synchronous replication" is
> used exactly as described in the Wikipedia entry here:
> http://en.wikipedia.org/wiki/Database_replication

That article describes pretty much all variants of replication. What exactly are you referring to?

Under "Database Replication > Multi-Master replication" it describes eager vs lazy variants, which is IMO a more appropriate and useful distinction than sync vs async. (But that's admittedly a sentence I've contributed myself, IIRC).

Under "Storage Replication > Synchronous Replication" one can read: "Write is not considered complete until acknowledgement by both local and remote storage." For the proposed approach this might hold true for WAL writing. However, the user certainly doesn't care how synchronously the log is shipped or written, as long as she doesn't see the changes on the slave. That's the difference between fully synchronous and eager (or virtually or approximately synchronous) algorithms. You seem to refer to both as "synchronous".
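
To illustrate the distinction, here is a minimal sketch in Python. It is a toy model only, not Postgres or Postgres-R code, and every class, method and mode name in it is a hypothetical chosen for the example. It assumes the slave acknowledges a record as soon as it is logged, and that "eager" and "sync" differ only in whether the master also waits for the slave to apply the record before acknowledging the client.

```python
class Slave:
    """Toy slave: a received log plus the state visible to read-only queries."""

    def __init__(self):
        self.wal = []        # records received and acknowledged, not yet applied
        self.visible = {}    # what a read-only query on the slave can see

    def receive(self, record):
        # Acknowledge as soon as the record is logged; nothing is visible yet.
        self.wal.append(record)
        return True

    def apply_pending(self):
        # Replay the logged records, making them visible to readers.
        for key, value in self.wal:
            self.visible[key] = value
        self.wal.clear()

    def read(self, key):
        return self.visible.get(key)


class Master:
    """Toy master that ships each commit to a single slave."""

    def __init__(self, slave):
        self.slave = slave
        self.visible = {}

    def commit(self, key, value, mode):
        # Both modes wait for the slave to acknowledge the log record.
        acked = self.slave.receive((key, value))
        if mode == "sync":
            # Fully synchronous: also wait until the slave has applied it.
            self.slave.apply_pending()
        self.visible[key] = value
        return acked   # commit acknowledgement returned to the client
```

In this model, a client that commits in "eager" mode and then immediately reads from the slave does not see its own change until `apply_pending()` has run, although the commit was acknowledged and the record is safe in the slave's log. In "sync" mode the change is visible on the slave by the time the commit returns.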

Phrases like "synchronous commit" or "synchronous data transfer" do not help me to understand what exactly you are talking about. Explaining that the slave commits (and therefore makes the transactions visible) asynchronously would help. And it would prevent disappointment for users who expect changes to be immediately visible on the slave.

> No two word phrase is going to accurately sum up the complexity and
> potential for data loss in these situations. DRBD saw that too and just
> called them A, B and C and then describe them more accurately.

Agreed. I've chosen lazy, eager and sync, so far. I'm open to better terms, and I leave it up to you to call your variants whatever you like. But to understand what you are talking about, I'd prefer these distinctions to be stated crisply and clearly.

> But I don't think we should say "PostgreSQL just implemented algorithm
> B" which is just unhelpful. I don't think its "marketing" to refer to it
> by the phrase most commonly used for the technology we are building.

I certainly agree with using such terms. Unfortunately, in my experience, synchronous replication is commonly used to mean that transactions are guaranteed to be immediately visible on remote nodes after the client got commit acknowledgment. That's the cause of confusion I'm envisioning.

I'm hoping to be somewhat helpful to this effort of getting a log shipping replication variant into Postgres. It can only be beneficial for Postgres-R in that we gain field experience with ..uhm.. this special kind of replication, however we name it.

I'm already on xmas vacation, so I won't bother you any further on this issue. Have fun coding and make sure to enjoy this time of the year.

All the best.

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers