On Wed, Sep 2, 2015 at 6:56 PM, Bruce Momjian <br...@momjian.us> wrote: > On Wed, Sep 2, 2015 at 02:41:46PM -0400, Robert Haas wrote: >> 4. Therefore, I think that we should instead use logical replication, >> which might be either synchronous or asynchronous. When you modify >> one copy of the data, that change will then be replicated to all other >> nodes. If you are OK with eventual consistency, this replication can >> be asynchronous, and nodes that are off-line will catch up when they >> are on-line. If you are not OK with that, then you must replicate >> synchronously to every node before transaction commit; or at least you >> must replicate synchronously to every node that is currently on-line. >> This presents some challenges: logical decoding currently can't >> replicate transactions that are still in process - replication starts >> when the transaction commits. Also, we don't have any way for >> synchronous replication to wait for multiple nodes. But in theory >> those seem like limitations that can be lifted. Also, the GTM needs >> to be aware that this stuff is happening, or it will DTWT. That too >> seems like a problem that can be solved. > > Can you explain why logical replication is better than binary > replication for this use-case?
Uh, well, for the same reasons it is better in many other cases. Particularly, you probably don't want to replicate all the data on machine A to machine B, just some of it. Typically, sharding solutions store multiple copies of each piece of data. So let's say you have 4 machines. You divide the data into 12 chunks. Each machine is the write-master for 2 of those chunks, but has secondary copies of 3 others. So maybe things start out like this: machine #1: master for chunks 1, 2, 3; also has copies of chunks 4, 7, 10 machine #2: master for chunks 4, 5, 6; also has copies of chunks 1, 8, 11 machine #3: master for chunks 7, 8, 9; also has copies of chunks 2, 5, 12 machine #4: master for chunks 10, 11, 12; also has copies of chunks 3, 6, 9 If machine #1 is run over by a rabid triceratops, you can make machine #2 the master for chunk 1, machine #3 the master for chunk 2, and machine #4 the master for chunk 3. The write load therefore remains evenly divided. If you can only copy entire machines, you can't achieve that in this situation. I'm not saying that the above is exactly what we're going to end up with, or even necessarily close. But a big part of the point of sharding is that not all the machines have the same data - otherwise you are not write scaling. But it will frequently be the case, for various reasons, that they have *overlapping* sets of data. Logical replication can handle that; physical replication can't. In Postgres-XC, all tables are either sharded (part of the table is present on each node) or distributed (all of the table is present on every node). Clearly, there's no way to use physical replication in that scenario except if you are OK with having two copies of every node. But that's not a very good solution. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers