On Wed, Sep 2, 2015 at 09:03:25PM -0400, Robert Haas wrote: > > Can you explain why logical replication is better than binary > > replication for this use-case? > > Uh, well, for the same reasons it is better in many other cases. > Particularly, you probably don't want to replicate all the data on > machine A to machine B, just some of it. > > Typically, sharding solutions store multiple copies of each piece of > data. So let's say you have 4 machines. You divide the data into 12 > chunks. Each machine is the write-master for 2 of those chunks, but > has secondary copies of 3 others. So maybe things start out like > this: > > machine #1: master for chunks 1, 2, 3; also has copies of chunks 4, 7, 10 > machine #2: master for chunks 4, 5, 6; also has copies of chunks 1, 8, 11 > machine #3: master for chunks 7, 8, 9; also has copies of chunks 2, 5, 12 > machine #4: master for chunks 10, 11, 12; also has copies of chunks 3, 6, 9 > > If machine #1 is run over by a rabid triceratops, you can make machine > #2 the master for chunk 1, machine #3 the master for chunk 2, and > machine #4 the master for chunk 3. The write load therefore remains > evenly divided. If you can only copy entire machines, you can't > achieve that in this situation.
I see the advantage of this now. My original idea is that each shard would have its own standby for disaster recovery, but your approach above, which I know is typical, allows the shards to back up each other. You could say shard 2 is the backup for shard 1, but then if shard one goes bad, the entire workload of shard 1 goes to shard 2. With the above approach, the load of shard 1 is shared by all the shards. -- Bruce Momjian <br...@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers