On Sep 14, 2021, at 1:42 AM, Tomas Pospisek <t...@sourcepole.ch> wrote: > > I'm potentiall facing the same problem and would be interested in the > solution. Is there any particular howto you followed?
I used https://hevodata.com/learn/postgresql-logical-replication/ <https://hevodata.com/learn/postgresql-logical-replication/>, but I rather suspect there are better options, as I had a number of issues with that one. Essentially, it seems to boil down to the following steps: 1) Set up a new “slave” database. I used a pg_dump -s command piped through to psql to transfer schema only to the new DB 2) Configure both master and slave postgresql servers with wal_level=logical in the postgresql.conf file. Will require a server restart, which stinks if you need absolutely 0 downtime, but ideally should only take a few seconds. 3) On the master server, run the SQL command CREATE PUBLICATION publication_name FOR ALL TABLES; Presumably this should be issued while in the database you wish to replicate, and you can specify specific tables only if desired (though I haven’t looked into how). There is a note in that how-to about the published table needing a REPLICA IDENTITY to replicate DELETE and UPDATE operations, I still need to look into that more. For my case, however, I have an insert-only workflow, so it wasn’t an issue. 4) Finally, on the slave server, run the command CREATE SUBSCRIPTION subscription_name CONNECTION '...<connection string> ...' PUBLICATION my_publication; where connection string is something like: ‘host=my.remote.host port=5432 dbname=my_big_db' ...And that’s it, replication should kick off by syncing up the data from master to slave, and after that changes should be sent in real time. One gotcha that I ran into: my table used the postgis extension, which was set up on the slave DB when I ran the pg_dump commands. Creating that extension creates *and populates* a table named “spatial_ref_sys”. Since I had created the publication as “FOR ALL TABLES”, the replication tried to include this table. Which failed due to UNIQUE constraint violations. At which point replication stopped AS DID ALL INSERTS INTO THE MASTER DB! I deleted all records from said table on the slave DB, replication kicked in and re-populated it, and life went on, but I lost something like 14 hours of data thanks to this mistake. Guess that’s what I get for not keeping an eye on the process :-( Lesson learned: make sure any tables to be replicated are empty in the slave before issuing the CREATE SUBSCRIPTION command. For cutover, I plan the following, we’ll see how it goes: 1) move all read-only processes over to the new server. That can be done at any point after the above, with little or no downtime depending on your processes. 2) Reconfigure writing process to point to the slave, and stop them 3) Issue a DROP SUBSCRIPTION subscription_name on the slave, thereby “promoting” it to master 4) Restart writing processes, now pointed to the new master Steps 2-4 should happen in quick succession, resulting in only seconds of downtime. At least, that’s the theory. We’ll see how it goes (or if anyone else here can offer a better procedure!) --- Israel Brewster Software Engineer Alaska Volcano Observatory Geophysical Institute - UAF 2156 Koyukuk Drive Fairbanks AK 99775-7320 Work: 907-474-5172 cell: 907-328-9145 > > Also at some point I'd like to cut of the link between the two DBs promote > the copy to be the master and delete the original DB. Have you figured out > the correct step for the cut-over to happen? > *t > > On 13.09.21 23:10, Israel Brewster wrote: >> Ok, I have logical replication up-and-running (I guess - seemed to simple to >> be working. Shouldn’t it be complicated, requiring many steps and >> configuration changes?), maxing out one CPU core on each machine (more or >> less), and showing network throughput of around 15M. If DU changes are to be >> believed, it’s transferring data at about 0.8GB/minute, implying something >> like a 8 hour transfer time. >> Of course, since it is replication, it has the benefit that any data that >> comes in during that 8 hour window should also be replicated, after which >> the two systems should remain in sync allowing for zero (or nearly so) >> downtime cutover. Which is nice. >> Any gotchas I need to be aware of during this initial transfer window, such >> as WAL files building up on the source machine? >> --- >> Israel Brewster >> Software Engineer >> Alaska Volcano Observatory >> Geophysical Institute - UAF >> 2156 Koyukuk Drive >> Fairbanks AK 99775-7320 >> Work: 907-474-5172 >> cell: 907-328-9145 >>> On Sep 13, 2021, at 10:10 AM, Michael Lewis <mle...@entrata.com >>> <mailto:mle...@entrata.com> <mailto:mle...@entrata.com >>> <mailto:mle...@entrata.com>>> wrote: >>> >>> What version of Postgres is the source? Can you make use of logical >>> replication?