Re: Logical replication performance

Flavio Henrique Araque Gurgel Fri, 29 Nov 2019 08:27:09 -0800

Em sex., 29 de nov. de 2019 às 17:06, Florian Philippon <
florian.philip...@doctolib.com> escreveu:


> Hello community!
>
> Hi Florian


> We are currently testing PostgreSQL 11's built-in logical replication. We
> are trying to initialize a subscriber (from scratch) from a publisher with
> a large database (+6TB) with around 220 tables.
>
> We tweaked the configuration parameters below, both on publisher and
> subscriber, in order to minimize the initial copy data phase delay:
>
> - max_replication_slots
> - max_wal_senders
> - max_wal_size
> - max_worker_processes
> - max_logical_replication_workers
> - max_sync_workers_per_subscription
> - max_worker_processes
>
> The two PostgreSQL instances are using the same hardware: 48 vCPU, 384 GB
> ram, 10GB network and same version of software (PostgreSQL 11.6).
>
> We pre-loaded the full schema of the database (with indexes and
> constraints) on the subscriber since it's mandatory to have the logical
> replication working.
>
> However, the initial copy data phase is quite long (+2 days and still
> running) for largest tables in the database. There is no load on the
> publisher since it's a staging environment.
> We noticed that logical replication workers processes on the subscriber
> can reach more than 90% CPU usage per worker.
>
> We understand that we cannot have more than one worker per table running
> but we would like to know if there is anything that could help us to
> achieve this initial copy phase more quickly.
>
> We tried another solution: we loaded a minimal schema (without indexes and
> constraints) on the subscriber and created the subscription. The initial
> copy phase was way faster (a few hours). Then we created indexes and
> constraints. Is this a suitable solution for production? Will the logical
> replication flow be buffered by the replication slots during index creation
> and get in sync afterwards or will it conflict due to locking issues?
>
>
You can try the pg_dump over a snapshot and use parallel restore
(pg_restore -j option) to your initial data load, it should be much faster
than an initial sync. Take a look here:
https://www.postgresql.org/docs/11/logicaldecoding-explanation.html#id-1.8.14.8.5

Best,
Flavio

Re: Logical replication performance

Reply via email to