It has been 4 hours and it is safe to say that the measures we took have had
a huge positive effect: replication is more than 30 times faster, with no
noticeable impact on the running Primary at all. A 20GB table is now
replicated in under 10 minutes.

What we changed:

- We removed all indices except the PK and unique ones from the large tables.
- We changed the query on the queue table to add the 'SKIP LOCKED' clause
  (see the sketch below).
- We took a per-table approach for the larger tables.
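For reference, the dequeue query now looks roughly like this (simplified;
our actual table and column names differ):

    -- take the next unclaimed row, skipping rows other workers have locked
    SELECT id, payload
    FROM job_queue
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED;

Without SKIP LOCKED, every worker queues up behind whichever row is already
locked; with it, each worker simply takes the next available row.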
I think the indices had the most significant impact, but I am not sure how
to prove this since we made multiple changes at the same time.

Thanks again for the tips!

-- Lars

On Thu, Dec 10, 2020 at 9:12 AM Lars Vonk <lars.v...@gmail.com> wrote:

> Hi,
>
>> - on the receiving side, avoid creating indexes on the tables: create
>> just the necessary PK or UK, wait for the initial load to complete and
>> then add all the remaining ones
>
> Thanks, this is a good tip. We are going to add this.
>
> We also noticed that the code getting the next row from the "queue" was
> doing the query with 'select for update', but without 'SKIP LOCKED'. This
> probably also caused more wait time, since the server is busier than usual
> during the replication. So we are going to add this and try again. We are
> also minimizing load on the queue during the initial replication.
>
> On to the next try.
>
> Lars
>
> On Wed, Dec 9, 2020 at 6:45 PM Victor Yegorov <vyego...@gmail.com> wrote:
>
>> On Wed, Dec 9, 2020 at 10:21 AM Lars Vonk <lars.v...@gmail.com> wrote:
>>
>>> We are doing a logical Postgres replication from Postgres 11 to 12.
>>> Our database is around 700GB (8 CPUs, 32GB RAM).
>>> During the replication process, at some point, we see a huge
>>> performance penalty on a particular table. This table acts as a queue,
>>> with lots of inserts and deletes happening throughout the day. Most of
>>> the time this table is empty, but during this performance penalty the
>>> number of rows in it grows to 10,000, and processing is not fast enough
>>> to empty it. The main reason for this (as far as we can see) is that
>>> the performance of the query selecting the next row to process drops
>>> from < 10 ms to 400 ms. This eventually causes too much CPU load on the
>>> Primary and we have to cancel the replication process.
>>>
>>> We already tried the initial load three times, and it consistently
>>> fails with the same "error". The last try was a per-table approach,
>>> excluding this "queue" table.
>>> After cancelling the replication the query is fast again and the load
>>> on the Primary goes back to normal. We see that this happens when
>>> replicating large tables (> millions of rows). During this performance
>>> penalty, the EXPLAIN of the query selecting the next row from this
>>> table shows it is doing a sequential scan (there is an index, but it is
>>> not used).
>>>
>>> - What could cause this performance penalty?
>>> - Is this something other people have experienced as well during the
>>> initial load of a logical replication with large tables?
>>> - We are now thinking of temporarily increasing the number of CPUs and
>>> the RAM for the migration. Would this help in this case?
>>
>> I've seen similar symptoms in cases with (a) home-made queues in tables
>> and (b) long transactions.
>> Unfortunately, a queue requires frequent vacuuming to preserve a more or
>> less constant size of the queue and its indexes.
>> And long transactions prevent the vacuum from cleaning up the queue. The
>> initial synchronization phase of logical replication is in fact such a
>> transaction.
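>>
>> You can spot such long-running transactions on the primary with
>> something like this (a rough check; adjust the threshold to taste):
>>
>>     SELECT pid, now() - xact_start AS xact_age, state, query
>>     FROM pg_stat_activity
>>     WHERE xact_start IS NOT NULL
>>       AND now() - xact_start > interval '15 minutes'
>>     ORDER BY xact_start;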
>>
>> I would recommend doing the following:
>> - avoid adding ALL tables to the publication
>> - instead, split the tables into batches in such a way that processing
>> an initial batch takes limited time (say, 15-30 minutes at most); see
>> the sketch at the very bottom of this mail
>> - of course, this leaves out the biggest tables; add those one by one
>> to the publication, preferably in a time slot with minimal load on the
>> queue
>> - make sure to catch up on the queue processing and vacuum it between
>> batches
>> - on the receiving side, avoid creating indexes on the tables: create
>> just the necessary PK or UK, wait for the initial load to complete and
>> then add all the remaining ones
>>
>> As for the queue, PGQ from Skytools uses a different approach to
>> maintain queue tables:
>> - once in a while (every 2 hours by default) processing is switched to
>> a new table; tab_1, tab_2 and tab_3 are used in a round-robin fashion
>> - after the switch, any remaining entries can be moved from the
>> previous table to the live one (this shouldn't be necessary if the
>> switch is done properly, although it might be tricky in the presence of
>> long transactions)
>> - the previous table is TRUNCATEd
>>
>> In your case, you can do `VACUUM FULL` between replicating each batch
>> of tables.
>>
>> --
>> Victor Yegorov
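A rough sketch of the batch-by-batch approach described above (publication,
subscription, connection string and table names are all illustrative):

    -- on the publisher: start with a batch of small tables
    CREATE PUBLICATION migration_pub FOR TABLE small_a, small_b, small_c;

    -- on the subscriber
    CREATE SUBSCRIPTION migration_sub
        CONNECTION 'host=primary dbname=app user=replicator'
        PUBLICATION migration_pub;

    -- once a batch has caught up (and the queue has been processed and
    -- vacuumed): add the next batch or a single big table on the publisher
    ALTER PUBLICATION migration_pub ADD TABLE big_table_1;

    -- then let the subscriber start the initial copy of the new table
    ALTER SUBSCRIPTION migration_sub REFRESH PUBLICATION;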