On 2017-02-11 11:16, Erik Rijkers wrote:
On 2017-02-08 23:25, Petr Jelinek wrote:

0001-Use-asynchronous-connect-API-in-libpqwalreceiver-v2.patch
0002-Always-initialize-stringinfo-buffers-in-walsender-v2.patch
0003-Fix-after-trigger-execution-in-logical-replication-v2.patch
0004-Add-RENAME-support-for-PUBLICATIONs-and-SUBSCRIPTION-v2.patch
0001-Logical-replication-support-for-initial-data-copy-v4.patch

This often works, but it also fails far too often (in my hands).  I
test whether the tables are identical by comparing an md5 of an
ordered result set from both replica and master.  I estimate that 1 in
5 tries fails, 'fail' meaning a somewhat different table on the replica
(compared to master), most often pgbench_accounts (typically there are
10-30 differing rows).  No errors or warnings in either logfile.  I'm
not sure, but I think testing on faster machines seems to do
somewhat better ('better' meaning fewer replication errors).
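
For illustration, the comparison described above could be sketched roughly as follows. This is not the actual test script: the helper names table_digest and differing_rows are hypothetical, and the rows are assumed to have already been fetched from each server as tuples. Sorting in the client removes any dependence on the order in which each server returned the rows.

```python
import hashlib


def table_digest(rows):
    """Return the md5 hex digest of a table's rows in sorted order.

    `rows` is any iterable of tuples, e.g. the result of SELECT * on one
    server. NULLs (None) are assumed absent, since mixing None with other
    values would make the sort fail.
    """
    h = hashlib.md5()
    for row in sorted(rows):
        line = "\t".join(str(col) for col in row) + "\n"
        h.update(line.encode("utf-8"))
    return h.hexdigest()


def differing_rows(master_rows, replica_rows):
    """Return the rows that appear on only one of the two sides."""
    return sorted(set(master_rows) ^ set(replica_rows))


# Made-up sample data standing in for pgbench_accounts on each server.
master = [(1, 100), (2, 250), (3, -40)]
replica = [(3, -40), (1, 100), (2, 0)]

identical = table_digest(master) == table_digest(replica)
mismatches = differing_rows(master, replica)
```

When the digests differ, listing the mismatching rows is what gives a count like the 10-30 differing rows mentioned above.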


I have noticed that when I insert a wait of a few seconds after the create subscription (or actually, after enabling the subscription), the problem does not occur. Apparently (I assume) the initial snapshot is taken at some point after the subsequent pgbench run has already started, so that logical replication also starts somewhere 'into' that pgbench run. Does that make sense?
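
A fixed sleep could in principle be replaced by polling until the subscription reports that it is ready. The sketch below shows only the generic polling loop; wait_until and its parameters are hypothetical names, and the readiness check is left to the caller. One possible check, which is an assumption here and not something verified against this patch version, would be a query on the subscriber against pg_subscription_rel to see whether every table's srsubstate has reached 'r', as in released PostgreSQL versions.

```python
import time


def wait_until(check, timeout=30.0, interval=0.5,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() repeatedly until it returns a true value or time runs out.

    Returns True if check() succeeded, False if `timeout` seconds passed.
    check() is always called at least once. `sleep` and `clock` are
    injectable so the loop can be exercised without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Stand-in for a real readiness check: reports ready on the third call.
attempts = []


def fake_ready():
    attempts.append(1)
    return len(attempts) >= 3


ready = wait_until(fake_ready, timeout=5.0, interval=0.0)
```

In a test script, the pgbench run would be started only once wait_until returns True.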

I don't know what to make of it. Now that I think I understand what happens, I hesitate to call it a bug. But I'd say it's still a usability problem that the subscription is only 'valid' after some time, even if it's only a few seconds.

(The other problem I mentioned, drop subscription hanging, still happens every now and then.)

