On Tue, May 3, 2022 at 2:15 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Mon, May 2, 2022 at 5:06 PM Masahiko Sawada <sawada.m...@gmail.com> wrote: > > > > On Mon, May 2, 2022 at 6:09 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > > > > > On Mon, May 2, 2022 at 11:47 AM Masahiko Sawada <sawada.m...@gmail.com> > > > wrote: > > > > > > > > > > > > Are you planning to support "Transaction dependency" Amit mentioned in > > > > his first mail in this patch? IIUC since the background apply worker > > > > applies the streamed changes as soon as receiving them from the main > > > > apply worker, a conflict that doesn't happen in the current streaming > > > > logical replication could happen. > > > > > > > > > > This patch seems to be waiting for stream_stop to finish, so I don't > > > see how the issues related to "Transaction dependency" can arise? What > > > type of conflict/issues you have in mind? > > > > Suppose we set both publisher and subscriber: > > > > On publisher: > > create table test (i int); > > insert into test values (0); > > create publication test_pub for table test; > > > > On subscriber: > > create table test (i int primary key); > > create subscription test_sub connection '...' publication test_pub; -- > > value 0 is replicated via initial sync > > > > Now, both 'test' tables have value 0. > > > > And suppose two concurrent transactions are executed on the publisher > > in following order: > > > > TX-1: > > begin; > > insert into test select generate_series(0, 10000); -- changes will be > > streamed; > > > > TX-2: > > begin; > > delete from test where c = 0; > > commit; > > > > TX-1: > > commit; > > > > With the current streaming logical replication, these changes will be > > applied successfully since the deletion is applied before the > > (streamed) insertion. Whereas with the apply bgworker, it fails due to > > an unique constraint violation since the insertion is applied first. > > I've confirmed that it happens with v5 patch. > > > > Good point but I am not completely sure if doing transaction > dependency tracking for such cases is really worth it. I feel for such > concurrent cases users can anyway now also get conflicts, it is just a > matter of timing. One more thing to check transaction dependency, we > might need to spill the data for streaming transactions in which case > we might lose all the benefits of doing it via a background worker. Do > we see any simple way to avoid this? >
Avoiding unexpected differences like this is why I suggested the option should have to be explicitly enabled instead of being on by default as it is in the current patch. See my review comment #14 [1]. It means the user won't have to change their existing code as a workaround. ------ [1] https://www.postgresql.org/message-id/CAHut%2BPuqYP5eD5wcSCtk%3Da6KuMjat2UCzqyGoE7sieCaBsVskQ%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia