On Wed, Oct 20, 2021 at 1:32 PM Dilip Kumar <dilipbal...@gmail.com> wrote:
>
> On Wed, Oct 20, 2021 at 12:44 PM Greg Nancarrow <gregn4...@gmail.com> wrote:
> >
> > On Mon, Oct 18, 2021 at 5:00 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > >
> > > > I have not debugged it yet to find out why, but with the patch
> > > > applied, the original double-publish problem that I reported
> > > > (converted to just use TABLE rather than ALL TABLES IN SCHEMA) still
> > > > occurs.
> > > >
> > >
> > > Yeah, I think this is a variant of the problem being fixed by
> > > Hou-San's patch. I think one possible idea to investigate is that on
> > > the subscriber-side, after fetching tables, we check the already
> > > subscribed tables and if the child tables already exist then we ignore
> > > the parent table and vice versa. We might want to consider the case
> > > where a user has toggled the "publish_via_partition_root" parameter.
> > >
> > > It seems both these behaviours/problems exist since commit 17b9e7f9
> > > (Support adding partitioned tables to publication). Adding Amit L and
> > > Peter E (people involved in this work) to know their opinion?
> > >
> >
> > Actually, at least with the scenario I gave steps for, after looking
> > at it again and debugging, I think that the behavior is understandable
> > and not a bug.
> > The reason is that the INSERTed data is first published through the
> > partitions, since initially there is no partitioned table in the
> > publication (so publish_via_partition_root=true doesn't have any
> > effect). But then adding the partitioned table to the publication and
> > refreshing the publication in the subscriber, the data is then
> > published "using the identity and schema of the partitioned table" due
> > to publish_via_partition_root=true. Note that the corresponding table
> > in the subscriber may well be a non-partitioned table (or the
> > partitions arranged differently) so the data does need to be
> > replicated again.
>
> Even if the partitions are arranged differently why would the user
> expect the same data to be replicated twice?
> I don't think this behavior is consistent, I mean for the initial sync
> we will replicate the duplicate data, whereas for later streaming we
> will only replicate it once. From the user POV, this behavior doesn't
> look correct.
>

+1.

--
With Regards,
Amit Kapila.
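
For illustration, the subscriber-side check suggested upthread (after fetching tables, ignore the parent if its child tables are already subscribed, and vice versa) might be sketched roughly as below. This is plain Python, not PostgreSQL code: the function name, its arguments, and the partition map are all hypothetical, and the sketch does not attempt to handle the case where "publish_via_partition_root" has been toggled.

```python
# Illustrative sketch only; every name here is hypothetical, not a PostgreSQL API.

def filter_fetched_tables(fetched, subscribed, parent_of):
    """Return the fetched tables that should be newly added to the subscription.

    fetched    -- tables reported by the publisher on refresh
    subscribed -- tables the subscription already contains
    parent_of  -- maps each partition to its direct parent table
    """
    def ancestors(table):
        # Walk up the partition hierarchy from `table` towards the root.
        seen = set()
        while table in parent_of and parent_of[table] not in seen:
            table = parent_of[table]
            seen.add(table)
        return seen

    subscribed = set(subscribed)

    # Every ancestor of a table that is already subscribed.
    covered_parents = set()
    for t in subscribed:
        covered_parents |= ancestors(t)

    result = []
    for t in fetched:
        if t in subscribed:
            continue  # already subscribed, nothing to do
        if t in covered_parents:
            continue  # a child is already subscribed: ignore the parent
        if ancestors(t) & subscribed:
            continue  # an ancestor is already subscribed: ignore the child
        result.append(t)
    return result
```

With partitions part1 and part2 of a table named root already subscribed, a refresh that fetches root would add nothing under this rule, so the rows already copied via the partitions would not be copied a second time by the initial sync.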