On Wed, Oct 20, 2021 at 7:02 PM Dilip Kumar <dilipbal...@gmail.com> wrote:
>
> > Actually, at least with the scenario I gave steps for, after looking
> > at it again and debugging, I think that the behavior is understandable
> > and not a bug.
> > The reason is that the INSERTed data is first published through the
> > partitions, since initially there is no partitioned table in the
> > publication (so publish_via_partition_root=true doesn't have any
> > effect). But then adding the partitioned table to the publication and
> > refreshing the publication in the subscriber, the data is then
> > published "using the identity and schema of the partitioned table" due
> > to publish_via_partition_root=true. Note that the corresponding table
> > in the subscriber may well be a non-partitioned table (or the
> > partitions arranged differently) so the data does need to be
> > replicated again.
>
> I don't think this behavior is consistent, I mean for the initial sync
> we will replicate the duplicate data, whereas for later streaming we
> will only replicate it once. From the user POV, this behavior doesn't
> look correct.
>

The scenario I gave steps for didn't have any table data when the
subscription was made, so the initial sync did not replicate any data.
I was referring to the double-publish that occurs when
publish_via_partition_root=true and then the partitioned table is
added to the publication and the subscriber does ALTER SUBSCRIPTION
... REFRESH PUBLICATION.

If I modify my example to include both the partitioned table and
(explicitly) its child partitions in the publication, and insert some
data on the publisher side prior to the subscription, then I am seeing
duplicate data on the initial sync on the subscriber side, and I would
agree that this doesn't seem correct.

Regards,
Greg Nancarrow
Fujitsu Australia
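
For reference, the modified scenario described above could be set up
roughly as follows. This is only a minimal sketch: the table,
publication and subscription names and the connection string are
hypothetical and are not the exact steps used in this thread.

```sql
-- Publisher side (all object names below are hypothetical).
CREATE TABLE sales (id int, region text) PARTITION BY LIST (region);
CREATE TABLE sales_east PARTITION OF sales FOR VALUES IN ('east');
CREATE TABLE sales_west PARTITION OF sales FOR VALUES IN ('west');

-- The publication names the partitioned table and, explicitly,
-- each of its child partitions.
CREATE PUBLICATION pub_sales FOR TABLE sales, sales_east, sales_west
    WITH (publish_via_partition_root = true);

-- Data exists on the publisher before the subscription is created,
-- so it is copied by the initial table sync.
INSERT INTO sales VALUES (1, 'east'), (2, 'west');

-- Subscriber side: create the same three tables first, then subscribe.
-- The connection string is an assumption for illustration.
CREATE SUBSCRIPTION sub_sales
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION pub_sales;

-- Once the initial sync has finished, compare this count with the
-- two rows inserted on the publisher.
SELECT count(*) FROM sales;
```

If the initial sync copies the rows once via the partitioned table and
again via each partition, the count on the subscriber will be higher
than the number of rows on the publisher.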