On Tuesday, October 19, 2021 10:47 AM houzj.f...@fujitsu.com <houzj.f...@fujitsu.com> wrote:
>
> On Monday, October 18, 2021 5:03 PM Amit Langote <amitlangot...@gmail.com> wrote:
> > I can imagine that the behavior seen here may look surprising, but not
> > sure if I would call it a bug as such. I do remember thinking about
> > this case and the current behavior is how I may have coded it to be.
> >
> > Looking at this command in Hou-san's email:
> >
> > create publication pub for table tbl1, tbl1_part1 with
> > (publish_via_partition_root=on);
> >
> > It's adding both the root partitioned table and the leaf partition
> > *explicitly*, and it's not clear to me if the latter's inclusion in
> > the publication should be assumed because the former is found to have
> > been added to the publication, that is, as far as the latter's
> > visibility to the subscriber is concerned. It's not a stretch to
> > imagine that a user may write the command this way to account for a
> > subscriber node on which tbl1 and tbl1_part1 are unrelated tables.
> >
> > I don't think we assume anything on the publisher side regarding the
> > state/configuration of tables on the subscriber side, at least with
> > publication commands where tables are added to a publication
> > explicitly, so it is up to the user to make sure that the tables are
> > not added duplicatively. One may however argue that the way we've
> > decided to handle FOR ALL TABLES does assume something about
> > partitions where it skips advertising them to subscribers when
> > publish_via_partition_root flag is set to true, but that is exactly to
> > avoid the duplication of data that goes to a subscriber.
>
> Hi,
>
> Thanks for the explanation.
>
> I think one reason that I consider this behavior a bug is that: If we add
> both the root partitioned table and the leaf partition explicitly to the
> publication (and set publish_via_partition_root = on), the behavior of the
> apply worker is inconsistent with the behavior of the table sync worker.
>
> In this case, all changes in the leaf partition will be applied using the
> identity and schema of the partitioned (root) table. But for the table sync, it
> will execute table sync for both the leaf and the root table, which causes
> duplication of data.
>
> Wouldn't it be better to make the behavior consistent here?
>
I agree with this point.

About this case,

> > create publication pub for table tbl1, tbl1_part1 with
> > (publish_via_partition_root=on);

From a user's point of view, although a partitioned table already includes
its partitions, publishing both the partitioned table and one of its
partitions explicitly is allowed. So I think we should take this case into
consideration. Copying the initial data only once, via the parent table,
seems reasonable.

Regards
Shi yu
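
To illustrate the behavior suggested above (initial data copied only once, via
the parent table), here is a minimal sketch in Python. It is a toy model, not
PostgreSQL code: the function tables_to_sync and its parameters are
hypothetical names made up for this illustration, and it assumes the rule is
simply "skip a published partition if one of its ancestors is also published
and publish_via_partition_root is on".

```python
# Toy model only: not PostgreSQL code; all names here are hypothetical.

def tables_to_sync(published, parent_of, via_root):
    """Return the tables whose initial data would be copied.

    published: tables named explicitly in the publication
    parent_of: maps each partition to its direct parent table
    via_root:  value of publish_via_partition_root
    """
    published_set = set(published)
    if not via_root:
        # Without the option, every published table is synced as itself.
        return sorted(published_set)

    result = set()
    for table in published_set:
        # Walk up the partition tree looking for a published ancestor.
        ancestor = parent_of.get(table)
        covered = False
        while ancestor is not None:
            if ancestor in published_set:
                covered = True
                break
            ancestor = parent_of.get(ancestor)
        # A partition covered by a published ancestor is skipped, so its
        # rows are copied only once, through that ancestor.
        if not covered:
            result.add(table)
    return sorted(result)


parent_of = {"tbl1_part1": "tbl1"}
print(tables_to_sync(["tbl1", "tbl1_part1"], parent_of, via_root=True))
# prints ['tbl1']
print(tables_to_sync(["tbl1", "tbl1_part1"], parent_of, via_root=False))
# prints ['tbl1', 'tbl1_part1']
```

With the option on, only tbl1 is left for the initial copy, which would match
how the apply worker already treats changes to tbl1_part1 in this case.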