On Tuesday, October 19, 2021 10:47 AM houzj.f...@fujitsu.com 
<houzj.f...@fujitsu.com> wrote:
> 
> On Monday, October 18, 2021 5:03 PM Amit Langote
> <amitlangot...@gmail.com> wrote:
> > I can imagine that the behavior seen here may look surprising, but not
> > sure if I would call it a bug as such.  I do remember thinking about
> > this case and the current behavior is how I may have coded it to be.
> >
> > Looking at this command in Hou-san's email:
> >
> >   create publication pub for table tbl1, tbl1_part1 with
> > (publish_via_partition_root=on);
> >
> > It's adding both the root partitioned table and the leaf partition
> > *explicitly*, and it's not clear to me if the latter's inclusion in
> > the publication should be assumed because the former is found to have
> > been added to the publication, that is, as far as the latter's
> > visibility to the subscriber is concerned.  It's not a stretch to
> > imagine that a user may write the command this way to account for a
> > subscriber node on which tbl1 and tbl1_part1 are unrelated tables.
> >
> > I don't think we assume anything on the publisher side regarding the
> > state/configuration of tables on the subscriber side, at least with
> > publication commands where tables are added to a publication
> > explicitly, so it is up to the user to make sure that the tables are
> > not added duplicatively.  One may however argue that the way we've
> > decided to handle FOR ALL TABLES does assume something about
> > partitions where it skips advertising them to subscribers when
> > publish_via_partition_root flag is set to true, but that is exactly to
> > avoid the duplication of data that goes to a subscriber.
> 
> Hi,
> 
> Thanks for the explanation.
> 
> I think one reason that I consider this behavior a bug is that: If we add
> both the root partitioned table and the leaf partition explicitly to the
> publication (and set publish_via_partition_root = on), the behavior of the
> apply worker is inconsistent with the behavior of table sync worker.
> 
> In this case, all changes in the leaf partition will be applied using the
> identity and schema of the partitioned (root) table. But for the table sync, it
> will execute table sync for both the leaf and the root table, which causes
> duplication of data.
> 
> Wouldn't it be better to make the behavior consistent here ?
> 

I agree with this point. 

About this case,

> >   create publication pub for table tbl1, tbl1_part1 with
> > (publish_via_partition_root=on);

Speaking as a user: although a partitioned table already includes its
partitions, publishing both the partitioned table and one of its partitions
explicitly is allowed. So I think we should take this case into consideration.
Copying the initial data only once, via the parent table, seems reasonable.
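
To make the idea concrete, here is a rough sketch of the rule I have in mind
for choosing which relations get an initial table sync. It is only a toy model
in Python, not PostgreSQL code: the function name tables_to_sync and its
arguments are made up for illustration, and the partition tree is passed in as
a plain dict rather than read from the catalogs.

```python
# Toy model only: none of these names exist in PostgreSQL.

def tables_to_sync(published, parent_of, via_root):
    """Return the relations that should get an initial table sync.

    published -- set of relation names explicitly added to the publication
    parent_of -- dict mapping each partition to its direct parent
    via_root  -- value of publish_via_partition_root
    """
    if not via_root:
        # Without publish_via_partition_root, every published relation
        # is synced under its own identity.
        return set(published)

    result = set()
    for rel in published:
        # Walk up the partition tree looking for a published ancestor.
        ancestor = parent_of.get(rel)
        covered = False
        while ancestor is not None:
            if ancestor in published:
                covered = True
                break
            ancestor = parent_of.get(ancestor)
        # A partition whose ancestor is also published is skipped, so its
        # rows are copied only once, through that ancestor.
        if not covered:
            result.add(rel)
    return result


parent_of = {"tbl1_part1": "tbl1"}
published = {"tbl1", "tbl1_part1"}
print(sorted(tables_to_sync(published, parent_of, via_root=True)))
print(sorted(tables_to_sync(published, parent_of, via_root=False)))
```

With publish_via_partition_root on, only tbl1 is selected in this model, which
would match what the apply worker already does for ongoing changes.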

Regards
Shi yu
