RE: pg_get_publication_tables() output duplicate relid

houzj.f...@fujitsu.com Mon, 22 Nov 2021 17:58:30 -0800

On Mon, Nov 22, 2021 6:48 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> On Mon, Nov 22, 2021 at 1:45 PM Amit Langote <amitlangot...@gmail.com>
> wrote:
> >
> > On Sat, Nov 20, 2021 at 8:31 PM Amit Kapila <amit.kapil...@gmail.com>
> wrote:
> > > On Fri, Nov 19, 2021 at 10:58 AM Amit Kapila <amit.kapil...@gmail.com>
> wrote:
> > > > On Fri, Nov 19, 2021 at 7:19 AM Amit Langote
> <amitlangot...@gmail.com> wrote:
> > > > > The problematic case is attaching the partition *after* the
> > > > > subscriber has already marked the root parent as synced and/or
> > > > > ready for replication.  Refreshing the subscription doesn't help
> > > > > it discover the newly attached partition, because a
> > > > > publish_via_partition_root only ever tells about the root
> > > > > parent, which would be already synced, so the subscriber would think
> > > > > there's nothing to copy.
> > > >
> > > > Okay, I see this could be a problem but I haven't tried to reproduce it.
> > > >
> > > > > > Anyway, if this is a problem
> > > > > > we need to figure the solution for this separately.
> > > > >
> > > > > Sure, we might need to do that after all.  Though it might be a
> > > > > good idea to be sure that we won't need to reconsider the fix we
> > > > > push for the issue(s) being discussed here and elsewhere,
> > > > > because I suspect that the solution to the problem I mentioned
> > > > > is likely to involve tweaking pg_publication_tables view output.
> > >
> > > I have thought about this problem and I see two possibilities for a
> > > solution (a) We could provide a new option say 'truncate' (something
> > > on lines proposed here [1]) which would truncate the table(s) and
> > > change its status to 'i' in the pg_subscription_rel, this would
> > > allow the newly added partition to be synced after refresh. This
> > > could lead to a large copy in such a case.
> >
> > Maybe I am missing something about the proposal, though I'd think a
> > more automatic solution would be better, something that doesn't need
> > to rely on an unrelated feature.
> >
> 
> Agreed, this was more of a workaround for users if we didn't get any automatic
> solution.
> 
> > > (b) We could somehow get and store all the partition info from the
> > > publisher-side on the subscriber-side while initial sync (say in new
> > > system table pg_subscription_rel_members). Now, after the refresh,
> > > if this list changes, we can allow to just get the data of that
> > > particular partition but I guess it would mean that we need to store
> > > oids of the publisher which might or might not be safe considering
> > > oids can wraparound before the refresh.
> > >
> > > Do you have any other ideas?
> >
> > I thought that the idea I had earlier mentioned at [1] may be useful,
> > which I can see is similar to your idea (b). I also suspect that it
> > can be implemented without needing a separate catalog and storing
> > publication-side relation OIDs in the subscription-side catalog,
> > though maybe I haven't thought hard enough.
> >
> 
> The problem with storing this info in pg_subscription_rel as you were
> describing in your proposal is that currently, we ensure that the same table
> exists in subscriber and then store the subscriber side table id in that 
> catalog. I
> am not sure if we can store publisher-side oids in that catalog and if we 
> store
> then it would be confusing as now it will have info of both publisher-side 
> oids
> and subscriber-side oids. Isn't that a problem with this approach?
> 
> One more problem with this overall approach is the risk of OID wraparound.
> Say between Create Subscription and Alter Subscription ..
> Refresh, we detach one partition, the oid wraparounds, and we create/reattach
> another partition which gets the same oid as for the partition which we
> detached earlier then we won't be able to identify the new partition even 
> after
> refresh.


If we use this approach, I think maybe we can store the publisher-side's table
name instead of oid. Because for non-partitioned table, it's possible that user
drop and create a new table with the same name in which case the oid would be
changed. And in this case, the existing behavior wouldn't sync the table again.
So I think it might be better to make the partitioned table's behavior
consistent with the non-partitioned table which only sync the table's data when
detect new table name.

Best regards,
Hou zj

RE: pg_get_publication_tables() output duplicate relid

Reply via email to