On Wed, Oct 13, 2021 at 9:11 AM Amit Langote <amitlangot...@gmail.com> wrote: > > Hi Amit, > > On Fri, Oct 8, 2021 at 12:47 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Thu, Oct 7, 2021 at 12:39 PM Amit Langote <amitlangot...@gmail.com> > > wrote: > > > Sorry that I didn't comment on this earlier, but I think either > > > GetPubPartitionOptionRelations() or InvalidatePublicationRels() > > > introduced in the commit 4548c76738b should lock the partitions, just > > > like to the parent partitioned table would be, before invalidating > > > them. There may be some hazards to invalidating a relation without > > > locking it. > > > > I see your point but then on the same lines didn't the existing code > > "for all tables" case (where we call CacheInvalidateRelcacheAll() > > without locking all relations) have a similar problem. > > There might be. I checked to see how other callers/modules use > CacheInvalidateRelcacheAll(), though it seems that only the functions > in publicationcmds.c use it or really was invented in 665d1fad99e for > use by publication commands. > > Maybe I need to look harder than I've done for any examples of hazard. > > > Also, in your > > patch, you are assuming that the callers of GetPublicationRelations() > > will lock all the relations but what when it gets called from > > AlterPublicationOptions()? > > Ah, my bad. I hadn't noticed that one for some reason. > > Now that you mention it, I do find it somewhat concerning (on the > similar grounds as what prompted my previous email) that > AlterPublicationOptions() does away with any locking on the affected > relations. > > Anyway, I'll think a bit more about the possible hazards of not doing > the locking and will reply again if there's indeed a problem(s) that > needs to be fixed. >
I think you can try to reproduce the problem via the debugger. You can stop before calling GetPubPartitionOptionRelations in publication_add_relation() in session-1 and then from another session (say session-2) try to delete one of the partition table (without replica identity). Then stop in session-2 somewhere after acquiring lock to the corresponding partition relation. Now, continue in session-1 and invalidate the rels and let it complete the command. I think session-2 will complete the update without processing the invalidations. If the above is true, then, this breaks the following behavior specified in the documentation: "The tables added to a publication that publishes UPDATE and/or DELETE operations must have REPLICA IDENTITY defined. Otherwise, those operations will be disallowed on those tables.". Also, I think such updates won't be replicated on subscribers as there is no replica identity or primary key column. -- With Regards, Amit Kapila.