On Wed, Apr 13, 2022 at 6:50 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Wed, Apr 13, 2022 at 2:38 PM Dilip Kumar <dilipbal...@gmail.com> wrote:
> >
> > On Tue, Apr 12, 2022 at 4:25 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> > >
> > > > The *initial* DDL replication is a different problem than DDL
> > > > replication. The former requires a snapshot to read the current
> > > > catalog data and build a CREATE command as part of the
> > > > subscription process. The subsequent DDLs in that object will be
> > > > handled by a different approach that is being discussed here.
> > > >
> > >
> > > I think they are not completely independent because of the current way
> > > to do initial sync followed by replication. The initial sync and
> > > replication need some mechanism to ensure that one of those doesn't
> > > overwrite the work done by the other. Now, the initial idea and patch
> > > can be developed separately but I think both the patches have some
> > > dependency.
> >
> > I agree with the point that their design can not be completely
> > independent. They have some logical relationship of what schema will
> > be copied by the initial sync and where is the exact boundary from
> > which we will start sending as replication. And suppose first we only
> > plan to implement the replication part then how the user will know
> > what all schema user has to create and what will be replicated using
> > DDL replication? Suppose the user takes a dump and copies all the
> > schema and then creates the subscription, then how we are we going to
> > handle the DDL concurrent to the subscription command?
> >
>
> Right, I also don't see how it can be done in the current
> implementation. So, I think even if we want to develop these two as
> separate patches they need to be integrated to make the solution
> complete.
Developing them separately would be faster, but yes, we will probably
need to integrate them at some point.

I think the initial DDL replication can be done while the relation's
state is SUBREL_STATE_INIT. That is, at the very beginning of table
synchronization, the sync worker copies the table schema somehow and
then starts the initial data copy. After that, the sync worker applies
DML/DDL changes while catching up, and the apply worker applies them
while streaming changes. We could probably make it optional whether to
copy the schema only, the data only, or both.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/