On Thu, Aug 29, 2024 at 8:44 AM Masahiko Sawada <sawada.m...@gmail.com> wrote:
>
> On Wed, Aug 28, 2024 at 1:06 AM Amit Kapila <amit.kapil...@gmail.com> wrote:
> >
> > On Mon, May 20, 2024 at 1:49 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> > >
> > > As Euler mentioned earlier, I think it's a decision not to replicate
> > > generated columns because we don't know the target table on the
> > > subscriber has the same expression and there could be locale issues
> > > even if it looks the same. I can see that a benefit of this proposal
> > > would be to save cost to compute generated column values if the user
> > > wants the target table on the subscriber to have exactly the same data
> > > as the publisher's one. Are there other benefits or use cases?
> > >
> >
> > The cost is one but the other is the user may not want the data to be
> > different based on volatile functions like timeofday()
>
> Shouldn't the generation expression be immutable?
>

Yes, I missed that point.

> > or the table on
> > subscriber won't have the column marked as generated.
>
> Yeah, it would be another use case.
>

Right, apart from that I am not aware of other use cases. If they have,
I would request Euler or Rajendra to share any other use case.

> > Now, considering
> > such use cases, is providing a subscription-level option a good idea
> > as the patch is doing? I understand that this can serve the purpose
> > but it could also lead to having the same behavior for all the tables
> > in all the publications for a subscription which may or may not be
> > what the user expects. This could lead to some performance overhead
> > (due to always sending generated columns for all the tables) for cases
> > where the user needs it only for a subset of tables.
>
> Yeah, it's a downside and I think it's less flexible. For example, if
> users want to send both tables with generated columns and tables
> without generated columns, they would have to create at least two
> subscriptions.
>

Agreed and that would consume more resources.

> Also, they would have to include a different set of
> tables to two publications.
>
> >
> > I think we should consider it as a table-level option while defining
> > publication in some way. A few ideas could be: (a) We ask users to
> > explicitly mention the generated column in the columns list while
> > defining publication. This has a drawback such that users need to
> > specify the column list even when all columns need to be replicated.
> > (b) We can have some new syntax to indicate the same like: CREATE
> > PUBLICATION pub1 FOR TABLE t1 INCLUDE GENERATED COLS, t2, t3, t4
> > INCLUDE ..., t5;. I haven't analyzed the feasibility of this, so there
> > could be some challenges but we can at least investigate it.
>
> I think we can create a publication for a single table, so what we can
> do with this feature can be done also by the idea you described below.
>
> > Yet another idea is to keep this as a publication option
> > (include_generated_columns or publish_generated_columns) similar to
> > "publish_via_partition_root". Normally, "publish_via_partition_root"
> > is used when tables on either side have different partition
> > hierarchies which is somewhat the case here.
>
> It sounds more useful to me.
>

Fair enough. Let's see if anyone else has any preference among the
proposed methods or can think of a better way.

--
With Regards,
Amit Kapila.