On Wednesday, April 27, 2022 12:56 PM From: Amit Kapila <amit.kapil...@gmail.com> wrote: > On Tue, Apr 26, 2022 at 4:00 AM Tomas Vondra > <tomas.von...@enterprisedb.com> wrote: > > > > On 4/25/22 17:48, Alvaro Herrera wrote: > > > > > The desired result on subscriber is: > > > > > > table uno; > > > a │ b │ c > > > ────┼───┼─── > > > 1 │ 2 │ > > > -1 │ │ 4 > > > > > > > > > Thoughts? > > > > > > > I'm not quite sure which of the two behaviors is more "desirable". In a > > way, it's somewhat similar to publish_as_relid, which is also calculated > > not considering which of the row filters match? > > > > Right, or in other words, we check all publications to decide it and > similar is the case for publication actions which are also computed > independently for all publications. > > > But maybe you're right and it should behave the way you propose ... the > > example I have in mind is a use case replicating table with two types of > > rows - sensitive and non-sensitive. For sensitive, we replicate only > > some of the columns, for non-sensitive we replicate everything. Which > > could be implemented as two publications > > > > create publication sensitive_rows > > for table t (a, b) where (is_sensitive); > > > > create publication non_sensitive_rows > > for table t where (not is_sensitive); > > > > But the way it's implemented now, we'll always replicate all columns, > > because the second publication has no column list. > > > > Changing this to behave the way you expect would be quite difficult, > > because at the moment we build a single OR expression from all the row > > filters. We'd have to keep the individual expressions, so that we can > > build a column list for each of them (in order to ignore those that > > don't match). > > > > We'd have to remove various other optimizations - for example we can't > > just discard row filters if we found "no_filter" publication. > > > > I don't think that is the right way. We need some way to combine > expressions and I feel the current behavior is sane. I mean to say > that even if there is one publication that has no filter (column/row), > we should publish all rows with all columns. Now, as mentioned above > combining row filters or column lists for all publications appears to > be consistent with what we already do and seems correct behavior to > me. > > To me, it appears that the method used to decide whether a particular > table is published or not is also similar to what we do for row > filters or column lists. Even if there is one publication that > publishes all tables, we consider the current table to be published > irrespective of whether other publications have published that table > or not. > > > Or more > > precisely, we'd have to consider column lists too. > > > > In other words, we'd have to merge pgoutput_column_list_init into > > pgoutput_row_filter_init, and then modify pgoutput_row_filter to > > evaluate the row filters one by one, and build the column list. > > > > Hmm, I think even if we want to do something here, we also need to > think about how to achieve similar behavior for initial tablesync > which will be more tricky.
I think it could be difficult to make the initial tablesync behave the same. Currently, we make a "COPY" command to do the table sync, I am not sure how to change the "COPY" query to achieve the expected behavior here. BTW, For the use case mentioned here: """ replicating table with two types of rows - sensitive and non-sensitive. For sensitive, we replicate only some of the columns, for non-sensitive we replicate everything. """ One approach to do this is to create two subscriptions and two publications which seems a workaround. ----- create publication uno for table uno (a, b) where (a > 0); create publication dos for table uno (a, c) where (a < 0); create subscription sub_uno connection 'port=55432 dbname=alvherre' publication uno; create subscription sub_dos connection 'port=55432 dbname=alvherre' publication dos; ----- Best regards, Hou zj