Link for Paimon LocalMerge Operator[1]

[1] https://paimon.apache.org/docs/master/maintenance/write-performance/#local-merging

On Tue, Feb 11, 2025 at 14:03, xiangyu feng <xiangyu...@gmail.com> wrote:

> Follow the above,
>
> "And for SinkWriter, the data structure to be processed should be fixed."
>
> I'm not very sure why the data structure of SinkWriter should be fixed.
> Can you elaborate the scenario here?
>
> "Is there a node or an operator to fill in the inconsistent field of
> Rowdata that passed from different Sources?"
>
> By `filling in the inconsistent field from different sources`, do you
> refer to implementations like the LocalMerge Operator [1] for Paimon? IMHO,
> this should not be included in the Sink Reuse. The merging behavior of
> multiple sources should be considered inside of the sink.
>
> Regards,
> Xiangyu Feng
>
> On Tue, Feb 11, 2025 at 13:46, xiangyu feng <xiangyu...@gmail.com> wrote:
>
>> Hi Yanquan,
>>
>> Thx for reply. IIUC, the schema of CatalogTable should contain all target
>> columns for sources. If not, a SQL validation exception should be raised
>> for planner.
>>
>> Regards,
>> Xiangyu Feng
>>
>> On Mon, Feb 10, 2025 at 16:25, Yanquan Lv <decq12y...@gmail.com> wrote:
>>
>>> Hi, Xiangyu. Thanks for driving this.
>>>
>>> I have a question to confirm:
>>> Considering the case that different Sources use different columns[1],
>>> will the Schema of CatalogTable[2] contain all target columns for Sources?
>>> And for SinkWriter, the data structure to be processed should be fixed.
>>> Is there a node or an operator to fill in the inconsistent field of Rowdata
>>> that passed from different Sources?
>>>
>>> [1]
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-506%3A+Support+Reuse+Multiple+Table+Sinks+in+Planner
>>> [2]
>>> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sourcessinks/#planning
>>>
>>> > On Feb 6, 2025, at 17:06, xiangyu feng <xiangyu...@gmail.com> wrote:
>>> >
>>> > Hi devs,
>>> >
>>> > I'm opening this thread to discuss FLIP-506: Support Reuse Multiple Table
>>> > Sinks in Planner[1].
>>> >
>>> > Currently if users want to partial-update a downstream table from multiple
>>> > source tables in one datastream, they would have to manually union all
>>> > source tables and add lots of "cast(null as string) as xxx" in Flink SQL.
>>> > This will make the SQL here hard to use and maintain.
>>> >
>>> > After discussing with Weijie Guo, we think that by supporting reuse sink
>>> > nodes in planner, the usability can be greatly improved in this case.
>>> >
>>> > Therefore, we propose to add a new option
>>> > *`table.optimizer.reuse-sink-enabled`* here to support this feature. More
>>> > details can be found in the FLIP.
>>> >
>>> > Looking forward to your feedback, thanks.
>>> >
>>> > [1]
>>> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-506%3A+Support+Reuse+Multiple+Table+Sinks+in+Planner
>>> >
>>> > Best regards,
>>> > Xiangyu Feng
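
For readers following the thread, the manual workaround described in the original proposal (union all source tables and pad the missing columns with "cast(null as string) as xxx") might look roughly like the query this small Python sketch generates. The table and column names (`src_a`, `src_b`, `id`, `name`, `address`) and the helper `build_padded_union` are hypothetical and only for illustration; the sketch also assumes every padded column is a STRING column, as in the example from the proposal. It is not taken from the FLIP.

```python
from typing import Dict, List


def build_padded_union(sink_columns: List[str], sources: Dict[str, List[str]]) -> str:
    """Build the hand-written UNION ALL query the thread describes.

    sink_columns: all columns of the target table, in order (hypothetical names).
    sources: maps each source table to the sink columns it can provide.
    Columns a source lacks are padded with CAST(NULL AS STRING), so this
    sketch assumes every padded column has type STRING.
    """
    selects = []
    for table, provided in sources.items():
        unknown = set(provided) - set(sink_columns)
        if unknown:
            raise ValueError(
                f"{table} provides columns not in the sink: {sorted(unknown)}"
            )
        # Keep the sink's column order; pad whatever this source cannot supply.
        projection = [
            col if col in provided else f"CAST(NULL AS STRING) AS {col}"
            for col in sink_columns
        ]
        selects.append(f"SELECT {', '.join(projection)} FROM {table}")
    return "\nUNION ALL\n".join(selects)


if __name__ == "__main__":
    print(
        build_padded_union(
            ["id", "name", "address"],
            {"src_a": ["id", "name"], "src_b": ["id", "address"]},
        )
    )
```

Every additional source or sink column adds another padded projection to each SELECT, which is the maintenance burden the proposal points at.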