[ https://issues.apache.org/jira/browse/FLINK-37327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17928447#comment-17928447 ]
Kevin Lam commented on FLINK-37327:
-----------------------------------

This Jira ticket is about a change to the deserialization (decoder) logic to support UPSERT (ChangelogMode=[I,UA,D]) mode.

I propose an option to skip emitting the UPDATE_BEFORE rows from a DynamicTableSource that uses DebeziumAvroFormat, so that downstream operators save the cost of processing them. This is fine when downstream operators do not need the UPDATE_BEFORE retractions for correctness. Timo Walther discusses this optimization in this presentation as well: https://youtu.be/iRlLaY-P6iE?si=2QTlrVnh-iEXzYPb&t=1176

On the serialization (encoder) side, there would be no change in behaviour other than the advertised ChangelogMode. As you said, "Flink encodes UPDATE_BEFORE and UPDATE_AFTER as DELETE and INSERT Debezium messages", so there are no UPDATE_BEFORE rows to skip.

Does that make sense?

> Debezium Avro Format: Add FormatOption to Optionally Skip emitting
> UPDATE_BEFORE Rows
> -------------------------------------------------------------------------------------
>
>                 Key: FLINK-37327
>                 URL: https://issues.apache.org/jira/browse/FLINK-37327
>             Project: Flink
>          Issue Type: Improvement
>          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>    Affects Versions: 1.20.1
>            Reporter: Kevin Lam
>            Priority: Minor
>              Labels: pull-request-available
>
> Add a Format Option to the Debezium Format to optionally skip emitting the
> UPDATE_BEFORE Rows when deserializing a Debezium message with op='u'.
> This is helpful for Flink SQL applications that want to operate in UPSERT
> (ChangelogMode=[I,UA,D]) mode and save on processing the UPDATE_BEFORE Rows
> since the downstream sinks can handle it.
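
To make the proposed decoder behaviour concrete, here is a rough, language-neutral sketch in Python. It is not the Flink implementation: the function name `decode_debezium` and the flag `skip_update_before` are hypothetical, and the envelope is simplified to a plain dict with `op`, `before` and `after` fields. It only illustrates that with the flag set, an op='u' message yields a single UPDATE_AFTER row, giving the [I,UA,D] changelog.

```python
from typing import Any, Dict, Iterator, Optional, Tuple

# A changelog row: (row kind, payload). The kinds mirror Flink's short
# strings: "+I" INSERT, "-U" UPDATE_BEFORE, "+U" UPDATE_AFTER, "-D" DELETE.
Row = Tuple[str, Optional[Dict[str, Any]]]


def decode_debezium(envelope: Dict[str, Any],
                    skip_update_before: bool = False) -> Iterator[Row]:
    """Turn one simplified Debezium envelope into changelog rows.

    `skip_update_before` is a hypothetical stand-in for the proposed
    format option; it only affects messages with op == "u".
    """
    op = envelope["op"]
    before = envelope.get("before")
    after = envelope.get("after")

    if op in ("c", "r"):        # create or snapshot read
        yield ("+I", after)
    elif op == "u":             # update
        if not skip_update_before:
            yield ("-U", before)
        yield ("+U", after)
    elif op == "d":             # delete
        yield ("-D", before)
    else:
        raise ValueError(f"unknown Debezium op: {op!r}")


update = {"op": "u", "before": {"id": 1, "v": "a"}, "after": {"id": 1, "v": "b"}}

# Default behaviour: the update expands into a retraction plus the new row.
full = list(decode_debezium(update))

# Proposed option: only the UPDATE_AFTER row is emitted.
upsert = list(decode_debezium(update, skip_update_before=True))
```

Inserts and deletes pass through unchanged in both modes; only the op='u' branch differs, which is why the encoder side needs no matching change.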