[ 
https://issues.apache.org/jira/browse/FLINK-38217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18014700#comment-18014700
 ] 

Sergey Nuyanzin commented on FLINK-38217:
-----------------------------------------

Merged as 
[788831b3f99c1328daff2eabb7e49ac019a4e924|https://github.com/apache/flink/commit/788831b3f99c1328daff2eabb7e49ac019a4e924]

> ChangelogNormalize unnecessarily emits updates for equal rows
> -------------------------------------------------------------
>
>                 Key: FLINK-38217
>                 URL: https://issues.apache.org/jira/browse/FLINK-38217
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Runtime
>    Affects Versions: 2.0.0, 1.19.3, 1.20.2
>            Reporter: Dawid Wysakowicz
>            Assignee: Sergey Nuyanzin
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.2.0
>
>
> DeduplicateFunctionHelper has logic for skipping rows if the previous emitted 
> row is the same as the new incoming record:
> * 
> https://github.com/apache/flink/blob/e36309a420c4c30ad98026c192881784edc58b7f/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/operators/deduplicate/utils/DeduplicateFunctionHelper.java#L120
> * 
> https://github.com/apache/flink/blob/e36309a420c4c30ad98026c192881784edc58b7f/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/operators/deduplicate/utils/DeduplicateFunctionHelper.java#L179
> Unfortunately it has a bug that it compares the entire Row including the 
> RowKind. The Row stored in the state has always {{INSERT}} RowKind whereas 
> the incoming records will have {{UPDATE_AFTER}} causing this optimisation to 
> never trigger.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to