[ https://issues.apache.org/jira/browse/FLINK-25471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17470277#comment-17470277 ]
Yao Zhang commented on FLINK-25471: ----------------------------------- Hi [~twalthr] , In Flink 1.13 it transforms sum to StreamGroupedReduceOperator even though it is executed in batch mode, which is exactly a bug. Conversely in Flink 1.14 it turns out to be BatchGroupedReduceOperator and it is correct. But there are still some issues for BatchGroupedReduceOperator. It will never output the latest accumulated value right before task manager shuts down because it only register an EventTimeTimer when the next element with the different key arrives. That is the reason why in batch mode some elements may be missing. I would like to fix it. > wrong result if table transfrom to DataStream then keyby sum in Batch Mode > --------------------------------------------------------------------------- > > Key: FLINK-25471 > URL: https://issues.apache.org/jira/browse/FLINK-25471 > Project: Flink > Issue Type: Bug > Components: Table SQL / API, Table SQL / Runtime > Affects Versions: 1.14.2 > Environment: mac book pro m1 > jdk 8 > scala 2.11 > flink 1.14.2 > idea 2020 > Reporter: zhangzh > Priority: Critical > Attachments: TableToDataStreamBatchWordCount-1.scala, pom.xml > > > I have a dataStream with 6 lines datas like this: > Row.of("Alice"), > Row.of("alice"), > Row.of("Bob"), > Row.of("lily"), > Row.of("lily"), > Row.of("lily") > then make it to table with one column "word" > then sql transform : select upper(word) from tmp_table > then change to dataStream > then keyby sum. > > in batch mode: > I think correct result is: > > (BOB,1) > > (ALICE,2) > > (LILY,3) > > but the result is : > > (BOB,1) > if i set different parallelism ,the result is different. > > the source file and pom is in attach. > is a bug? > pelease help me!!! > > > > > > > > > -- This message was sent by Atlassian Jira (v8.20.1#820001)