[ 
https://issues.apache.org/jira/browse/FLINK-18835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174029#comment-17174029
 ] 

Jark Wu commented on FLINK-18835:
---------------------------------

Usually, the result of a GROUP BY query is written to an upsert sink 
(e.g. MySQL, HBase), and the upsert sink keeps only the latest record for each key.
If you don't want to see the intermediate updates, you can use a window aggregation 
instead, which produces an append-only result. 
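
To illustrate the upsert behaviour described above, here is a minimal sketch in
plain Java with no Flink dependency. It is a simplified model, not Flink's sink
implementation: the class and method names (UpsertSketch, Change, materialize)
are hypothetical, and the counts are made up. It only assumes what
toRetractStream is documented to emit, namely a boolean flag per record where
true means add and false means retract.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of how a keyed (upsert-style) sink collapses a changelog.
class UpsertSketch {

    /** One changelog message: add == true is an insert/update, false a retraction. */
    static final class Change {
        final boolean add;
        final String key;   // the GROUP BY key, e.g. "scanType,scanSite,cmtInf"
        final long count;   // the aggregated value for that key

        Change(boolean add, String key, long count) {
            this.add = add;
            this.key = key;
            this.count = count;
        }
    }

    /** Applies every message in order and returns the final table contents. */
    static Map<String, Long> materialize(List<Change> changes) {
        Map<String, Long> table = new LinkedHashMap<>();
        for (Change c : changes) {
            if (c.add) {
                table.put(c.key, c.count);      // overwrite any previous row for the key
            } else {
                table.remove(c.key, c.count);   // drop the row only if it is the retracted one
            }
        }
        return table;
    }

    public static void main(String[] args) {
        List<Change> changes = new ArrayList<>();
        changes.add(new Change(true, "97,14,24", 1));   // first input row for the group
        changes.add(new Change(false, "97,14,24", 1));  // old aggregate is retracted
        changes.add(new Change(true, "97,14,24", 2));   // updated aggregate is emitted
        // Three messages were printed for one group, but the table holds one row.
        System.out.println(changes.size() + " messages -> " + materialize(changes));
        // prints: 3 messages -> {97,14,24=2}
    }
}
```

Printing the stream shows every message, so the same group key appears several
times; a sink keyed on the group fields ends up with a single row per key.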

> sql using group by, duplicated group field appears
> --------------------------------------------------
>
>                 Key: FLINK-18835
>                 URL: https://issues.apache.org/jira/browse/FLINK-18835
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Planner
>    Affects Versions: 1.11.1
>            Reporter: YHF
>            Priority: Critical
>         Attachments: SumAnalysis.java
>
>
> The data source is Kafka. I create a temporary view, group by (fieldA, fieldB) 
> using SQL,
> then convert the result table to a DataStream using toRetractStream, and 
> print the result.
> I find duplicated (fieldA, fieldB) groups in the output;
> see the attachment for the code.
> The query groups by (scanType, scanSite, cmtInf), but the result is below:
> (true,Otm{, scanType=97, scanSite=14, cmtInf=24,jp=1.000000000000000000, 
> db=0E-18, dbjp=1.000000000000000000, pjWei=27.070000000000000000, 
> dbWei=0E-18, mintime=2020-07-29 11:33:57.679, maxtime=2020-07-29 
> 11:33:57.679})
> 3> (true,Otm{, scanType=97, scanSite=14, cmtInf=24,jp=1.000000000000000000, 
> db=0E-18, dbjp=1.000000000000000000, pjWei=27.070000000000000000, 
> dbWei=0E-18, mintime=2020-07-29 11:33:57.679, maxtime=2020-07-29 
> 11:33:57.679})



--
This message was sent by Atlassian Jira
(v8.3.4#803005)