[ 
https://issues.apache.org/jira/browse/FLINK-19452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhengchao Shi closed FLINK-19452.
---------------------------------
    Resolution: Not A Bug

Not a problem.

> statistics of group by CDC data is always 1
> -------------------------------------------
>
>                 Key: FLINK-19452
>                 URL: https://issues.apache.org/jira/browse/FLINK-19452
>             Project: Flink
>          Issue Type: Bug
>          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
>    Affects Versions: 1.11.1
>            Reporter: Zhengchao Shi
>            Priority: Major
>             Fix For: 1.12.0
>
>
> When using CDC to compute count statistics, if only updates are made to the
> source table (a MySQL table), the value of count is always 1.
> {code:sql}
> CREATE TABLE orders (
>   order_number int,
>   product_id   int
> ) with (
>   'connector' = 'kafka-0.11',
>   'topic' = 'Topic',
>   'properties.bootstrap.servers' = 'localhost:9092',
>   'properties.group.id' = 'GroupId',
>   'scan.startup.mode' = 'latest-offset',
>   'format' = 'canal-json'
> );
> CREATE TABLE order_test (
>   order_number int,
>   order_cnt bigint
> ) WITH (
>   'connector' = 'print'
> );
> INSERT INTO order_test
> SELECT order_number, count(1) FROM orders GROUP BY order_number;
> {code}
> The “orders” table contains 3 records:
> ||order_number||product_id||
> |10001|1|
> |10001|2|
> |10001|3|
> Now update the orders table:
> {code:sql}
> update orders set product_id = 5 where order_number = 10001;
> {code}
> The output is:
> -D(10001,1)
> +I(10001,1)
> -D(10001,1)
> +I(10001,1)
> -D(10001,1)
> +I(10001,1)
> I think the final result should be +I(10001,3).
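
The closing comment does not say why this is not a bug. One possible reading, offered here only as an assumption, is that with 'scan.startup.mode' = 'latest-offset' the job never consumed the three original inserts, so the grouped count starts from 0 and each update arrives as a retract/add pair. The sketch below is a simplified Python model of a retracting count, not Flink's actual operator; the function name retract_count and the rule that retractions for unseen rows are ignored are illustrative assumptions.

```python
from collections import defaultdict


def retract_count(changelog):
    """Simplified model of COUNT(*) ... GROUP BY key over a changelog.

    changelog: iterable of (kind, key), kind in {"+I", "-U", "+U", "-D"}.
    Returns (emitted_rows, final_counts), where emitted_rows mimics what a
    print sink would show as (kind, key, count).
    """
    counts = defaultdict(int)
    emitted = []
    for kind, key in changelog:
        old = counts[key]
        delta = 1 if kind in ("+I", "+U") else -1
        # Assumption: a retraction for a row that was never seen is ignored.
        new = max(old + delta, 0)
        counts[key] = new
        if new == old:
            continue
        if old == 0:
            emitted.append(("+I", key, new))   # first row for this key
        elif new == 0:
            emitted.append(("-D", key, old))   # last row for this key removed
        else:
            emitted.append(("-U", key, old))   # retract the old result
            emitted.append(("+U", key, new))   # emit the updated result
    return emitted, dict(counts)


# Each updated row becomes an update-before / update-after pair.
updates = [("-U", 10001), ("+U", 10001)] * 3
inserts = [("+I", 10001)] * 3

only_updates, counts_a = retract_count(updates)
with_inserts, counts_b = retract_count(inserts + updates)

print(only_updates)   # counts in the emitted rows never exceed 1
print(counts_b)       # the count stays at 3 once the inserts were seen
```

Under these assumptions, the update-only run alternates between adding and deleting a count of 1, while the run that also sees the three inserts ends with a count of 3, which is the result the reporter expected.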



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
