Hi,

You are right. We can make use of it to do a soft delete.
But there will be problems in other cases, for example, when retract messages
have to match the previous row by the whole row. I opened a jira[1] about
this problem. Thanks for bringing up this discussion.

[1] https://issues.apache.org/jira/browse/FLINK-10188

Best, Hequn
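
Since soft delete keeps coming up in this thread, here is a minimal,
self-contained sketch (plain Scala, no Flink dependencies) of how a sink
could turn retract messages into soft deletes keyed by article_id. The field
layout (article_id, PU, update_timestamp) follows the example quoted below;
the state map and helper names are my own assumptions, not Flink API.

```scala
// Sketch: handle (isAdd, row) messages from a retract stream.
// Instead of physically deleting on retract, keep the row, flip a
// deleted flag, and take the retract message's (newer) timestamp so a
// downstream ETL program can see that the row changed.
object SoftDeleteSketch {
  case class Row(articleId: String, pu: Long, updateTimestamp: String)
  case class Stored(row: Row, deleted: Boolean)

  // One entry per key; entries are never physically removed.
  val state = scala.collection.mutable.Map[String, Stored]()

  // msg is (isAdd, row): true = add/new row, false = retract row.
  def handle(msg: (Boolean, Row)): Unit = msg match {
    case (true, row) =>
      // Upsert by key: a new row for the same key overwrites the old one.
      state(row.articleId) = Stored(row, deleted = false)
    case (false, row) =>
      // Soft delete by key: keep the stored values but mark the row
      // deleted and bump update_timestamp to the retract message's time.
      state.get(row.articleId).foreach { s =>
        state(row.articleId) =
          Stored(s.row.copy(updateTimestamp = row.updateTimestamp), deleted = true)
      }
  }
}
```

Feeding it the add/retract pair from the example (add at ...10.286, retract
at ...10.386) leaves the row in state with deleted = true and the retract
message's newer timestamp, which is exactly the behavior Henry asks for.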

On Tue, Aug 21, 2018 at 12:34 PM, 徐涛 <happydexu...@gmail.com> wrote:

> Hi Hequn,
> Another question: in some cases I think updating the timestamp of the
> retract row is reasonable. For example, a user may not want a hard delete
> but a soft delete, so I write code so that when the retract row comes I
> only do the soft delete; but I want the update_timestamp to be different so
> the ETL program can know that this row has changed.
>
>
>     For example, if the value is updated from 1 to 2,
>
> previous row:  add (a, 1, 2018-08-20 20:18:10.286)
> retract row: delete (a, 1, 2018-08-20 20:18:10.386)
> new row: add (a, 2, 2018-08-20 20:18:10.486)
>
>
> On Aug 21, 2018, at 12:25 PM, Hequn Cheng <chenghe...@gmail.com> wrote:
>
> Hi Henry,
>
> You are right that, in MySQL, SYSDATE returns the time at which it
> executes, while LOCALTIMESTAMP returns a constant time that indicates the
> time at which the statement began to execute.
> But other database systems don't seem to have this constraint (correct me
> if I'm wrong). Sometimes we don't have to follow MySQL.
>
> Best, Hequn
>
> On Tue, Aug 21, 2018 at 10:21 AM, 徐涛 <happydexu...@gmail.com> wrote:
>
>> Hi Hequn,
>> Maybe I did not express it clearly. I mean that if only the
>> update_timestamp of the incremental data is updated, it is not enough.
>> The SQL expresses the idea that all the timestamps in the table are the
>> same, but actually each row in the table may be different. It is a bit
>> weird.
>>
>> Best, Henry
>>
>>
>>
>> On Aug 21, 2018, at 10:09 AM, Hequn Cheng <chenghe...@gmail.com> wrote:
>>
>> Hi Henry,
>>
>> If you upsert by key 'article_id', the result is correct, i.e., the
>> result is (a, 2, 2018-08-20 20:18:10.486). What do you think?
>>
>> Best, Hequn
>>
>>
>>
>> On Tue, Aug 21, 2018 at 9:44 AM, 徐涛 <happydexu...@gmail.com> wrote:
>>
>>> Hi Hequn,
>>> However, is it semantically correct? Because the SQL result is not
>>> equal to the bounded table.
>>>
>>>
>>> On Aug 20, 2018, at 8:34 PM, Hequn Cheng <chenghe...@gmail.com> wrote:
>>>
>>> Hi Henry,
>>>
>>> Both SQL queries output incrementally.
>>>
>>> However, there are some problems if you use a retract sink. You have to
>>> pay attention to the timestamp field, since its value is different each
>>> time. For example, if the value is updated from 1 to 2,
>>>
>>> previous row:  add (a, 1, 2018-08-20 20:18:10.286)
>>> retract row: delete (a, 1, 2018-08-20 20:18:10.386)
>>> new row: add (a, 2, 2018-08-20 20:18:10.486)
>>>
>>> the retract row is different from the previous row because of the
>>> timestamp field.
>>>
>>> Of course, this problem should be fixed later.
>>>
>>> Best, Hequn
>>>
>>> On Mon, Aug 20, 2018 at 6:43 PM, 徐涛 <happydexu...@gmail.com> wrote:
>>>
>>>> Hi All,
>>>> Like the following code: if I use a retract stream, I think Flink is
>>>> able to know which item is modified (if praise has 10000 items now,
>>>> when one item comes into the stream, only a very small amount of data
>>>> is written to the sink):
>>>>
>>>>     var praiseAggr = tableEnv.sqlQuery(
>>>>       s"SELECT article_id, hll(uid) as PU FROM praise GROUP BY article_id")
>>>>
>>>>     tableEnv.registerTable("finalTable", praiseAggr)
>>>>
>>>>     tableEnv.sqlUpdate(s"INSERT INTO sinkTableName SELECT * FROM finalTable")
>>>>
>>>>
>>>> But if I use the following SQL, adding a dynamic timestamp field:
>>>>
>>>>     var praiseAggr = tableEnv.sqlQuery(
>>>>       s"SELECT article_id, hll(uid) as PU, LOCALTIMESTAMP as update_timestamp
>>>>          FROM praise GROUP BY article_id")
>>>>
>>>> Is the whole table flushed to the sink, or will only the incremental
>>>> values be flushed to the sink? Why?
>>>>
>>>> Thanks,
>>>> Henry
>>>>
>>>>
>>>
>>>
>>
>>
>
>
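
The mismatch discussed above (the problem tracked in FLINK-10188) can be
reproduced with a small plain-Scala sketch: when a retract message must
match the previous row by full-row equality, a per-emission LOCALTIMESTAMP
column makes the match fail, while matching by the upsert key still works.
No Flink dependencies; all names here are illustrative, not Flink API.

```scala
// Sketch: two ways a sink-side table could apply a retract message.
object RetractMatchSketch {
  case class Row(articleId: String, pu: Long, updateTimestamp: String)

  // Retract by whole-row equality: remove a row only if an identical
  // row exists. A differing timestamp column means nothing matches and
  // the stale row is left behind.
  def retractByRow(table: List[Row], retract: Row): List[Row] = {
    val i = table.indexOf(retract)
    if (i >= 0) table.patch(i, Nil, 1) else table
  }

  // Retract by key: remove whatever row currently holds that key,
  // regardless of the timestamp column. This is why upserting by
  // 'article_id' gives the correct result.
  def retractByKey(table: List[Row], retract: Row): List[Row] =
    table.filterNot(_.articleId == retract.articleId)
}
```

With the thread's example, the previous row carries 2018-08-20 20:18:10.286
but the retract row carries ...10.386: retractByRow leaves the table
unchanged (the stale row survives), while retractByKey removes it.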
