Hi,

You are right. We can make use of it to do a soft delete. But there will be problems in other cases, for example, when a retract message has to match the whole previous row. I opened a JIRA issue [1] about this problem. Thanks for bringing up this discussion.
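As a rough, untested sketch of that soft-delete approach: the sink below consumes the retract stream and keys on article_id alone, so the changing timestamp never has to match the whole row. The praise_agg table, its columns, and the JDBC URL are assumptions made up for this example (it also assumes a MySQL driver on the classpath); only the RichSinkFunction API itself is Flink's.

    import java.sql.{Connection, DriverManager, Timestamp}

    import org.apache.flink.configuration.Configuration
    import org.apache.flink.streaming.api.functions.sink.RichSinkFunction
    import org.apache.flink.types.Row

    // Sketch only: soft-delete sink for a retract stream of (isAdd, row) pairs.
    // Assumed MySQL table: praise_agg(article_id VARCHAR PRIMARY KEY,
    //   pu BIGINT, deleted TINYINT, update_timestamp TIMESTAMP)
    class SoftDeleteSink(jdbcUrl: String)
        extends RichSinkFunction[(Boolean, Row)] {

      private var conn: Connection = _

      override def open(parameters: Configuration): Unit = {
        conn = DriverManager.getConnection(jdbcUrl)
      }

      override def invoke(value: (Boolean, Row)): Unit = {
        val (isAdd, row) = value
        val articleId = row.getField(0).toString
        val now = new Timestamp(System.currentTimeMillis())
        if (isAdd) {
          // Add message: upsert by key, refreshing update_timestamp
          // in the sink instead of in the query.
          val ps = conn.prepareStatement(
            "INSERT INTO praise_agg (article_id, pu, deleted, update_timestamp) " +
              "VALUES (?, ?, 0, ?) " +
              "ON DUPLICATE KEY UPDATE pu = VALUES(pu), deleted = 0, " +
              "update_timestamp = VALUES(update_timestamp)")
          ps.setString(1, articleId)
          // Assumes the aggregate value is a Long.
          ps.setLong(2, row.getField(1).asInstanceOf[Number].longValue())
          ps.setTimestamp(3, now)
          ps.executeUpdate()
          ps.close()
        } else {
          // Retract message: mark the row deleted instead of removing it,
          // and bump update_timestamp so downstream ETL sees the change.
          val ps = conn.prepareStatement(
            "UPDATE praise_agg SET deleted = 1, update_timestamp = ? " +
              "WHERE article_id = ?")
          ps.setTimestamp(1, now)
          ps.setString(2, articleId)
          ps.executeUpdate()
          ps.close()
        }
      }

      override def close(): Unit = {
        if (conn != null) conn.close()
      }
    }

Wired up as praiseAggr.toRetractStream[Row].addSink(new SoftDeleteSink(jdbcUrl)), the delete message becomes an UPDATE of the deleted flag, and update_timestamp is set by the sink, so no LOCALTIMESTAMP column is needed in the query at all.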
[1] https://issues.apache.org/jira/browse/FLINK-10188

Best, Hequn

On Tue, Aug 21, 2018 at 12:34 PM, 徐涛 <happydexu...@gmail.com> wrote:

> Hi Hequn,
> Another question: in some cases, I think updating the timestamp of the
> retract row is reasonable. For example, some users do not want a hard
> delete but a soft delete, so I write code such that when the retract row
> comes I only do the soft delete, but I want the update_timestamp to be
> different so the ETL program can know that this line has changed.
>
> For example, if the value is updated from 1 to 2,
>
> previous row: add (a, 1, 2018-08-20 20:18:10.286)
> retract row: delete (a, 1, 2018-08-20 20:18:10.386)
> new row: add (a, 2, 2018-08-20 20:18:10.486)
>
> On Aug 21, 2018, at 12:25 PM, Hequn Cheng <chenghe...@gmail.com> wrote:
>
> Hi Henry,
>
> You are right that, in MySQL, SYSDATE returns the time at which it
> executes, while LOCALTIMESTAMP returns a constant time that indicates the
> time at which the statement began to execute. But other database systems
> don't seem to have this constraint (correct me if I'm wrong). Sometimes
> we don't have to follow MySQL.
>
> Best, Hequn
>
> On Tue, Aug 21, 2018 at 10:21 AM, 徐涛 <happydexu...@gmail.com> wrote:
>
>> Hi Hequn,
>> Maybe I did not express it clearly. I mean that updating only the
>> update_timestamp of the incremental data is not enough. The SQL
>> expresses the idea "all the timestamps in the table are the same", but
>> actually each item in the table may be different. It is a bit weird.
>>
>> Best, Henry
>>
>> On Aug 21, 2018, at 10:09 AM, Hequn Cheng <chenghe...@gmail.com> wrote:
>>
>> Hi Henry,
>>
>> If you upsert by key 'article_id', the result is correct, i.e., the
>> result is (a, 2, 2018-08-20 20:18:10.486). What do you think?
>>
>> Best, Hequn
>>
>> On Tue, Aug 21, 2018 at 9:44 AM, 徐涛 <happydexu...@gmail.com> wrote:
>>
>>> Hi Hequn,
>>> However, is it semantically correct? Because the SQL result is not
>>> equal to the bounded table.
>>>
>>> On Aug 20, 2018, at 8:34 PM, Hequn Cheng <chenghe...@gmail.com> wrote:
>>>
>>> Hi Henry,
>>>
>>> Both SQLs output incrementally.
>>>
>>> However, there are some problems if you use a retract sink. You have
>>> to pay attention to the timestamp field, since its value is different
>>> each time. For example, if the value is updated from 1 to 2,
>>>
>>> previous row: add (a, 1, 2018-08-20 20:18:10.286)
>>> retract row: delete (a, 1, 2018-08-20 20:18:10.386)
>>> new row: add (a, 2, 2018-08-20 20:18:10.486)
>>>
>>> The retract row is different from the previous row because of the time
>>> field.
>>>
>>> Of course, this problem should be fixed later.
>>>
>>> Best, Hequn
>>>
>>> On Mon, Aug 20, 2018 at 6:43 PM, 徐涛 <happydexu...@gmail.com> wrote:
>>>
>>>> Hi All,
>>>> With the following code, if I use a retract stream, I think Flink is
>>>> able to know which item is modified (if praise has 10000 items now,
>>>> when one item comes into the stream, only a very small amount of data
>>>> is written to the sink):
>>>>
>>>> var praiseAggr = tableEnv.sqlQuery(
>>>>   s"SELECT article_id, hll(uid) as PU FROM praise GROUP BY article_id")
>>>>
>>>> tableEnv.registerTable("finalTable", praiseAggr)
>>>>
>>>> tableEnv.sqlUpdate(s"INSERT INTO sinkTableName SELECT * FROM finalTable")
>>>>
>>>> But if I use the following SQL, adding a dynamic timestamp field:
>>>>
>>>> var praiseAggr = tableEnv.sqlQuery(
>>>>   s"SELECT article_id, hll(uid) as PU, LOCALTIMESTAMP as update_timestamp
>>>>    FROM praise GROUP BY article_id")
>>>>
>>>> Is the whole table flushed to the sink, or only the incremental value?
>>>> Why?
>>>>
>>>> Thanks,
>>>> Henry
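For completeness, here is a minimal, self-contained version of the job asked about above (against the 1.6-era Scala Table API), showing that the group-by result is emitted incrementally per key rather than as a full table flush. The example source data, the object name, and COUNT(uid) standing in for the hll(uid) UDAF are all assumptions made for this sketch:

    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.table.api.TableEnvironment
    import org.apache.flink.table.api.scala._
    import org.apache.flink.types.Row

    object PraiseJob {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment
        val tableEnv = TableEnvironment.getTableEnvironment(env)

        // Assumed source: a stream of (article_id, uid) praise events.
        val praises = env.fromElements(("a", 1L), ("a", 2L), ("b", 3L))
        tableEnv.registerDataStream("praise", praises, 'article_id, 'uid)

        // COUNT(uid) stands in for the hll(uid) UDAF used in the thread.
        val praiseAggr = tableEnv.sqlQuery(
          "SELECT article_id, COUNT(uid) AS PU FROM praise GROUP BY article_id")

        // Each event yields only an incremental (retract, add) pair for its
        // article_id; the whole table is never re-emitted.
        praiseAggr.toRetractStream[Row].print()

        env.execute("praise aggregation")
      }
    }

Adding LOCALTIMESTAMP AS update_timestamp to the SELECT does not change this incremental behavior; it only makes the retract row carry a different timestamp from the previously emitted row, which is exactly the mismatch FLINK-10188 describes.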