Thanks for the discussion, JING ZHANG. I like the first proposal since it is simple and consistent with the DataStream API. It would be helpful to add more documentation about the special late-event case in WindowAggregate. I also look forward to more flexible emit strategies later.
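
To make proposal (1) concrete, here is a minimal sketch of the defaulting logic it implies. It is plain Python that models the options as a dictionary, not Flink code. The helper name `resolve_emit_options` is hypothetical, and overriding any explicit `late-fire.enabled` value is an assumption on top of the proposal text.

```python
# Illustrative model only: plain Python, not Flink code.
# resolve_emit_options is a hypothetical helper, not a Flink API.

ALLOW_LATENESS = "table.exec.emit.allow-lateness"
LATE_FIRE_ENABLED = "table.exec.emit.late-fire.enabled"
LATE_FIRE_DELAY = "table.exec.emit.late-fire.delay"


def resolve_emit_options(user_options):
    """Return the effective emit options under proposal (1)."""
    effective = dict(user_options)
    if ALLOW_LATENESS in effective:
        # Setting allow-lateness alone switches late firing on
        # (assumption: this overrides an explicit user value).
        effective[LATE_FIRE_ENABLED] = "true"
        # The late-fire delay stays optional and defaults to 0 s,
        # i.e. one emission per late record.
        effective.setdefault(LATE_FIRE_DELAY, "0 s")
    return effective
```

With this model, a user who sets only `table.exec.emit.allow-lateness` to `1 h` ends up with late fire enabled and a `0 s` delay, while a user who sets nothing gets no late-fire options at all.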

Jark Wu <imj...@gmail.com> wrote on Fri, Jul 2, 2021 at 10:33 AM:

> Sorry, I made a typo above. I mean I prefer proposal (1), which
> only needs `table.exec.emit.allow-lateness` to be set to handle late events.
> `table.exec.emit.late-fire.delay` can be optional, with 0s as the default.
> `table.exec.state.ttl` will not affect window state anymore, so window
> state is still cleaned up accurately by watermark.
>
> We don't need to expose `table.exec.emit.late-fire.enabled` in the docs and
> can remove it in the next version.
>
> Best,
> Jark
>
> On Thu, 1 Jul 2021 at 21:20, Jark Wu <imj...@gmail.com> wrote:
>
> > Thanks Jing for bringing up this topic,
> >
> > The emit strategy configs are annotated as Experimental and are not
> > public in the documentation.
> > However, I see this as a very useful feature which many users are
> > looking for.
> > I have posted these configs in answer to many questions like "how to
> > handle late events in SQL".
> > Thus, I think it's time to make the configuration public and explicitly
> > document it. In the long term, we would like to propose an EMIT syntax
> > for SQL, but until then we can get more valuable feedback from users
> > when they are using the configs.
> >
> > Regarding the exposed configuration, I prefer proposal (2).
> > But it would be better not to expose `table.exec.emit.late-fire.enabled`
> > in the docs, and we can remove it in the next version.
> >
> > Best,
> > Jark
> >
> >
> > On Tue, 29 Jun 2021 at 11:09, JING ZHANG <beyond1...@gmail.com> wrote:
> >
> >> When WindowAggregate works on a changelog that contains update messages,
> >> an UPDATE BEFORE (UB) message may be dropped as a late message. [1]
> >>
> >> In order to handle late UB messages, users need to set *all* of the
> >> following 3 parameters:
> >>
> >> (1) enable late fire by setting
> >>
> >> table.exec.emit.late-fire.enabled : true
> >>
> >> (2) set per-record emit behavior for late records by setting
> >>
> >> table.exec.emit.late-fire.delay : 0 s
> >>
> >> (3) keep window state for extra time after the window is fired by setting
> >>
> >> table.exec.emit.allow-lateness : 1 h // or table.exec.state.ttl: 1h
> >>
> >>
> >> The solution has two disadvantages:
> >>
> >> (1) Users may not realize that UB messages may be dropped as late
> >> events, so they will not set the related parameters.
> >>
> >> (2) When users look for a way to solve the dropped UB messages
> >> problem, the current solution is a bit inconvenient for them because
> >> they need to set all 3 parameters. Besides, some of the configurations
> >> overlap in functionality.
> >>
> >>
> >> Now there are two proposals to simplify the 3 parameters a little.
> >>
> >> (1) Users only need to set table.exec.emit.allow-lateness (just like the
> >> behavior in DataStream, where users only need to set allow-lateness); the
> >> framework could automatically set `table.exec.emit.late-fire.enabled` to
> >> true and set `table.exec.emit.late-fire.delay` to 0s.
> >>
> >> And in a later version, we deprecate `table.exec.emit.late-fire.delay`
> >> and `table.exec.emit.late-fire.enabled`.
> >>
> >>
> >> (2) Users need to set `table.exec.emit.late-fire.enabled` to true and set
> >> `table.exec.state.ttl`; the framework could automatically set
> >> `table.exec.emit.late-fire.delay` to 0s.
> >>
> >> And in a later version, we deprecate `table.exec.emit.late-fire.delay`
> >> and `table.exec.emit.allow-lateness`.
> >>
> >>
> >> Please let me know what you think about the issue.
> >>
> >> Thank you.
> >>
> >> [1] https://issues.apache.org/jira/browse/FLINK-22781
> >>
> >>
> >> Best regards,
> >> JING ZHANG