Hello XU,Happy new year. Thank you for FLIP-499 and FLIP-467. I tried to split/chunk streams based by fixed timestamp intervals and route them to the appropriate destination. A few months ago, I evaluated the following options and found that Flink currently lacks direct support for a global watermark or timer that can share consistent information (such as an epoch or identifier) across task nodes. 1. Windowing: While promising, this approach requires record-level checks for flushing, as window data isn't accessible throughout the pipeline. 2. Window + Trigger: This method buffers events until the trigger interval is reached, impacting real-time processing since events are processed only when the trigger fires. 3. Processing Time: Processing time is localized to each file writer, causing inconsistencies across task managers. 4. Watermark: Watermarks are specific to each source task. Additionally, the initial watermark (before the first event) is not epoch-based, leading to further challenges. Would global watermarks address this use case? If not, could this use case align with any of the proposed FLIPs Thanks in advance.
On Thursday, January 2, 2025 at 09:06:31 PM PST, Xu Huang <huangxu.wal...@gmail.com> wrote: Hi Devs, Weijie Guo and I would like to initiate a discussion about FLIP-499: Support Event Time by Generalized Watermark in DataStream V2 <https://cwiki.apache.org/confluence/display/FLINK/FLIP-499%3A+Support+Event+Time+by+Generalized+Watermark+in+DataStream+V2> [1]. Event time is a fundamental feature of Flink that has been widely adopted. For instance, the Window operator can determine whether to trigger a window based on event time, and users can register timer using the event time. FLIP-467[2] introduces the Generalized Watermark in DataStream V2, enabling users to define specific events that can be emitted from a source or other operators, propagate along the streams, received by downstream operators, and aligned during propagation. Within this framework, the traditional (event-time) Watermark can be viewed as a special instance of the Generalized Watermark already provided by Flink. To make it easy for users to use event time in DataStream V2, this FLIP will implement event time extension in DataStream V2 based on Generalized Watermark. For more details, please refer to FLIP-499 [1]. We look forward to your feedback. Best, Xu Huang [1] https://cwiki.apache.org/confluence/x/pQz0Ew [2] https://cwiki.apache.org/confluence/x/oA6TEg