Hello XU,Happy new year. Thank you for FLIP-499 and FLIP-467.
I tried to split/chunk streams based by fixed timestamp intervals and route 
them to the appropriate destination. A few months ago, I evaluated the 
following options and found that Flink currently lacks direct support for a 
global watermark or timer that can share consistent information (such as an 
epoch or identifier) across task nodes.
1. Windowing: While promising, this approach requires record-level checks for 
flushing, as window data isn't accessible throughout the pipeline.
2. Window + Trigger: This method buffers events until the trigger interval is 
reached, impacting real-time processing since events are processed only when 
the trigger fires.
3. Processing Time: Processing time is localized to each file writer, causing 
inconsistencies across task managers.
4. Watermark: Watermarks are specific to each source task. Additionally, the 
initial watermark (before the first event) is not epoch-based, leading to 
further challenges.
Would global watermarks address this use case? If not, could this use case 
align with any of the proposed FLIPs
Thanks in advance.

    On Thursday, January 2, 2025 at 09:06:31 PM PST, Xu Huang 
<huangxu.wal...@gmail.com> wrote:  
 
 Hi Devs,

Weijie Guo and I would like to initiate a discussion about FLIP-499:
Support Event Time by Generalized Watermark in DataStream V2
<https://cwiki.apache.org/confluence/display/FLINK/FLIP-499%3A+Support+Event+Time+by+Generalized+Watermark+in+DataStream+V2>
[1].

Event time is a fundamental feature of Flink that has been widely adopted.
For instance, the Window operator can determine whether to trigger a window
based on event time, and users can register timer using the event time.
FLIP-467[2] introduces the Generalized Watermark in DataStream V2, enabling
users to define specific events that can be emitted from a source or other
operators, propagate along the streams, received by downstream operators,
and aligned during propagation. Within this framework, the traditional
(event-time) Watermark can be viewed as a special instance of the
Generalized Watermark already provided by Flink.

To make it easy for users to use event time in DataStream V2, this FLIP
will implement event time extension in DataStream V2 based on Generalized
Watermark.

For more details, please refer to FLIP-499 [1]. We look forward to your
feedback.

Best,

Xu Huang

[1] https://cwiki.apache.org/confluence/x/pQz0Ew

[2] https://cwiki.apache.org/confluence/x/oA6TEg
  

Reply via email to