Hi, Junrui

Thanks for your question. Yes, a Source can send event-time watermarks.
They can be emitted via SourceReaderContext, as mentioned in FLIP-467. For
example:
sourceReaderContext.emitWatermark(EventTimeExtension.EVENT_TIME_WATERMARK_DECLARATION.newWatermark(eventTime))
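
For readers who want to see the pattern outside Flink, here is a minimal
plain-Java sketch. It does not use the Flink API: the TrackingReader class
and the LongConsumer callback are hypothetical stand-ins for a source reader
and the emitWatermark call above, and the policy of emitting whenever the
observed event time advances is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongConsumer;

/** Plain-Java stand-in; no Flink classes are used. */
public class SourceWatermarkSketch {

    /** Hypothetical reader that tracks the highest event time seen so far. */
    static final class TrackingReader {
        // Stands in for the emitWatermark call on the reader context (assumption).
        private final LongConsumer watermarkEmitter;
        private long maxEventTime = Long.MIN_VALUE;

        TrackingReader(LongConsumer watermarkEmitter) {
            this.watermarkEmitter = watermarkEmitter;
        }

        /** Called once per record; emits only when event time advances. */
        void onRecord(long eventTime) {
            if (eventTime > maxEventTime) {
                maxEventTime = eventTime;
                watermarkEmitter.accept(eventTime);
            }
        }
    }

    /** Feeds the event times through a reader and returns the emitted watermarks. */
    static List<Long> emittedWatermarks(long... eventTimes) {
        List<Long> emitted = new ArrayList<>();
        TrackingReader reader = new TrackingReader(t -> emitted.add(t));
        for (long t : eventTimes) {
            reader.onRecord(t);
        }
        return emitted;
    }

    public static void main(String[] args) {
        // The out-of-order record (1200) does not move the watermark backwards.
        System.out.println(emittedWatermarks(1000L, 1500L, 1200L, 2000L));
        // prints [1000, 1500, 2000]
    }
}
```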

We will add this to the JavaDoc or the FLIP to make it clear. Thanks for the
question!

Best,
Xu Huang

Junrui Lee <jrlee....@gmail.com> wrote on Mon, Jan 6, 2025 at 11:01:

> Hi, Xu
>
> Thanks for your work. I noticed that in this FLIP, the event-time watermark
> is created and sent through a separate WatermarkGenerator. I would like to
> know whether a Source can also send event-time watermarks.
>
> Best,
> Junrui
>
> Xu Huang <huangxu.wal...@gmail.com> wrote on Mon, Jan 6, 2025 at 10:31:
>
> > Hi, Anil
> >
> > The watermark alignment mechanism in DataStream V1 may solve your
> > problem: it can align the watermarks of SourceSplits.
> > Please refer to the documentation [1][2].
> > This FLIP does not provide this feature; it is part of our future
> > planning.
> >
> > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-217%3A+Support+watermark+alignment+of+source+splits
> > [2] https://nightlies.apache.org/flink/flink-docs-release-1.20/docs/dev/datastream/event-time/generating_watermarks/#watermark-alignment
> >
> > Best,
> > Xu Huang
> >
> > Anil Dasari <dasaria...@myyahoo.com.invalid> wrote on Fri, Jan 3, 2025 at 22:41:
> >
> > >
> > > Hi Xu,
> > >
> > > Thanks for the response.
> > >
> > > I am currently using Spark Streaming to process data from Kafka in
> > > microbatches, writing each microbatch's data to a dedicated prefix in
> > > S3. Since Spark Streaming is lazy, it processes data only when a
> > > microbatch is created or triggered, leaving resources idle until then.
> > > This approach is not true streaming. To improve resource utilization
> > > and process data as it arrives, I am considering switching to Flink.
> > >
> > > Both Spark's Kafka source and Flink's Kafka source (with parallelism > 1)
> > > use multithreaded processing. However, Spark's Kafka source readers
> > > share a global microbatch epoch, ensuring a consistent view across
> > > readers. In contrast, Flink's Kafka source split readers do not share a
> > > global epoch or identifiers to divide the stream into chunks.
> > >
> > > My use case requires all split readers of a source to emit a special
> > > event with the same epoch at the same time.
> > >
> > > Thanks
> > >
> > > On Friday, January 3, 2025 at 06:21:34 AM PST, Xu Huang <huangxu.wal...@gmail.com> wrote:
> > >
> > > Hi, Anil
> > >
> > > I don't understand what you mean by a global watermark. Are you trying
> > > to have all Sources emit a special event with the same epoch at the
> > > same time? Is there a specific use case behind this question?
> > >
> > > Happy new year!
> > >
> > > Best,
> > > Xu Huang
> > >
> > > Anil Dasari <dasaria...@myyahoo.com.invalid> wrote on Fri, Jan 3, 2025 at 14:02:
> > >
> > > > Hello Xu,
> > > >
> > > > Happy new year. Thank you for FLIP-499 and FLIP-467.
> > > >
> > > > I tried to split (chunk) streams by fixed timestamp intervals and
> > > > route them to the appropriate destination. A few months ago, I
> > > > evaluated the following options and found that Flink currently lacks
> > > > direct support for a global watermark or timer that can share
> > > > consistent information (such as an epoch or identifier) across task
> > > > nodes.
> > > >
> > > > 1. Windowing: While promising, this approach requires record-level
> > > >    checks for flushing, as window data isn't accessible throughout
> > > >    the pipeline.
> > > > 2. Window + Trigger: This method buffers events until the trigger
> > > >    interval is reached, impacting real-time processing since events
> > > >    are processed only when the trigger fires.
> > > > 3. Processing Time: Processing time is localized to each file writer,
> > > >    causing inconsistencies across task managers.
> > > > 4. Watermark: Watermarks are specific to each source task.
> > > >    Additionally, the initial watermark (before the first event) is
> > > >    not epoch-based, leading to further challenges.
> > > >
> > > > Would global watermarks address this use case? If not, could this use
> > > > case align with any of the proposed FLIPs?
> > > >
> > > > Thanks in advance.
> > > >
> > > > On Thursday, January 2, 2025 at 09:06:31 PM PST, Xu Huang <huangxu.wal...@gmail.com> wrote:
> > > >
> > > >  Hi Devs,
> > > >
> > > > Weijie Guo and I would like to initiate a discussion about FLIP-499:
> > > > Support Event Time by Generalized Watermark in DataStream V2
> > > > <https://cwiki.apache.org/confluence/display/FLINK/FLIP-499%3A+Support+Event+Time+by+Generalized+Watermark+in+DataStream+V2>
> > > > [1].
> > > >
> > > > Event time is a fundamental feature of Flink that has been widely
> > > > adopted. For instance, the Window operator can determine whether to
> > > > trigger a window based on event time, and users can register timers
> > > > using event time. FLIP-467 [2] introduces the Generalized Watermark
> > > > in DataStream V2, enabling users to define specific events that can
> > > > be emitted from a source or other operators, propagated along the
> > > > streams, received by downstream operators, and aligned during
> > > > propagation. Within this framework, the traditional (event-time)
> > > > Watermark can be viewed as a special instance of the Generalized
> > > > Watermark already provided by Flink.
> > > >
> > > > To make it easy for users to use event time in DataStream V2, this
> > > > FLIP will implement an event time extension in DataStream V2 based
> > > > on the Generalized Watermark.
> > > >
> > > > For more details, please refer to FLIP-499 [1]. We look forward to
> > > > your feedback.
> > > >
> > > > Best,
> > > >
> > > > Xu Huang
> > > >
> > > > [1] https://cwiki.apache.org/confluence/x/pQz0Ew
> > > >
> > > > [2] https://cwiki.apache.org/confluence/x/oA6TEg
> > > >
> > >
> >
>
