Hi Tawfik,

A mix of fast and slow streams in distributed scenarios causes the
watermark to advance too quickly, which leads to lost data and is a
headache in Flink.
Can't wait to read your research paper!

Best,
Ron

On Wed, Sep 6, 2023 at 14:46, Yun Tang <myas...@live.com> wrote:

> Hi Tawfik,
>
> Thanks for offering such a proposal, looking forward to your research
> paper!
>
> You could also request edit permission for the Flink Improvement
> Proposals [1] and create a new proposal if you want to contribute this to
> the community yourself.
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
>
> Best
> Yun Tang
> ________________________________
> From: yuxia <luoyu...@alumni.sjtu.edu.cn>
> Sent: Wednesday, September 6, 2023 12:31
> To: dev <dev@flink.apache.org>
> Subject: Re: Proposal for Implementing Keyed Watermarks in Apache Flink
>
> Hi, Tawfik Yasser.
> Thanks for the proposal.
> It sounds exciting. I can't wait to read the research paper for more details.
>
> Best regards,
> Yuxia
>
> ----- Original Message -----
> From: "David Morávek" <d...@apache.org>
> To: "dev" <dev@flink.apache.org>
> Sent: Tuesday, September 5, 2023, 4:36:51 PM
> Subject: Re: Proposal for Implementing Keyed Watermarks in Apache Flink
>
> Hi Tawfik,
>
> It's exciting to see any ongoing research that tries to push Flink forward!
>
> To get the discussion started, can you please share your paper with the
> community? Assessing the proposal without further context is tough.
>
> Best,
> D.
>
> On Mon, Sep 4, 2023 at 4:42 PM Tawfek Yasser Tawfek <tyas...@nu.edu.eg>
> wrote:
>
> > Dear Apache Flink Development Team,
> >
> > I hope this email finds you well. I am writing to propose an exciting new
> > feature for Apache Flink that has the potential to significantly enhance
> > its capabilities in handling unbounded streams of events, particularly in
> > the context of event-time windowing.
> >
> > As you may be aware, Apache Flink has been at the forefront of Big Data
> > Stream processing engines, leveraging windowing techniques to manage
> > unbounded event streams effectively. The accuracy of the results obtained
> > from these streams relies heavily on the ability to gather all relevant
> > input within a window. At the core of this process are watermarks, which
> > serve as unique timestamps marking the progression of events in time.
> >
> > However, our analysis has revealed a critical issue with the current
> > watermark generation method in Apache Flink. This method, which operates
> > at the input stream level, exhibits a bias towards faster sub-streams,
> > resulting in the unfortunate consequence of dropped events from slower
> > sub-streams. Our investigations showed that Apache Flink's conventional
> > watermark generation approach led to an alarming data loss of
> > approximately 33% when 50% of the keys around the median experienced
> > delays. This loss
> > further escalated to over 37% when 50% of random keys were delayed.
> >
> > In response to this issue, we have authored a research paper outlining a
> > novel strategy named "keyed watermarks" to address data loss and
> > substantially enhance data processing accuracy, achieving at least 99%
> > accuracy in most scenarios.
> >
> > Moreover, we have conducted comprehensive comparative studies to evaluate
> > the effectiveness of our strategy against the conventional watermark
> > generation method, specifically in terms of event-time tracking accuracy.
> >
> > We believe that implementing keyed watermarks in Apache Flink can greatly
> > enhance its performance and reliability, making it an even more valuable
> > tool for organizations dealing with complex, high-throughput data
> > processing tasks.
> >
> > We kindly request your consideration of this proposal. We would be eager
> > to discuss further details, provide the full research paper, or
> > collaborate closely to facilitate the integration of this feature into
> > Apache Flink.
> >
> > Thank you for your time and attention to this proposal. We look forward
> > to the opportunity to contribute to the continued success and evolution of
> > Apache Flink.
> >
> > Best Regards,
> >
> > Tawfik Yasser
> > Senior Teaching Assistant @ Nile University, Egypt
> > Email: tyas...@nu.edu.eg
> > LinkedIn: https://www.linkedin.com/in/tawfikyasser/
> >
>
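
For readers following the thread, the bias described in the original proposal can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the method from the paper (which had not been shared at this point) and not Flink's actual watermark implementation. The names `WINDOW`, `window_end`, and `count_dropped` are hypothetical, the watermark is simplified to "highest event time seen so far", and an event counts as dropped when its tumbling window has already closed.

```python
from collections import defaultdict

WINDOW = 10  # assumed tumbling-window size, in event-time units


def window_end(ts):
    """End (exclusive) of the tumbling window that contains event time ts."""
    return (ts // WINDOW + 1) * WINDOW


def count_dropped(events, keyed):
    """Count late events in a list of (key, event_time) given in arrival order.

    keyed=False: one watermark shared by all keys (stream-level watermark).
    keyed=True:  one watermark per key (the idea behind "keyed watermarks").
    """
    watermark = defaultdict(lambda: float("-inf"))
    dropped = 0
    for key, ts in events:
        scope = key if keyed else "global"
        # Simplification: the event is late if its window ended at or
        # before the current watermark for this scope.
        if window_end(ts) <= watermark[scope]:
            dropped += 1
        # Simplification: watermark = highest event time seen in this scope.
        watermark[scope] = max(watermark[scope], ts)
    return dropped


# Key "fast" runs ahead in event time; key "slow" lags two windows behind.
events = [
    ("fast", 21), ("slow", 1),
    ("fast", 25), ("slow", 5),
    ("fast", 32), ("slow", 12),
]

print("dropped with one shared watermark:", count_dropped(events, keyed=False))
print("dropped with per-key watermarks:  ", count_dropped(events, keyed=True))
```

In this toy input, every event of the slow key arrives after the shared watermark has passed its window, so all of them are dropped, while per-key watermarks drop none. The specific loss percentages quoted in the proposal come from the authors' own experiments and are not reproduced here.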
