Thanks Junrui for your question.

> I wonder if the current interface design supports the
> future adaptation for batch job recovery

I noticed that FLIP-383 supports batch job recovery by introducing
some new APIs. These APIs can also be added to the Tier-related
interfaces to support that feature. Since those modifications are not
directly related to the current integration work, and the integration
does not conflict with batch job recovery, I propose that this FLIP
not include them. Moreover, because the Tier interfaces are not
currently public, the new methods can be added directly later
if necessary.
WDYT?

Best,
Yuxin


On Thu, Jun 6, 2024 at 11:02, Junrui Lee <jrlee....@gmail.com> wrote:

> Thanks Yuxin for driving this proposal!
>
> I have a question about public interface compatibility in the context
> of FLIP-459. Since FLIP-383, which will be released in Flink 1.20, adds
> support for batch job recovery from JobMaster failures, I wonder whether
> the current interface design supports future adaptation for batch job recovery.
>
> Looking forward to your feedback.
>
> Best,
> Junrui.
>
> On Wed, Jun 5, 2024 at 10:13, weijie guo <guoweijieres...@gmail.com> wrote:
>
> > Thanks Yuxin for the proposal!
> >
> > When we first proposed Hybrid Shuffle, I wanted to support pluggable
> > storage tiers in the future. However, limited by the architecture of the
> > legacy Hybrid Shuffle at that time, this idea was never realized. The
> > new architecture abstracts the tier nicely, and now it's time to introduce
> > support for external storage.
> >
> > Big +1 for this one!
> >
> > Best regards,
> >
> > Weijie
> >
> >
> > On Wed, Jun 5, 2024 at 00:08, rexxiong <rexxi...@apache.org> wrote:
> >
> > > Thanks Yuxin for the proposal. +1. As a member of the Apache Celeborn
> > > community, I am very excited about the integration of Flink's Hybrid
> > > Shuffle with Apache Celeborn. The whole design of CIP-6 looks good to me.
> > > I am looking forward to this integration.
> > >
> > > Thanks,
> > > Jiashu Xiong
> > >
> > > On Tue, Jun 4, 2024 at 16:47, Ethan Feng <ethanf...@apache.org> wrote:
> > >
> > > > +1 for this proposal.
> > > >
> > > > After internally reviewing the prototype of CIP-6, I believe it would
> > > > improve performance and stability for Flink users using Celeborn.
> > > >
> > > > I look forward to seeing this feature released to the community.
> > > >
> > > > As I come from the Celeborn community, I hope more users will try
> > > > Celeborn for their Flink batch jobs.
> > > >
> > > > Thanks,
> > > > Ethan Feng
> > > >
> > > > > On Tue, Jun 4, 2024 at 16:34, Yuxin Tan <tanyuxinw...@gmail.com> wrote:
> > > > >
> > > > > Hi, Venkatakrishnan,
> > > > >
> > > > > > Thanks for joining the discussion. We appreciate your interest
> > > > > > in contributing to the work. Once the FLIP and CIP proposals
> > > > > > have been approved, we will create JIRA tickets in the Flink
> > > > > > and Celeborn projects. Please feel free to take a look at the
> > > > > > tickets and pick up any that interest you.
> > > > >
> > > > > Best,
> > > > > Yuxin
> > > > >
> > > > >
> > > > > > On Fri, May 31, 2024 at 23:11, Venkatakrishnan Sowrirajan <vsowr...@asu.edu> wrote:
> > > > >
> > > > > > > Thanks for this FLIP. We are also interested in learning about and
> > > > > > > contributing to the hybrid shuffle integration with Celeborn for
> > > > > > > batch executions.
> > > > > >
> > > > > > > On Tue, May 28, 2024, 7:07 PM Yuxin Tan <tanyuxinw...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi, Xintong,
> > > > > > >
> > > > > > > > > I think we can also publish the prototype code so the
> > > > > > > > > community can better understand and help with it.
> > > > > > >
> > > > > > > > OK, I agree. I will prepare and publish the code soon.
> > > > > > >
> > > > > > > Rui,
> > > > > > >
> > > > > > > > > Kind reminder: the image of CIP-6[1] cannot be loaded.
> > > > > > >
> > > > > > > Thanks for the reminder. I've updated the images.
> > > > > > >
> > > > > > >
> > > > > > > Best,
> > > > > > > Yuxin
> > > > > > >
> > > > > > >
> > > > > > > > On Wed, May 29, 2024 at 09:33, Rui Fan <1996fan...@gmail.com> wrote:
> > > > > > >
> > > > > > > > Thanks Yuxin for driving this proposal!
> > > > > > > >
> > > > > > > > > Kind reminder: the image of CIP-6[1] cannot be loaded.
> > > > > > > >
> > > > > > > > [1]
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > >
> >
> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/CELEBORN/CIP-6*Support*Flink*hybrid*shuffle*integration*with*Apache*Celeborn__;KysrKysrKys!!IKRxdwAv5BmarQ!ZRTc1aUSYMDBazuIwlet1Dzk2_DD9qKTgoDLH9jSwAVLgwplcuId_8JoXkH0i7AeWxKWXkL0sxM3AeW-H9OJ6v9uGw$
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Rui
> > > > > > > >
> > > > > > > > > On Wed, May 29, 2024 at 9:03 AM Xintong Song <tonysong...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > +1 for this proposal.
> > > > > > > > >
> > > > > > > > > > We have been prototyping this feature internally at Alibaba
> > > > > > > > > > for a couple of months. Yuxin, I think we can also publish
> > > > > > > > > > the prototype code so the community can better understand
> > > > > > > > > > and help with it.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > >
> > > > > > > > > Xintong
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > On Tue, May 28, 2024 at 8:34 PM Yuxin Tan <tanyuxinw...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi all,
> > > > > > > > > >
> > > > > > > > > > > I would like to start a discussion on FLIP-459: Support Flink
> > > > > > > > > > > hybrid shuffle integration with Apache Celeborn[1]. Flink
> > > > > > > > > > > hybrid shuffle supports transitions between memory, disk, and
> > > > > > > > > > > remote storage to improve performance and job stability.
> > > > > > > > > > > Concurrently, Apache Celeborn provides a stable, performant,
> > > > > > > > > > > scalable remote shuffle service. This integration proposal
> > > > > > > > > > > aims to harness the benefits of both hybrid shuffle and
> > > > > > > > > > > Celeborn simultaneously.
> > > > > > > > > >
> > > > > > > > > > Note that this proposal has two parts.
> > > > > > > > > > 1. The Flink-side modifications are in FLIP-459[1].
> > > > > > > > > > 2. The Celeborn-side changes are in CIP-6[2].
> > > > > > > > > >
> > > > > > > > > > > Looking forward to everyone's feedback and suggestions. Thank you!
> > > > > > > > > >
> > > > > > > > > > [1]
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > >
> >
> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/FLINK/FLIP-459*3A*Support*Flink*hybrid*shuffle*integration*with*Apache*Celeborn__;JSsrKysrKysr!!IKRxdwAv5BmarQ!ZRTc1aUSYMDBazuIwlet1Dzk2_DD9qKTgoDLH9jSwAVLgwplcuId_8JoXkH0i7AeWxKWXkL0sxM3AeW-H9MaOGE7hQ$
> > > > > > > > > > [2]
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > >
> >
> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/CELEBORN/CIP-6*Support*Flink*hybrid*shuffle*integration*with*Apache*Celeborn__;KysrKysrKys!!IKRxdwAv5BmarQ!ZRTc1aUSYMDBazuIwlet1Dzk2_DD9qKTgoDLH9jSwAVLgwplcuId_8JoXkH0i7AeWxKWXkL0sxM3AeW-H9OJ6v9uGw$
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Yuxin
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > >
> >
>
