Hi Yuxin,

+1 for this proposal.
This change will greatly alleviate the pressure on local storage resources
(especially when local storage is limited),
particularly in cloud-native environments.

Regards,
Jeyhun

On Thu, Jun 6, 2024 at 1:20 PM Yuxin Tan <tanyuxinw...@gmail.com> wrote:

> Hi all,
>
> Thanks for all the feedback and suggestions so far.
>
> If there are no further comments, we will open the voting thread tomorrow.
>
> Best,
> Yuxin
>
>
> On Thu, Jun 6, 2024 at 3:40 PM Yuxin Tan <tanyuxinw...@gmail.com> wrote:
>
> > Thanks Zhu for the suggestion.
> > I have updated the description of the option.
> >
> > Best,
> > Yuxin
> >
> >
> > On Thu, Jun 6, 2024 at 2:59 PM Zhu Zhu <reed...@gmail.com> wrote:
> >
> >> +1
> >>
> >> Maybe explain in the description of
> >> `taskmanager.network.hybrid-shuffle.external-remote-tier-factory.class`
> >> that it only accepts Celeborn as the remote shuffle tier at this moment?
> >>
> >> Thanks,
> >> Zhu
> >>
> >> On Thu, Jun 6, 2024 at 1:49 PM Junrui Lee <jrlee....@gmail.com> wrote:
> >>
> >> > Thanks Yuxin for your answer. +1 for this proposal.
> >> >
> >> > Best,
> >> > Junrui.
> >> >
> >> > On Thu, Jun 6, 2024 at 1:42 PM Yuxin Tan <tanyuxinw...@gmail.com> wrote:
> >> >
> >> > > Thanks Junrui for your question.
> >> > >
> >> > > > I wonder if the current interface design supports the
> >> > > > future adaptation for batch job recovery
> >> > >
> >> > > I noticed that FLIP-383 supports batch job recovery by introducing
> >> > > some new APIs. These APIs can also be added to the Tier-related
> >> > > interfaces to facilitate the feature. Since these modifications are not
> >> > > directly related to the current integration tasks, and the integration
> >> > > does not conflict with batch job recovery, I propose that this FLIP
> >> > > not include these particular changes. Moreover, considering that
> >> > > the Tier interfaces are not public currently, it is also feasible to add
> >> > > the interfaces directly if necessary.
> >> > > WDYT?
> >> > >
> >> > > Best,
> >> > > Yuxin
> >> > >
> >> > >
> >> > > On Thu, Jun 6, 2024 at 11:02 AM Junrui Lee <jrlee....@gmail.com> wrote:
> >> > >
> >> > > > Thanks Yuxin for driving this proposal!
> >> > > >
> >> > > > I have a question about public interface compatibility in the context
> >> > > > of FLIP-459. We've supported batch job recovery from JobMaster failures
> >> > > > in FLIP-383, which will be released in Flink 1.20. I wonder if the current
> >> > > > interface design supports the future adaptation for batch job recovery?
> >> > > >
> >> > > > Looking forward to your feedback.
> >> > > >
> >> > > > Best,
> >> > > > Junrui.
> >> > > >
> >> > > > On Wed, Jun 5, 2024 at 10:13 AM weijie guo <guoweijieres...@gmail.com> wrote:
> >> > > >
> >> > > > > Thanks Yuxin for the proposal!
> >> > > > >
> >> > > > > When we first proposed Hybrid Shuffle, I wanted to support pluggable
> >> > > > > storage tiers in the future. However, limited by the architecture of the
> >> > > > > legacy Hybrid Shuffle at that time, this idea was not realized. The
> >> > > > > new architecture abstracts the tier nicely, and now it's time to introduce
> >> > > > > support for external storage.
> >> > > > >
> >> > > > > Big +1 for this one!
> >> > > > >
> >> > > > > Best regards,
> >> > > > >
> >> > > > > Weijie
> >> > > > >
> >> > > > >
> >> > > > > On Wed, Jun 5, 2024 at 12:08 AM rexxiong <rexxi...@apache.org> wrote:
> >> > > > >
> >> > > > > > Thanks Yuxin for the proposal. +1. As a member of the Apache Celeborn
> >> > > > > > community, I am very excited about the integration of Flink's Hybrid
> >> > > > > > Shuffle with Apache Celeborn. The whole design of CIP-6 looks good to me. I
> >> > > > > > am looking forward to this integration.
> >> > > > > >
> >> > > > > > Thanks,
> >> > > > > > Jiashu Xiong
> >> > > > > >
> >> > > > > > On Tue, Jun 4, 2024 at 4:47 PM Ethan Feng <ethanf...@apache.org> wrote:
> >> > > > > >
> >> > > > > > > +1 for this proposal.
> >> > > > > > >
> >> > > > > > > After internally reviewing the prototype of CIP-6, I believe this would
> >> > > > > > > improve performance and stability for Flink users using Celeborn.
> >> > > > > > >
> >> > > > > > > I look forward to seeing this feature released to the community.
> >> > > > > > >
> >> > > > > > > As I come from the Celeborn community, I hope more users will try
> >> > > > > > > Celeborn for their Flink batch jobs.
> >> > > > > > >
> >> > > > > > > Thanks,
> >> > > > > > > Ethan Feng
> >> > > > > > >
> >> > > > > > > On Tue, Jun 4, 2024 at 4:34 PM Yuxin Tan <tanyuxinw...@gmail.com> wrote:
> >> > > > > > > >
> >> > > > > > > > Hi, Venkatakrishnan,
> >> > > > > > > >
> >> > > > > > > > Thanks for joining the discussion. We appreciate your interest
> >> > > > > > > > in contributing to the work. Once the FLIP and CIP proposals
> >> > > > > > > > have been approved, we will create some JIRA tickets in the Flink
> >> > > > > > > > and Celeborn projects. Please feel free to take a look at the
> >> > > > > > > > tickets and pick any that interest you.
> >> > > > > > > >
> >> > > > > > > > Best,
> >> > > > > > > > Yuxin
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > > On Fri, May 31, 2024 at 11:11 PM Venkatakrishnan Sowrirajan <vsowr...@asu.edu> wrote:
> >> > > > > > > >
> >> > > > > > > > > Thanks for this FLIP. We are also interested in learning/contributing to
> >> > > > > > > > > the hybrid shuffle integration with Celeborn for batch executions.
> >> > > > > > > > >
> >> > > > > > > > > On Tue, May 28, 2024, 7:07 PM Yuxin Tan <tanyuxinw...@gmail.com> wrote:
> >> > > > > > > > >
> >> > > > > > > > > > Hi, Xintong,
> >> > > > > > > > > >
> >> > > > > > > > > > > I think we can also publish the prototype code so the
> >> > > > > > > > > > > community can better understand and help with it.
> >> > > > > > > > > >
> >> > > > > > > > > > OK, I agree. I will prepare and publish the code soon.
> >> > > > > > > > > >
> >> > > > > > > > > > Rui,
> >> > > > > > > > > >
> >> > > > > > > > > > > Kind reminder: the image of CIP-6[1] cannot be loaded.
> >> > > > > > > > > >
> >> > > > > > > > > > Thanks for the reminder. I've updated the images.
> >> > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > > Best,
> >> > > > > > > > > > Yuxin
> >> > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > > On Wed, May 29, 2024 at 9:33 AM Rui Fan <1996fan...@gmail.com> wrote:
> >> > > > > > > > > >
> >> > > > > > > > > > > Thanks Yuxin for driving this proposal!
> >> > > > > > > > > > >
> >> > > > > > > > > > > Kind reminder: the image of CIP-6[1] cannot be loaded.
> >> > > > > > > > > > >
> >> > > > > > > > > > > [1] https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/CELEBORN/CIP-6*Support*Flink*hybrid*shuffle*integration*with*Apache*Celeborn__;KysrKysrKys!!IKRxdwAv5BmarQ!ZRTc1aUSYMDBazuIwlet1Dzk2_DD9qKTgoDLH9jSwAVLgwplcuId_8JoXkH0i7AeWxKWXkL0sxM3AeW-H9OJ6v9uGw$
> >> > > > > > > > > > >
> >> > > > > > > > > > > Best,
> >> > > > > > > > > > > Rui
> >> > > > > > > > > > >
> >> > > > > > > > > > > On Wed, May 29, 2024 at 9:03 AM Xintong Song <tonysong...@gmail.com> wrote:
> >> > > > > > > > > > >
> >> > > > > > > > > > > > +1 for this proposal.
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > We have been prototyping this feature internally at Alibaba for a couple of
> >> > > > > > > > > > > > months. Yuxin, I think we can also publish the prototype code so the
> >> > > > > > > > > > > > community can better understand and help with it.
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Best,
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Xintong
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > On Tue, May 28, 2024 at 8:34 PM Yuxin Tan <tanyuxinw...@gmail.com> wrote:
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > > Hi all,
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > I would like to start a discussion on FLIP-459: Support Flink hybrid shuffle
> >> > > > > > > > > > > > > integration with Apache Celeborn[1]. Flink hybrid shuffle supports transitions
> >> > > > > > > > > > > > > between memory, disk, and remote storage to improve performance and job
> >> > > > > > > > > > > > > stability. Meanwhile, Apache Celeborn provides a stable, performant,
> >> > > > > > > > > > > > > scalable remote shuffle service. This integration proposal aims to harness
> >> > > > > > > > > > > > > the benefits of both hybrid shuffle and Celeborn simultaneously.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Note that this proposal has two parts.
> >> > > > > > > > > > > > > 1. The Flink-side modifications are in FLIP-459[1].
> >> > > > > > > > > > > > > 2. The Celeborn-side changes are in CIP-6[2].
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Looking forward to everyone's feedback and suggestions. Thank you!
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > [1] https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/FLINK/FLIP-459*3A*Support*Flink*hybrid*shuffle*integration*with*Apache*Celeborn__;JSsrKysrKysr!!IKRxdwAv5BmarQ!ZRTc1aUSYMDBazuIwlet1Dzk2_DD9qKTgoDLH9jSwAVLgwplcuId_8JoXkH0i7AeWxKWXkL0sxM3AeW-H9MaOGE7hQ$
> >> > > > > > > > > > > > > [2] https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/CELEBORN/CIP-6*Support*Flink*hybrid*shuffle*integration*with*Apache*Celeborn__;KysrKysrKys!!IKRxdwAv5BmarQ!ZRTc1aUSYMDBazuIwlet1Dzk2_DD9qKTgoDLH9jSwAVLgwplcuId_8JoXkH0i7AeWxKWXkL0sxM3AeW-H9OJ6v9uGw$
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Best,
> >> > > > > > > > > > > > > Yuxin
> >> > > > > > > > > > > > >
