Thanks for your clarification. I have nothing else to add to the
discussion. +1 from my side to proceed

On Wed, Mar 8, 2023 at 4:16 AM Weihua Hu <huweihua....@gmail.com> wrote:

> Thanks Yangze for your attention, this would be a great help.
>
> And thanks Matthias too.
>
> FLIP-156 [1] mentions some incompatibility between fine-grained resource
> > management and reactive mode. I assume that this is independent of the
> > SlotManager and replacing the DSM with the FGSM wouldn't affect reactive
> > mode?
>
> Yes. This incompatibility is independent of SlotManager. That means the
> AdpativeScheduler will always ignore the resource requirement set by
> slotSharingGroup and declare Unknown ResourceProfile to SlotManager.
> So, using FGSM as default will not affect reactive mode.
>
> About the heterogeneous TaskManager: This is a feature that's also not
> > supported in the DSM right now, is it? We should state that fact in the
> > FLIP if we mentioned that we don't want to implement it for the FSGM.
>
> Yes, both DSM and FGSM do not support request heterogeneous
> TaskManager right now. Heterogeneous will make the resource allocation
> logic more complicated, such as the resource deadlock if request A
> allocated
> the bigger slot B and then request B could not allocate the small slot A.
> We
> need to think more before starting to support the heterogeneous task
> manager.
> So, we don't want to implement heterogeneity in this FLIP.
>
> Best,
> Weihua
>
>
> On Wed, Mar 8, 2023 at 12:44 AM Matthias Pohl
> <matthias.p...@aiven.io.invalid> wrote:
>
> > Thanks for updating the FLIP and adding more context to it. Additionally,
> > thanks to Xintong and Yangze for offering your expertise here as
> > contributors to the initial FineGrainedSlotManager implementation.
> >
> > The remark on cutting out functionality was only based on some
> superficial
> > initial code reading. I cannot come up with a better code structure
> myself.
> > Therefore, I'm fine with not refactoring the code as part of this FLIP.
> >
> > The strategies that were proposed around making sure that the refactoring
> > is properly backed by tests sound reasonable. My initial concern was
> based
> > on the fact that we might have unit test scenarios for the DSM that are
> not
> > covered in the unit tests of the FSGM. In that case, swapping the DSM
> with
> > the FSGM might not be good enough. Going over the GSM tests to make sure
> > that we're not accidentally deleting test scenarios sounds good to me.
> > Thanks, Weihua.
> >
> > FLIP-156 [1] mentions some incompatibility between fine-grained resource
> > management and reactive mode. I assume that this is independent of the
> > SlotManager and replacing the DSM with the FGSM wouldn't affect reactive
> > mode?
> >
> > About the heterogeneous TaskManager: This is a feature that's also not
> > supported in the DSM right now, is it? We should state that fact in the
> > FLIP if we mentioned that we don't want to implement it for the FSGM.
> >
> > Best,
> > Matthias
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-156%3A+Runtime+Interfaces+for+Fine-Grained+Resource+Requirements
> >
> > On Tue, Mar 7, 2023 at 8:58 AM Yangze Guo <karma...@gmail.com> wrote:
> >
> > > Hi Weihua,
> > >
> > > Thanks for driving this. As Xintong mentioned, this was a technical
> > > debt from FLIP-56.
> > >
> > > The latest version of FLIP sounds good, +1 from my side. As a
> > > contributor to this component, I'm willing to assist with the review
> > > process. Feel free to reach me if you need help.
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Tue, Mar 7, 2023 at 1:47 PM Weihua Hu <huweihua....@gmail.com>
> wrote:
> > > >
> > > > Hi,
> > > >
> > > > @David @Matthias
> > > > There are a few days after hearing your thoughts. I would like to
> know
> > if
> > > > there are any other concerns about this FLIP.
> > > >
> > > >
> > > > Best,
> > > > Weihua
> > > >
> > > >
> > > > On Mon, Mar 6, 2023 at 7:53 PM Weihua Hu <huweihua....@gmail.com>
> > wrote:
> > > >
> > > > >
> > > > > Thanks Shammon,
> > > > >
> > > > > I've updated FLIP to add this redundant Task Manager limitation.
> > > > >
> > > > >
> > > > > Best,
> > > > > Weihua
> > > > >
> > > > >
> > > > > On Mon, Mar 6, 2023 at 5:07 PM Shammon FY <zjur...@gmail.com>
> wrote:
> > > > >
> > > > >> Hi weihua
> > > > >>
> > > > >> Can you add content related to `heterogeneous resources` to this
> > > FLIP? We
> > > > >> can record it and consider it in the future. It may be useful for
> > some
> > > > >> scenarios, such as the combination of streaming and ML.
> > > > >>
> > > > >> Best,
> > > > >> Shammon
> > > > >>
> > > > >>
> > > > >> On Mon, Mar 6, 2023 at 1:39 PM weijie guo <
> > guoweijieres...@gmail.com>
> > > > >> wrote:
> > > > >>
> > > > >> > Hi Weihua,
> > > > >> >
> > > > >> > Thanks for your clarification, SGTM.
> > > > >> >
> > > > >> > Best regards,
> > > > >> >
> > > > >> > Weijie
> > > > >> >
> > > > >> >
> > > > >> > Weihua Hu <huweihua....@gmail.com> 于2023年3月6日周一 11:43写道:
> > > > >> >
> > > > >> > > Thanks Weijie.
> > > > >> > >
> > > > >> > > Heterogeneous task managers will not be considered in this
> FLIP
> > > since
> > > > >> > > it does not request heterogeneous resources as you said.
> > > > >> > >
> > > > >> > > My first thought is we can adjust the meaning of redundant
> > > > >> configuration
> > > > >> > > to redundant number of per resource type. These can be
> > considered
> > > in
> > > > >> > > detail when we decide to support heterogeneous task managers.
> > > > >> > >
> > > > >> > > Best,
> > > > >> > > Weihua
> > > > >> > >
> > > > >> > >
> > > > >> > > On Sat, Mar 4, 2023 at 1:13 AM weijie guo <
> > > guoweijieres...@gmail.com>
> > > > >> > > wrote:
> > > > >> > >
> > > > >> > > > Thanks Weihua for preparing this FLIP.
> > > > >> > > >
> > > > >> > > > This FLIP overall looks reasonable to me after updating as
> > > > >> suggested by
> > > > >> > > > Matthias.
> > > > >> > > >
> > > > >> > > > I only have one small question about keeping some redundant
> > task
> > > > >> > > managers:
> > > > >> > > > In the fine-grained resource management, theoretically, it
> can
> > > > >> support
> > > > >> > > > heterogeneous taskmanagers. When we complete the missing
> > > features
> > > > >> for
> > > > >> > > FGSM,
> > > > >> > > > do we plan to take this into account?
> > > > >> > > > Of course, if I remember correctly, FGSM will not request
> > > > >> heterogeneous
> > > > >> > > > resources at present, so it is also acceptable to me if
> there
> > > is no
> > > > >> > > special
> > > > >> > > > treatment now.
> > > > >> > > >
> > > > >> > > > +1 for this changes if we can ensure the test coverage.
> > > > >> > > >
> > > > >> > > > Best regards,
> > > > >> > > >
> > > > >> > > > Weijie
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > John Roesler <vvcep...@apache.org> 于2023年3月2日周四 12:53写道:
> > > > >> > > >
> > > > >> > > > > Thanks for the test plan, Weihua!
> > > > >> > > > >
> > > > >> > > > > Yes, it addresses my concerns.
> > > > >> > > > >
> > > > >> > > > > Thanks,
> > > > >> > > > > John
> > > > >> > > > >
> > > > >> > > > > On Wed, Mar 1, 2023, at 22:38, Weihua Hu wrote:
> > > > >> > > > > > Hi, everyone,
> > > > >> > > > > > Thanks for your suggestions and ideas.
> > > > >> > > > > > Thanks Xintong for sharing the detailed backgrounds of
> > > > >> SlotManager.
> > > > >> > > > > >
> > > > >> > > > > > *@Matthias
> > > > >> > > > > >
> > > > >> > > > > > 1. Did you do a proper test coverage analysis?
> > > > >> > > > > >
> > > > >> > > > > >
> > > > >> > > > > > Just as Xintong said, we already have a CI stage for
> fine
> > > > >> grained
> > > > >> > > > > resource
> > > > >> > > > > > managers.
> > > > >> > > > > > And I will make sure FineGrainedSlotManager as the
> default
> > > > >> > > SlotManager
> > > > >> > > > > can
> > > > >> > > > > > pass all the tests of CI.
> > > > >> > > > > > In addition, I will review all unit tests of
> > > > >> > > > DeclarativeSlotManager(DSM)
> > > > >> > > > > to
> > > > >> > > > > > ensure that there are no gaps in the
> > > > >> > > > > > coverage provided by the FineGrainedSlotManager.
> > > > >> > > > > > I also added the 'Test Plan' part to the FLIP.
> > > > >> > > > > > @Matthias @John @Shammon Does this test plan address
> your
> > > > >> concerns?
> > > > >> > > > > >
> > > > >> > > > > > 2.  DeclarativeSlotManager and FineGrainedSlotManager
> feel
> > > quite
> > > > >> > big
> > > > >> > > in
> > > > >> > > > > >
> > > > >> > > > > > terms of lines of code....
> > > > >> > > > > >
> > > > >> > > > > >
> > > > >> > > > > > IMO, the refactoring of SlotManager does not belong to
> > this
> > > FLIP
> > > > >> > > since
> > > > >> > > > it
> > > > >> > > > > > may lead to some unstable risks. For
> > > > >> > > > > > FineGrainedSlotManager(FGSM), we already split some
> > > reasonable
> > > > >> > > > > components.
> > > > >> > > > > > They are:
> > > > >> > > > > > * TaskManagerTracker: Track task managers and their
> > > resources.
> > > > >> > > > > > * ResourceTracker: track requirements of jobs
> > > > >> > > > > > * ResourceAllocationStrategy: Try to fulfill the
> resource
> > > > >> > > requirements
> > > > >> > > > > with
> > > > >> > > > > > available/pending resources.
> > > > >> > > > > > * SlotStatusSyncer: communicate with TaskManager, for
> > > > >> > > > allocating/freeing
> > > > >> > > > > > slot and reconciling the slot status
> > > > >> > > > > > Maybe we can start a discussion about refactoring
> > > SlotManager in
> > > > >> > > > another
> > > > >> > > > > > FLIP if there are some good suggestions.
> > > > >> > > > > > WDYT
> > > > >> > > > > >
> > > > >> > > > > > 3. For me personally, having a more detailed summary
> > > comparing
> > > > >> the
> > > > >> > > > > >> subcomponents of both SlotManager implementations with
> > > where
> > > > >> > > > > >> their functionality matches and where they differ might
> > > help
> > > > >> > > > understand
> > > > >> > > > > the
> > > > >> > > > > >> consequences of the changes proposed in FLIP-298
> > > > >> > > > > >
> > > > >> > > > > > Good suggestion, I have updated the comparison in this
> > FLIP.
> > > > >> > Looking
> > > > >> > > > > > forward to any suggestions/thoughts
> > > > >> > > > > > if they are not described clearly.
> > > > >> > > > > >
> > > > >> > > > > > *@John
> > > > >> > > > > >
> > > > >> > > > > > 4. In addition to changing the default, would it make
> > sense
> > > to
> > > > >> log
> > > > >> > a
> > > > >> > > > > >> deprecation warning on initialization
> > > > >> > > > > >
> > > > >> > > > > > if the DeclarativeSlotManager is used?
> > > > >> > > > > >>
> > > > >> > > > > > SGTM, We should add Deprecated annotations to DSM for
> > devs.
> > > And
> > > > >> > log a
> > > > >> > > > > > deprecation warning for users.
> > > > >> > > > > >
> > > > >> > > > > > *@Shammon
> > > > >> > > > > >
> > > > >> > > > > > 1. For their functional differences, can you give some
> > > detailed
> > > > >> > tests
> > > > >> > > > to
> > > > >> > > > > >> verify that the new FineGrainedSlotManager has these
> > > > >> capabilities?
> > > > >> > > > This
> > > > >> > > > > can
> > > > >> > > > > >> effectively verify the new functions
> > > > >> > > > > >>
> > > > >> > > > > > As just maintained, there is already a CI stage of FGSM,
> > > and I
> > > > >> will
> > > > >> > > do
> > > > >> > > > > more
> > > > >> > > > > > review of unit tests for DSM.
> > > > >> > > > > >
> > > > >> > > > > >  2. I'm worried that many functions are not independent
> > and
> > > it
> > > > >> is
> > > > >> > > > > difficult
> > > > >> > > > > >> to migrate step-by-step. You can list the relationship
> > > between
> > > > >> > them
> > > > >> > > in
> > > > >> > > > > >> detail.
> > > > >> > > > > >
> > > > >> > > > > >  As Xintong saied the DSM is a subset of FGSM by design.
> > > But as
> > > > >> > time
> > > > >> > > > goes
> > > > >> > > > > > on, FGSM has some lacking
> > > > >> > > > > > functions as I listed in this FLIP. And I have added the
> > > > >> comparison
> > > > >> > > > > between
> > > > >> > > > > > DSM and FGSM in this FLIP.
> > > > >> > > > > >
> > > > >> > > > > >
> > > > >> > > > > > Thanks again for all your thoughts. Any feedback is
> > > appreciated!
> > > > >> > > > > >
> > > > >> > > > > > Best,
> > > > >> > > > > > Weihua
> > > > >> > > > > >
> > > > >> > > > > >
> > > > >> > > > > > On Wed, Mar 1, 2023 at 2:17 PM Xintong Song <
> > > > >> tonysong...@gmail.com
> > > > >> > >
> > > > >> > > > > wrote:
> > > > >> > > > > >
> > > > >> > > > > >> Thanks Weihua for preparing this FLIP. +1 for the
> > proposal.
> > > > >> > > > > >>
> > > > >> > > > > >>
> > > > >> > > > > >> As one of the contributors of the fine-grained slot
> > > manager,
> > > > >> I'd
> > > > >> > > like
> > > > >> > > > to
> > > > >> > > > > >> share some backgrounds here.
> > > > >> > > > > >>
> > > > >> > > > > >> - There used to be a defaut slot manager
> implementation,
> > > which
> > > > >> is
> > > > >> > > > > >> non-declarative and has been removed now. The two
> > features,
> > > > >> > > > declarative
> > > > >> > > > > /
> > > > >> > > > > >> reactive resource management and fine-grained resource
> > > > >> management,
> > > > >> > > > were
> > > > >> > > > > >> proposed at about the same time. We were aware that by
> > > design
> > > > >> the
> > > > >> > > > > >> declarative slot manager is a subset of fine-grained
> slot
> > > > >> manager
> > > > >> > at
> > > > >> > > > > that
> > > > >> > > > > >> time, but still decided to implement two slot managers
> > for
> > > the
> > > > >> > > purpose
> > > > >> > > > > of
> > > > >> > > > > >> decoupling efforts and reducing cross-team
> > synchronization
> > > > >> > overhead.
> > > > >> > > > > >> Merging the two slot managers once they are proved
> stable
> > > is
> > > > >> IMO a
> > > > >> > > > > >> technical debt that was planned at the very beginning.
> > > > >> > > > > >>
> > > > >> > > > > >> - The FineGrainedSlotManager has been verified in
> > Alibaba's
> > > > >> > internal
> > > > >> > > > > >> production as well as Alibaba Cloud services as the
> > default
> > > > >> slot
> > > > >> > > > manager
> > > > >> > > > > >> for about 2 years.
> > > > >> > > > > >>
> > > > >> > > > > >>
> > > > >> > > > > >> Concerning test cases, we currently have a ci stage for
> > > fine
> > > > >> > grained
> > > > >> > > > > >> resource management. To avoid adding too much burden,
> the
> > > stage
> > > > >> > only
> > > > >> > > > > >> includes tests from flink-runtime and flink-test
> > modules. I
> > > > >> think
> > > > >> > > > > switching
> > > > >> > > > > >> the default slot manager and applying the whole set of
> > > tests on
> > > > >> > the
> > > > >> > > > > >> fine-grained slot manager would help us to be more
> > > confident
> > > > >> about
> > > > >> > > it.
> > > > >> > > > > >>
> > > > >> > > > > >>
> > > > >> > > > > >> Concerning cutting out functionalities out of slot
> > > manager, I
> > > > >> > think
> > > > >> > > > > Yangze
> > > > >> > > > > >> and I have tried our best to shape the
> > > FineGrainedSlotManager
> > > > >> into
> > > > >> > > > > >> reasonable components. I personally don't have other
> > ideas
> > > to
> > > > >> > > further
> > > > >> > > > > >> disassemble the component, but I'm open to such
> > > suggestions.
> > > > >> > > However,
> > > > >> > > > > from
> > > > >> > > > > >> the stability perspective, I'd be in favor of not
> > > introducing
> > > > >> > > > > significant
> > > > >> > > > > >> changes to the FineGrainedSlotManager while switching
> it
> > > to the
> > > > >> > > > default.
> > > > >> > > > > >> Because the current implementation has already been
> > > verified
> > > > >> (or
> > > > >> > at
> > > > >> > > > > least
> > > > >> > > > > >> partially verified because Alibaba does not cover all
> the
> > > Flink
> > > > >> > use
> > > > >> > > > > cases),
> > > > >> > > > > >> and introducing more changes also means more chances of
> > > > >> breaking
> > > > >> > > > things.
> > > > >> > > > > >>
> > > > >> > > > > >>
> > > > >> > > > > >> Best,
> > > > >> > > > > >>
> > > > >> > > > > >> Xintong
> > > > >> > > > > >>
> > > > >> > > > > >>
> > > > >> > > > > >>
> > > > >> > > > > >> On Wed, Mar 1, 2023 at 11:12 AM Shammon FY <
> > > zjur...@gmail.com>
> > > > >> > > wrote:
> > > > >> > > > > >>
> > > > >> > > > > >> > Hi
> > > > >> > > > > >> >
> > > > >> > > > > >> > Thanks for starting this work weihua, I think
> unifying
> > > > >> > > > > >> > DeclarativeSlotManager and FineGrainedSlotManager is
> > > > >> valuable.
> > > > >> > > > > >> >
> > > > >> > > > > >> > I agree with @Matthias and @John that we need a way
> to
> > > ensure
> > > > >> > that
> > > > >> > > > > >> > DeclarativeSlotManager's capabilities are fully
> covered
> > > by
> > > > >> > > > > >> > FineGrainedSlotManager
> > > > >> > > > > >> >
> > > > >> > > > > >> > 1. For their functional differences, can you give
> some
> > > > >> detailed
> > > > >> > > > tests
> > > > >> > > > > to
> > > > >> > > > > >> > verify that the new FineGrainedSlotManager has these
> > > > >> > capabilities?
> > > > >> > > > > This
> > > > >> > > > > >> can
> > > > >> > > > > >> > effectively verify the new functions
> > > > >> > > > > >> >
> > > > >> > > > > >> > 2. I'm worried that many functions are not
> independent
> > > and
> > > > >> it is
> > > > >> > > > > >> difficult
> > > > >> > > > > >> > to migrate step-by-step. You can list the
> relationship
> > > > >> between
> > > > >> > > them
> > > > >> > > > in
> > > > >> > > > > >> > detail.
> > > > >> > > > > >> >
> > > > >> > > > > >> > 3. As John mentioned, give a smoke test for
> > > > >> > FineGrainedSlotManager
> > > > >> > > > is
> > > > >> > > > > a
> > > > >> > > > > >> > good idea. Or you can add some test information to
> the
> > > > >> > > > > >> > DeclarativeSlotManager to determine how many tests
> have
> > > used
> > > > >> it.
> > > > >> > > In
> > > > >> > > > > this
> > > > >> > > > > >> > way, we can gradually construct test cases for
> > > > >> > > > FineGrainedSlotManager
> > > > >> > > > > >> > during the development process.
> > > > >> > > > > >> >
> > > > >> > > > > >> >
> > > > >> > > > > >> > Best,
> > > > >> > > > > >> > Shammon
> > > > >> > > > > >> >
> > > > >> > > > > >> >
> > > > >> > > > > >> > On Tue, Feb 28, 2023 at 10:22 PM John Roesler <
> > > > >> > > vvcep...@apache.org>
> > > > >> > > > > >> wrote:
> > > > >> > > > > >> >
> > > > >> > > > > >> > > Thanks for the FLIP, Weihua!
> > > > >> > > > > >> > >
> > > > >> > > > > >> > > I’ve read the FLIP, and it sounds good to me. We
> need
> > > to
> > > > >> avoid
> > > > >> > > > > >> > > proliferating alternative implementations wherever
> > > > >> possible. I
> > > > >> > > > have
> > > > >> > > > > >> just
> > > > >> > > > > >> > a
> > > > >> > > > > >> > > couple of comments:
> > > > >> > > > > >> > >
> > > > >> > > > > >> > > 1. I share Matthias’s concern about ensuring the
> > > behavior
> > > > >> is
> > > > >> > > > really
> > > > >> > > > > the
> > > > >> > > > > >> > > same. One suggestion I’ve used for this kind of
> thing
> > > is,
> > > > >> as a
> > > > >> > > > smoke
> > > > >> > > > > >> > test,
> > > > >> > > > > >> > > to update the DeclarativeSlotManager to just
> delegate
> > > to
> > > > >> the
> > > > >> > > > > >> > > FineGrainedSlotManager. If the full test suite
> still
> > > > >> passes,
> > > > >> > you
> > > > >> > > > > can be
> > > > >> > > > > >> > > pretty sure the new default is really ok. It would
> > not
> > > be a
> > > > >> > good
> > > > >> > > > > idea
> > > > >> > > > > >> to
> > > > >> > > > > >> > > actually keep that in for the release, since it
> would
> > > > >> remove
> > > > >> > the
> > > > >> > > > > option
> > > > >> > > > > >> > to
> > > > >> > > > > >> > > fall back in case of bugs. Either way, we need to
> > make
> > > sure
> > > > >> > all
> > > > >> > > > test
> > > > >> > > > > >> > > scenarios are present for the FGSM.
> > > > >> > > > > >> > >
> > > > >> > > > > >> > > 4. In addition to changing the default, would it
> make
> > > > >> sense to
> > > > >> > > > log a
> > > > >> > > > > >> > > deprecation warning on initialization if the
> > > > >> > > > DeclarativeSlotManager
> > > > >> > > > > is
> > > > >> > > > > >> > used?
> > > > >> > > > > >> > >
> > > > >> > > > > >> > > Thanks again,
> > > > >> > > > > >> > > John
> > > > >> > > > > >> > >
> > > > >> > > > > >> > > On Tue, Feb 28, 2023, at 07:20, Matthias Pohl
> wrote:
> > > > >> > > > > >> > > > Hi Weihua,
> > > > >> > > > > >> > > > Thanks for your proposal. From a conceptual
> point:
> > > AFAIU,
> > > > >> > the
> > > > >> > > > > >> > > > DeclarativeSlotManager covers a subset (i.e. only
> > > evenly
> > > > >> > sized
> > > > >> > > > > slots)
> > > > >> > > > > >> > of
> > > > >> > > > > >> > > > what the FineGrainedSlotManager should be able to
> > > achieve
> > > > >> > > > > (variable
> > > > >> > > > > >> > slot
> > > > >> > > > > >> > > > size per task manager). Is this the right
> > > > >> > > > > assumption/understanding?
> > > > >> > > > > >> In
> > > > >> > > > > >> > > this
> > > > >> > > > > >> > > > sense, merging both implementations into a single
> > one
> > > > >> sounds
> > > > >> > > > > good. A
> > > > >> > > > > >> > few
> > > > >> > > > > >> > > > more general comments, though:
> > > > >> > > > > >> > > >
> > > > >> > > > > >> > > > 1. Did you do a proper test coverage analysis?
> > > That's not
> > > > >> > > > > mentioned
> > > > >> > > > > >> in
> > > > >> > > > > >> > > the
> > > > >> > > > > >> > > > current version of the FLIP. I'm bringing this up
> > > > >> because we
> > > > >> > > ran
> > > > >> > > > > into
> > > > >> > > > > >> > the
> > > > >> > > > > >> > > > same issue when fixing the flaws that popped up
> > after
> > > > >> > > > introducing
> > > > >> > > > > the
> > > > >> > > > > >> > > > multi-component leader election (see FLIP-285
> [1]).
> > > There
> > > > >> > is a
> > > > >> > > > > risk
> > > > >> > > > > >> > that
> > > > >> > > > > >> > > by
> > > > >> > > > > >> > > > removing the legacy code we decrease test
> coverage
> > > > >> because
> > > > >> > > > certain
> > > > >> > > > > >> > > > test cases that were covered for the legacy
> classes
> > > might
> > > > >> > not
> > > > >> > > be
> > > > >> > > > > >> > > > necessarily covered in the new implementation,
> yet
> > > (see
> > > > >> > > > > FLINK-30338
> > > > >> > > > > >> [2]
> > > > >> > > > > >> > > > which covers this issue for the leader election
> > > case).
> > > > >> > > Ideally,
> > > > >> > > > we
> > > > >> > > > > >> > don't
> > > > >> > > > > >> > > > want to remove test cases accidentally because
> they
> > > were
> > > > >> > only
> > > > >> > > > > >> > implemented
> > > > >> > > > > >> > > > for the DeclarativeSlotManager but missed for the
> > > > >> > > > > >> > FineGrainedSlotManager.
> > > > >> > > > > >> > > >
> > > > >> > > > > >> > > > 2. DeclarativeSlotManager and
> > FineGrainedSlotManager
> > > feel
> > > > >> > > quite
> > > > >> > > > > big
> > > > >> > > > > >> in
> > > > >> > > > > >> > > > terms of lines of code. Without knowing whether
> > it's
> > > > >> > actually
> > > > >> > > a
> > > > >> > > > > >> > > reasonable
> > > > >> > > > > >> > > > thing to do: Instead of just adding more features
> > to
> > > the
> > > > >> > > > > >> > > > FineGrainedSlotManager, have you thought of
> cutting
> > > out
> > > > >> > > > > functionality
> > > > >> > > > > >> > > into
> > > > >> > > > > >> > > > smaller sub-components along this refactoring?
> > Such a
> > > > >> > > > step-by-step
> > > > >> > > > > >> > > approach
> > > > >> > > > > >> > > > might improve the overall codebase and might make
> > > > >> reviewing
> > > > >> > > the
> > > > >> > > > > >> > > refactoring
> > > > >> > > > > >> > > > easier. I did a first pass over the code and
> > > struggled to
> > > > >> > > > identify
> > > > >> > > > > >> code
> > > > >> > > > > >> > > > blocks that could be moved out of the SlotManager
> > > > >> > > > > implementation(s).
> > > > >> > > > > >> > > > Therefore, I might be wrong with this proposal. I
> > > haven't
> > > > >> > > worked
> > > > >> > > > > on
> > > > >> > > > > >> > this
> > > > >> > > > > >> > > > codebase in that detail that it would allow me to
> > > come up
> > > > >> > > with a
> > > > >> > > > > >> > > judgement
> > > > >> > > > > >> > > > call. I wanted to bring it up, anyway, because
> I'm
> > > > >> curious
> > > > >> > > > whether
> > > > >> > > > > >> that
> > > > >> > > > > >> > > > could be an option. There's a comment created by
> > > Chesnay
> > > > >> > > (CC'd)
> > > > >> > > > in
> > > > >> > > > > >> the
> > > > >> > > > > >> > > > JavaDoc of TaskExecutorManager [3] indicating
> > > something
> > > > >> > > similar.
> > > > >> > > > > I'm
> > > > >> > > > > >> > > > wondering whether he can add some insights here.
> > > > >> > > > > >> > > >
> > > > >> > > > > >> > > > 3. For me personally, having a more detailed
> > summary
> > > > >> > comparing
> > > > >> > > > the
> > > > >> > > > > >> > > > subcomponents of both SlotManager implementations
> > > with
> > > > >> where
> > > > >> > > > > >> > > > their functionality matches and where they differ
> > > might
> > > > >> help
> > > > >> > > > > >> understand
> > > > >> > > > > >> > > the
> > > > >> > > > > >> > > > consequences of the changes proposed in FLIP-298.
> > > > >> > > > > >> > > >
> > > > >> > > > > >> > > > Best,
> > > > >> > > > > >> > > > Matthias
> > > > >> > > > > >> > > >
> > > > >> > > > > >> > > > [1]
> > > > >> > > > > >> > > >
> > > > >> > > > > >> > >
> > > > >> > > > > >> >
> > > > >> > > > > >>
> > > > >> > > > >
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-285%3A+Refactoring+LeaderElection+to+make+Flink+support+multi-component+leader+election+out-of-the-box
> > > > >> > > > > >> > > > [2]
> > > https://issues.apache.org/jira/browse/FLINK-30338
> > > > >> > > > > >> > > > [3]
> > > > >> > > > > >> > > >
> > > > >> > > > > >> > >
> > > > >> > > > > >> >
> > > > >> > > > > >>
> > > > >> > > > >
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >>
> > >
> >
> https://github.com/apache/flink/blob/f611ea8cb5deddb42429df2c99f0c68d7382e9bd/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/TaskExecutorManager.java#L66-L68
> > > > >> > > > > >> > > >
> > > > >> > > > > >> > > > On Tue, Feb 28, 2023 at 6:14 AM Matt Wang <
> > > > >> wang...@163.com>
> > > > >> > > > > wrote:
> > > > >> > > > > >> > > >
> > > > >> > > > > >> > > >> This is a good proposal for me, it will make the
> > > code of
> > > > >> > the
> > > > >> > > > > >> > SlotManager
> > > > >> > > > > >> > > >> more clear.
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > > >> --
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > > >> Best,
> > > > >> > > > > >> > > >> Matt Wang
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > > >> ---- Replied Message ----
> > > > >> > > > > >> > > >> | From | David Morávek<d...@apache.org> |
> > > > >> > > > > >> > > >> | Date | 02/27/2023 22:45 |
> > > > >> > > > > >> > > >> | To | <dev@flink.apache.org> |
> > > > >> > > > > >> > > >> | Subject | Re: [DISCUSS] FLIP-298: Unifying the
> > > > >> > > Implementation
> > > > >> > > > > of
> > > > >> > > > > >> > > >> SlotManager |
> > > > >> > > > > >> > > >> Hi Weihua, I still need to dig into the details,
> > > but the
> > > > >> > > > overall
> > > > >> > > > > >> > > sentiment
> > > > >> > > > > >> > > >> of this change sounds reasonable.
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > > >> Best,
> > > > >> > > > > >> > > >> D.
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > > >> On Mon, Feb 27, 2023 at 2:26 PM Zhanghao Chen <
> > > > >> > > > > >> > > zhanghao.c...@outlook.com>
> > > > >> > > > > >> > > >> wrote:
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > > >> Thanks for driving this topic. I think this FLIP
> > > could
> > > > >> help
> > > > >> > > > > clean up
> > > > >> > > > > >> > the
> > > > >> > > > > >> > > >> codebase to make it easier to maintain. +1 on
> it.
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > > >> Best,
> > > > >> > > > > >> > > >> Zhanghao Chen
> > > > >> > > > > >> > > >> ________________________________
> > > > >> > > > > >> > > >> From: Weihua Hu <huweihua....@gmail.com>
> > > > >> > > > > >> > > >> Sent: Monday, February 27, 2023 20:40
> > > > >> > > > > >> > > >> To: dev <dev@flink.apache.org>
> > > > >> > > > > >> > > >> Subject: [DISCUSS] FLIP-298: Unifying the
> > > > >> Implementation of
> > > > >> > > > > >> > SlotManager
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > > >> Hi everyone,
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > > >> I would like to begin a discussion on FLIP-298:
> > > Unifying
> > > > >> > the
> > > > >> > > > > >> > > Implementation
> > > > >> > > > > >> > > >> of SlotManager[1]. There are currently two types
> > of
> > > > >> > > SlotManager
> > > > >> > > > > in
> > > > >> > > > > >> > > Flink:
> > > > >> > > > > >> > > >> DeclarativeSlotManager and
> FineGrainedSlotManager.
> > > > >> > > > > >> > > FineGrainedSlotManager
> > > > >> > > > > >> > > >> should behave as DeclarativeSlotManager if the
> > user
> > > does
> > > > >> > not
> > > > >> > > > > >> configure
> > > > >> > > > > >> > > the
> > > > >> > > > > >> > > >> slot request profile.
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > > >> Therefore, this FLIP aims to unify the
> > > implementation of
> > > > >> > > > > SlotManager
> > > > >> > > > > >> > in
> > > > >> > > > > >> > > >> order to reduce maintenance costs.
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > > >> Looking forward to hearing from you.
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > > >> [1]
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > >
> > > > >> > > > > >> >
> > > > >> > > > > >>
> > > > >> > > > >
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-298%3A+Unifying+the+Implementation+of+SlotManager
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > > >> Best,
> > > > >> > > > > >> > > >> Weihua
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > > >>
> > > > >> > > > > >> > >
> > > > >> > > > > >> >
> > > > >> > > > > >>
> > > > >> > > > >
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >>
> > > > >
> > >
> >
>

Reply via email to