Thanks for your clarification. I have nothing else to add to the discussion. +1 from my side to proceed
On Wed, Mar 8, 2023 at 4:16 AM Weihua Hu <huweihua....@gmail.com> wrote: > Thanks Yangze for your attention, this would be a great help. > > And thanks Matthias too. > > FLIP-156 [1] mentions some incompatibility between fine-grained resource > > management and reactive mode. I assume that this is independent of the > > SlotManager and replacing the DSM with the FGSM wouldn't affect reactive > > mode? > > Yes. This incompatibility is independent of SlotManager. That means the > AdpativeScheduler will always ignore the resource requirement set by > slotSharingGroup and declare Unknown ResourceProfile to SlotManager. > So, using FGSM as default will not affect reactive mode. > > About the heterogeneous TaskManager: This is a feature that's also not > > supported in the DSM right now, is it? We should state that fact in the > > FLIP if we mentioned that we don't want to implement it for the FSGM. > > Yes, both DSM and FGSM do not support request heterogeneous > TaskManager right now. Heterogeneous will make the resource allocation > logic more complicated, such as the resource deadlock if request A > allocated > the bigger slot B and then request B could not allocate the small slot A. > We > need to think more before starting to support the heterogeneous task > manager. > So, we don't want to implement heterogeneity in this FLIP. > > Best, > Weihua > > > On Wed, Mar 8, 2023 at 12:44 AM Matthias Pohl > <matthias.p...@aiven.io.invalid> wrote: > > > Thanks for updating the FLIP and adding more context to it. Additionally, > > thanks to Xintong and Yangze for offering your expertise here as > > contributors to the initial FineGrainedSlotManager implementation. > > > > The remark on cutting out functionality was only based on some > superficial > > initial code reading. I cannot come up with a better code structure > myself. > > Therefore, I'm fine with not refactoring the code as part of this FLIP. > > > > The strategies that were proposed around making sure that the refactoring > > is properly backed by tests sound reasonable. My initial concern was > based > > on the fact that we might have unit test scenarios for the DSM that are > not > > covered in the unit tests of the FSGM. In that case, swapping the DSM > with > > the FSGM might not be good enough. Going over the GSM tests to make sure > > that we're not accidentally deleting test scenarios sounds good to me. > > Thanks, Weihua. > > > > FLIP-156 [1] mentions some incompatibility between fine-grained resource > > management and reactive mode. I assume that this is independent of the > > SlotManager and replacing the DSM with the FGSM wouldn't affect reactive > > mode? > > > > About the heterogeneous TaskManager: This is a feature that's also not > > supported in the DSM right now, is it? We should state that fact in the > > FLIP if we mentioned that we don't want to implement it for the FSGM. > > > > Best, > > Matthias > > > > [1] > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-156%3A+Runtime+Interfaces+for+Fine-Grained+Resource+Requirements > > > > On Tue, Mar 7, 2023 at 8:58 AM Yangze Guo <karma...@gmail.com> wrote: > > > > > Hi Weihua, > > > > > > Thanks for driving this. As Xintong mentioned, this was a technical > > > debt from FLIP-56. > > > > > > The latest version of FLIP sounds good, +1 from my side. As a > > > contributor to this component, I'm willing to assist with the review > > > process. Feel free to reach me if you need help. > > > > > > Best, > > > Yangze Guo > > > > > > On Tue, Mar 7, 2023 at 1:47 PM Weihua Hu <huweihua....@gmail.com> > wrote: > > > > > > > > Hi, > > > > > > > > @David @Matthias > > > > There are a few days after hearing your thoughts. I would like to > know > > if > > > > there are any other concerns about this FLIP. > > > > > > > > > > > > Best, > > > > Weihua > > > > > > > > > > > > On Mon, Mar 6, 2023 at 7:53 PM Weihua Hu <huweihua....@gmail.com> > > wrote: > > > > > > > > > > > > > > Thanks Shammon, > > > > > > > > > > I've updated FLIP to add this redundant Task Manager limitation. > > > > > > > > > > > > > > > Best, > > > > > Weihua > > > > > > > > > > > > > > > On Mon, Mar 6, 2023 at 5:07 PM Shammon FY <zjur...@gmail.com> > wrote: > > > > > > > > > >> Hi weihua > > > > >> > > > > >> Can you add content related to `heterogeneous resources` to this > > > FLIP? We > > > > >> can record it and consider it in the future. It may be useful for > > some > > > > >> scenarios, such as the combination of streaming and ML. > > > > >> > > > > >> Best, > > > > >> Shammon > > > > >> > > > > >> > > > > >> On Mon, Mar 6, 2023 at 1:39 PM weijie guo < > > guoweijieres...@gmail.com> > > > > >> wrote: > > > > >> > > > > >> > Hi Weihua, > > > > >> > > > > > >> > Thanks for your clarification, SGTM. > > > > >> > > > > > >> > Best regards, > > > > >> > > > > > >> > Weijie > > > > >> > > > > > >> > > > > > >> > Weihua Hu <huweihua....@gmail.com> 于2023年3月6日周一 11:43写道: > > > > >> > > > > > >> > > Thanks Weijie. > > > > >> > > > > > > >> > > Heterogeneous task managers will not be considered in this > FLIP > > > since > > > > >> > > it does not request heterogeneous resources as you said. > > > > >> > > > > > > >> > > My first thought is we can adjust the meaning of redundant > > > > >> configuration > > > > >> > > to redundant number of per resource type. These can be > > considered > > > in > > > > >> > > detail when we decide to support heterogeneous task managers. > > > > >> > > > > > > >> > > Best, > > > > >> > > Weihua > > > > >> > > > > > > >> > > > > > > >> > > On Sat, Mar 4, 2023 at 1:13 AM weijie guo < > > > guoweijieres...@gmail.com> > > > > >> > > wrote: > > > > >> > > > > > > >> > > > Thanks Weihua for preparing this FLIP. > > > > >> > > > > > > > >> > > > This FLIP overall looks reasonable to me after updating as > > > > >> suggested by > > > > >> > > > Matthias. > > > > >> > > > > > > > >> > > > I only have one small question about keeping some redundant > > task > > > > >> > > managers: > > > > >> > > > In the fine-grained resource management, theoretically, it > can > > > > >> support > > > > >> > > > heterogeneous taskmanagers. When we complete the missing > > > features > > > > >> for > > > > >> > > FGSM, > > > > >> > > > do we plan to take this into account? > > > > >> > > > Of course, if I remember correctly, FGSM will not request > > > > >> heterogeneous > > > > >> > > > resources at present, so it is also acceptable to me if > there > > > is no > > > > >> > > special > > > > >> > > > treatment now. > > > > >> > > > > > > > >> > > > +1 for this changes if we can ensure the test coverage. > > > > >> > > > > > > > >> > > > Best regards, > > > > >> > > > > > > > >> > > > Weijie > > > > >> > > > > > > > >> > > > > > > > >> > > > John Roesler <vvcep...@apache.org> 于2023年3月2日周四 12:53写道: > > > > >> > > > > > > > >> > > > > Thanks for the test plan, Weihua! > > > > >> > > > > > > > > >> > > > > Yes, it addresses my concerns. > > > > >> > > > > > > > > >> > > > > Thanks, > > > > >> > > > > John > > > > >> > > > > > > > > >> > > > > On Wed, Mar 1, 2023, at 22:38, Weihua Hu wrote: > > > > >> > > > > > Hi, everyone, > > > > >> > > > > > Thanks for your suggestions and ideas. > > > > >> > > > > > Thanks Xintong for sharing the detailed backgrounds of > > > > >> SlotManager. > > > > >> > > > > > > > > > >> > > > > > *@Matthias > > > > >> > > > > > > > > > >> > > > > > 1. Did you do a proper test coverage analysis? > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > Just as Xintong said, we already have a CI stage for > fine > > > > >> grained > > > > >> > > > > resource > > > > >> > > > > > managers. > > > > >> > > > > > And I will make sure FineGrainedSlotManager as the > default > > > > >> > > SlotManager > > > > >> > > > > can > > > > >> > > > > > pass all the tests of CI. > > > > >> > > > > > In addition, I will review all unit tests of > > > > >> > > > DeclarativeSlotManager(DSM) > > > > >> > > > > to > > > > >> > > > > > ensure that there are no gaps in the > > > > >> > > > > > coverage provided by the FineGrainedSlotManager. > > > > >> > > > > > I also added the 'Test Plan' part to the FLIP. > > > > >> > > > > > @Matthias @John @Shammon Does this test plan address > your > > > > >> concerns? > > > > >> > > > > > > > > > >> > > > > > 2. DeclarativeSlotManager and FineGrainedSlotManager > feel > > > quite > > > > >> > big > > > > >> > > in > > > > >> > > > > > > > > > >> > > > > > terms of lines of code.... > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > IMO, the refactoring of SlotManager does not belong to > > this > > > FLIP > > > > >> > > since > > > > >> > > > it > > > > >> > > > > > may lead to some unstable risks. For > > > > >> > > > > > FineGrainedSlotManager(FGSM), we already split some > > > reasonable > > > > >> > > > > components. > > > > >> > > > > > They are: > > > > >> > > > > > * TaskManagerTracker: Track task managers and their > > > resources. > > > > >> > > > > > * ResourceTracker: track requirements of jobs > > > > >> > > > > > * ResourceAllocationStrategy: Try to fulfill the > resource > > > > >> > > requirements > > > > >> > > > > with > > > > >> > > > > > available/pending resources. > > > > >> > > > > > * SlotStatusSyncer: communicate with TaskManager, for > > > > >> > > > allocating/freeing > > > > >> > > > > > slot and reconciling the slot status > > > > >> > > > > > Maybe we can start a discussion about refactoring > > > SlotManager in > > > > >> > > > another > > > > >> > > > > > FLIP if there are some good suggestions. > > > > >> > > > > > WDYT > > > > >> > > > > > > > > > >> > > > > > 3. For me personally, having a more detailed summary > > > comparing > > > > >> the > > > > >> > > > > >> subcomponents of both SlotManager implementations with > > > where > > > > >> > > > > >> their functionality matches and where they differ might > > > help > > > > >> > > > understand > > > > >> > > > > the > > > > >> > > > > >> consequences of the changes proposed in FLIP-298 > > > > >> > > > > > > > > > >> > > > > > Good suggestion, I have updated the comparison in this > > FLIP. > > > > >> > Looking > > > > >> > > > > > forward to any suggestions/thoughts > > > > >> > > > > > if they are not described clearly. > > > > >> > > > > > > > > > >> > > > > > *@John > > > > >> > > > > > > > > > >> > > > > > 4. In addition to changing the default, would it make > > sense > > > to > > > > >> log > > > > >> > a > > > > >> > > > > >> deprecation warning on initialization > > > > >> > > > > > > > > > >> > > > > > if the DeclarativeSlotManager is used? > > > > >> > > > > >> > > > > >> > > > > > SGTM, We should add Deprecated annotations to DSM for > > devs. > > > And > > > > >> > log a > > > > >> > > > > > deprecation warning for users. > > > > >> > > > > > > > > > >> > > > > > *@Shammon > > > > >> > > > > > > > > > >> > > > > > 1. For their functional differences, can you give some > > > detailed > > > > >> > tests > > > > >> > > > to > > > > >> > > > > >> verify that the new FineGrainedSlotManager has these > > > > >> capabilities? > > > > >> > > > This > > > > >> > > > > can > > > > >> > > > > >> effectively verify the new functions > > > > >> > > > > >> > > > > >> > > > > > As just maintained, there is already a CI stage of FGSM, > > > and I > > > > >> will > > > > >> > > do > > > > >> > > > > more > > > > >> > > > > > review of unit tests for DSM. > > > > >> > > > > > > > > > >> > > > > > 2. I'm worried that many functions are not independent > > and > > > it > > > > >> is > > > > >> > > > > difficult > > > > >> > > > > >> to migrate step-by-step. You can list the relationship > > > between > > > > >> > them > > > > >> > > in > > > > >> > > > > >> detail. > > > > >> > > > > > > > > > >> > > > > > As Xintong saied the DSM is a subset of FGSM by design. > > > But as > > > > >> > time > > > > >> > > > goes > > > > >> > > > > > on, FGSM has some lacking > > > > >> > > > > > functions as I listed in this FLIP. And I have added the > > > > >> comparison > > > > >> > > > > between > > > > >> > > > > > DSM and FGSM in this FLIP. > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > Thanks again for all your thoughts. Any feedback is > > > appreciated! > > > > >> > > > > > > > > > >> > > > > > Best, > > > > >> > > > > > Weihua > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > On Wed, Mar 1, 2023 at 2:17 PM Xintong Song < > > > > >> tonysong...@gmail.com > > > > >> > > > > > > >> > > > > wrote: > > > > >> > > > > > > > > > >> > > > > >> Thanks Weihua for preparing this FLIP. +1 for the > > proposal. > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> As one of the contributors of the fine-grained slot > > > manager, > > > > >> I'd > > > > >> > > like > > > > >> > > > to > > > > >> > > > > >> share some backgrounds here. > > > > >> > > > > >> > > > > >> > > > > >> - There used to be a defaut slot manager > implementation, > > > which > > > > >> is > > > > >> > > > > >> non-declarative and has been removed now. The two > > features, > > > > >> > > > declarative > > > > >> > > > > / > > > > >> > > > > >> reactive resource management and fine-grained resource > > > > >> management, > > > > >> > > > were > > > > >> > > > > >> proposed at about the same time. We were aware that by > > > design > > > > >> the > > > > >> > > > > >> declarative slot manager is a subset of fine-grained > slot > > > > >> manager > > > > >> > at > > > > >> > > > > that > > > > >> > > > > >> time, but still decided to implement two slot managers > > for > > > the > > > > >> > > purpose > > > > >> > > > > of > > > > >> > > > > >> decoupling efforts and reducing cross-team > > synchronization > > > > >> > overhead. > > > > >> > > > > >> Merging the two slot managers once they are proved > stable > > > is > > > > >> IMO a > > > > >> > > > > >> technical debt that was planned at the very beginning. > > > > >> > > > > >> > > > > >> > > > > >> - The FineGrainedSlotManager has been verified in > > Alibaba's > > > > >> > internal > > > > >> > > > > >> production as well as Alibaba Cloud services as the > > default > > > > >> slot > > > > >> > > > manager > > > > >> > > > > >> for about 2 years. > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> Concerning test cases, we currently have a ci stage for > > > fine > > > > >> > grained > > > > >> > > > > >> resource management. To avoid adding too much burden, > the > > > stage > > > > >> > only > > > > >> > > > > >> includes tests from flink-runtime and flink-test > > modules. I > > > > >> think > > > > >> > > > > switching > > > > >> > > > > >> the default slot manager and applying the whole set of > > > tests on > > > > >> > the > > > > >> > > > > >> fine-grained slot manager would help us to be more > > > confident > > > > >> about > > > > >> > > it. > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> Concerning cutting out functionalities out of slot > > > manager, I > > > > >> > think > > > > >> > > > > Yangze > > > > >> > > > > >> and I have tried our best to shape the > > > FineGrainedSlotManager > > > > >> into > > > > >> > > > > >> reasonable components. I personally don't have other > > ideas > > > to > > > > >> > > further > > > > >> > > > > >> disassemble the component, but I'm open to such > > > suggestions. > > > > >> > > However, > > > > >> > > > > from > > > > >> > > > > >> the stability perspective, I'd be in favor of not > > > introducing > > > > >> > > > > significant > > > > >> > > > > >> changes to the FineGrainedSlotManager while switching > it > > > to the > > > > >> > > > default. > > > > >> > > > > >> Because the current implementation has already been > > > verified > > > > >> (or > > > > >> > at > > > > >> > > > > least > > > > >> > > > > >> partially verified because Alibaba does not cover all > the > > > Flink > > > > >> > use > > > > >> > > > > cases), > > > > >> > > > > >> and introducing more changes also means more chances of > > > > >> breaking > > > > >> > > > things. > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> Best, > > > > >> > > > > >> > > > > >> > > > > >> Xintong > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> On Wed, Mar 1, 2023 at 11:12 AM Shammon FY < > > > zjur...@gmail.com> > > > > >> > > wrote: > > > > >> > > > > >> > > > > >> > > > > >> > Hi > > > > >> > > > > >> > > > > > >> > > > > >> > Thanks for starting this work weihua, I think > unifying > > > > >> > > > > >> > DeclarativeSlotManager and FineGrainedSlotManager is > > > > >> valuable. > > > > >> > > > > >> > > > > > >> > > > > >> > I agree with @Matthias and @John that we need a way > to > > > ensure > > > > >> > that > > > > >> > > > > >> > DeclarativeSlotManager's capabilities are fully > covered > > > by > > > > >> > > > > >> > FineGrainedSlotManager > > > > >> > > > > >> > > > > > >> > > > > >> > 1. For their functional differences, can you give > some > > > > >> detailed > > > > >> > > > tests > > > > >> > > > > to > > > > >> > > > > >> > verify that the new FineGrainedSlotManager has these > > > > >> > capabilities? > > > > >> > > > > This > > > > >> > > > > >> can > > > > >> > > > > >> > effectively verify the new functions > > > > >> > > > > >> > > > > > >> > > > > >> > 2. I'm worried that many functions are not > independent > > > and > > > > >> it is > > > > >> > > > > >> difficult > > > > >> > > > > >> > to migrate step-by-step. You can list the > relationship > > > > >> between > > > > >> > > them > > > > >> > > > in > > > > >> > > > > >> > detail. > > > > >> > > > > >> > > > > > >> > > > > >> > 3. As John mentioned, give a smoke test for > > > > >> > FineGrainedSlotManager > > > > >> > > > is > > > > >> > > > > a > > > > >> > > > > >> > good idea. Or you can add some test information to > the > > > > >> > > > > >> > DeclarativeSlotManager to determine how many tests > have > > > used > > > > >> it. > > > > >> > > In > > > > >> > > > > this > > > > >> > > > > >> > way, we can gradually construct test cases for > > > > >> > > > FineGrainedSlotManager > > > > >> > > > > >> > during the development process. > > > > >> > > > > >> > > > > > >> > > > > >> > > > > > >> > > > > >> > Best, > > > > >> > > > > >> > Shammon > > > > >> > > > > >> > > > > > >> > > > > >> > > > > > >> > > > > >> > On Tue, Feb 28, 2023 at 10:22 PM John Roesler < > > > > >> > > vvcep...@apache.org> > > > > >> > > > > >> wrote: > > > > >> > > > > >> > > > > > >> > > > > >> > > Thanks for the FLIP, Weihua! > > > > >> > > > > >> > > > > > > >> > > > > >> > > I’ve read the FLIP, and it sounds good to me. We > need > > > to > > > > >> avoid > > > > >> > > > > >> > > proliferating alternative implementations wherever > > > > >> possible. I > > > > >> > > > have > > > > >> > > > > >> just > > > > >> > > > > >> > a > > > > >> > > > > >> > > couple of comments: > > > > >> > > > > >> > > > > > > >> > > > > >> > > 1. I share Matthias’s concern about ensuring the > > > behavior > > > > >> is > > > > >> > > > really > > > > >> > > > > the > > > > >> > > > > >> > > same. One suggestion I’ve used for this kind of > thing > > > is, > > > > >> as a > > > > >> > > > smoke > > > > >> > > > > >> > test, > > > > >> > > > > >> > > to update the DeclarativeSlotManager to just > delegate > > > to > > > > >> the > > > > >> > > > > >> > > FineGrainedSlotManager. If the full test suite > still > > > > >> passes, > > > > >> > you > > > > >> > > > > can be > > > > >> > > > > >> > > pretty sure the new default is really ok. It would > > not > > > be a > > > > >> > good > > > > >> > > > > idea > > > > >> > > > > >> to > > > > >> > > > > >> > > actually keep that in for the release, since it > would > > > > >> remove > > > > >> > the > > > > >> > > > > option > > > > >> > > > > >> > to > > > > >> > > > > >> > > fall back in case of bugs. Either way, we need to > > make > > > sure > > > > >> > all > > > > >> > > > test > > > > >> > > > > >> > > scenarios are present for the FGSM. > > > > >> > > > > >> > > > > > > >> > > > > >> > > 4. In addition to changing the default, would it > make > > > > >> sense to > > > > >> > > > log a > > > > >> > > > > >> > > deprecation warning on initialization if the > > > > >> > > > DeclarativeSlotManager > > > > >> > > > > is > > > > >> > > > > >> > used? > > > > >> > > > > >> > > > > > > >> > > > > >> > > Thanks again, > > > > >> > > > > >> > > John > > > > >> > > > > >> > > > > > > >> > > > > >> > > On Tue, Feb 28, 2023, at 07:20, Matthias Pohl > wrote: > > > > >> > > > > >> > > > Hi Weihua, > > > > >> > > > > >> > > > Thanks for your proposal. From a conceptual > point: > > > AFAIU, > > > > >> > the > > > > >> > > > > >> > > > DeclarativeSlotManager covers a subset (i.e. only > > > evenly > > > > >> > sized > > > > >> > > > > slots) > > > > >> > > > > >> > of > > > > >> > > > > >> > > > what the FineGrainedSlotManager should be able to > > > achieve > > > > >> > > > > (variable > > > > >> > > > > >> > slot > > > > >> > > > > >> > > > size per task manager). Is this the right > > > > >> > > > > assumption/understanding? > > > > >> > > > > >> In > > > > >> > > > > >> > > this > > > > >> > > > > >> > > > sense, merging both implementations into a single > > one > > > > >> sounds > > > > >> > > > > good. A > > > > >> > > > > >> > few > > > > >> > > > > >> > > > more general comments, though: > > > > >> > > > > >> > > > > > > > >> > > > > >> > > > 1. Did you do a proper test coverage analysis? > > > That's not > > > > >> > > > > mentioned > > > > >> > > > > >> in > > > > >> > > > > >> > > the > > > > >> > > > > >> > > > current version of the FLIP. I'm bringing this up > > > > >> because we > > > > >> > > ran > > > > >> > > > > into > > > > >> > > > > >> > the > > > > >> > > > > >> > > > same issue when fixing the flaws that popped up > > after > > > > >> > > > introducing > > > > >> > > > > the > > > > >> > > > > >> > > > multi-component leader election (see FLIP-285 > [1]). > > > There > > > > >> > is a > > > > >> > > > > risk > > > > >> > > > > >> > that > > > > >> > > > > >> > > by > > > > >> > > > > >> > > > removing the legacy code we decrease test > coverage > > > > >> because > > > > >> > > > certain > > > > >> > > > > >> > > > test cases that were covered for the legacy > classes > > > might > > > > >> > not > > > > >> > > be > > > > >> > > > > >> > > > necessarily covered in the new implementation, > yet > > > (see > > > > >> > > > > FLINK-30338 > > > > >> > > > > >> [2] > > > > >> > > > > >> > > > which covers this issue for the leader election > > > case). > > > > >> > > Ideally, > > > > >> > > > we > > > > >> > > > > >> > don't > > > > >> > > > > >> > > > want to remove test cases accidentally because > they > > > were > > > > >> > only > > > > >> > > > > >> > implemented > > > > >> > > > > >> > > > for the DeclarativeSlotManager but missed for the > > > > >> > > > > >> > FineGrainedSlotManager. > > > > >> > > > > >> > > > > > > > >> > > > > >> > > > 2. DeclarativeSlotManager and > > FineGrainedSlotManager > > > feel > > > > >> > > quite > > > > >> > > > > big > > > > >> > > > > >> in > > > > >> > > > > >> > > > terms of lines of code. Without knowing whether > > it's > > > > >> > actually > > > > >> > > a > > > > >> > > > > >> > > reasonable > > > > >> > > > > >> > > > thing to do: Instead of just adding more features > > to > > > the > > > > >> > > > > >> > > > FineGrainedSlotManager, have you thought of > cutting > > > out > > > > >> > > > > functionality > > > > >> > > > > >> > > into > > > > >> > > > > >> > > > smaller sub-components along this refactoring? > > Such a > > > > >> > > > step-by-step > > > > >> > > > > >> > > approach > > > > >> > > > > >> > > > might improve the overall codebase and might make > > > > >> reviewing > > > > >> > > the > > > > >> > > > > >> > > refactoring > > > > >> > > > > >> > > > easier. I did a first pass over the code and > > > struggled to > > > > >> > > > identify > > > > >> > > > > >> code > > > > >> > > > > >> > > > blocks that could be moved out of the SlotManager > > > > >> > > > > implementation(s). > > > > >> > > > > >> > > > Therefore, I might be wrong with this proposal. I > > > haven't > > > > >> > > worked > > > > >> > > > > on > > > > >> > > > > >> > this > > > > >> > > > > >> > > > codebase in that detail that it would allow me to > > > come up > > > > >> > > with a > > > > >> > > > > >> > > judgement > > > > >> > > > > >> > > > call. I wanted to bring it up, anyway, because > I'm > > > > >> curious > > > > >> > > > whether > > > > >> > > > > >> that > > > > >> > > > > >> > > > could be an option. There's a comment created by > > > Chesnay > > > > >> > > (CC'd) > > > > >> > > > in > > > > >> > > > > >> the > > > > >> > > > > >> > > > JavaDoc of TaskExecutorManager [3] indicating > > > something > > > > >> > > similar. > > > > >> > > > > I'm > > > > >> > > > > >> > > > wondering whether he can add some insights here. > > > > >> > > > > >> > > > > > > > >> > > > > >> > > > 3. For me personally, having a more detailed > > summary > > > > >> > comparing > > > > >> > > > the > > > > >> > > > > >> > > > subcomponents of both SlotManager implementations > > > with > > > > >> where > > > > >> > > > > >> > > > their functionality matches and where they differ > > > might > > > > >> help > > > > >> > > > > >> understand > > > > >> > > > > >> > > the > > > > >> > > > > >> > > > consequences of the changes proposed in FLIP-298. > > > > >> > > > > >> > > > > > > > >> > > > > >> > > > Best, > > > > >> > > > > >> > > > Matthias > > > > >> > > > > >> > > > > > > > >> > > > > >> > > > [1] > > > > >> > > > > >> > > > > > > > >> > > > > >> > > > > > > >> > > > > >> > > > > > >> > > > > >> > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-285%3A+Refactoring+LeaderElection+to+make+Flink+support+multi-component+leader+election+out-of-the-box > > > > >> > > > > >> > > > [2] > > > https://issues.apache.org/jira/browse/FLINK-30338 > > > > >> > > > > >> > > > [3] > > > > >> > > > > >> > > > > > > > >> > > > > >> > > > > > > >> > > > > >> > > > > > >> > > > > >> > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > > > https://github.com/apache/flink/blob/f611ea8cb5deddb42429df2c99f0c68d7382e9bd/flink-runtime/src/main/java/org/apache/flink/runtime/resourcemanager/slotmanager/TaskExecutorManager.java#L66-L68 > > > > >> > > > > >> > > > > > > > >> > > > > >> > > > On Tue, Feb 28, 2023 at 6:14 AM Matt Wang < > > > > >> wang...@163.com> > > > > >> > > > > wrote: > > > > >> > > > > >> > > > > > > > >> > > > > >> > > >> This is a good proposal for me, it will make the > > > code of > > > > >> > the > > > > >> > > > > >> > SlotManager > > > > >> > > > > >> > > >> more clear. > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > >> -- > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > >> Best, > > > > >> > > > > >> > > >> Matt Wang > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > >> ---- Replied Message ---- > > > > >> > > > > >> > > >> | From | David Morávek<d...@apache.org> | > > > > >> > > > > >> > > >> | Date | 02/27/2023 22:45 | > > > > >> > > > > >> > > >> | To | <dev@flink.apache.org> | > > > > >> > > > > >> > > >> | Subject | Re: [DISCUSS] FLIP-298: Unifying the > > > > >> > > Implementation > > > > >> > > > > of > > > > >> > > > > >> > > >> SlotManager | > > > > >> > > > > >> > > >> Hi Weihua, I still need to dig into the details, > > > but the > > > > >> > > > overall > > > > >> > > > > >> > > sentiment > > > > >> > > > > >> > > >> of this change sounds reasonable. > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > >> Best, > > > > >> > > > > >> > > >> D. > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > >> On Mon, Feb 27, 2023 at 2:26 PM Zhanghao Chen < > > > > >> > > > > >> > > zhanghao.c...@outlook.com> > > > > >> > > > > >> > > >> wrote: > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > >> Thanks for driving this topic. I think this FLIP > > > could > > > > >> help > > > > >> > > > > clean up > > > > >> > > > > >> > the > > > > >> > > > > >> > > >> codebase to make it easier to maintain. +1 on > it. > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > >> Best, > > > > >> > > > > >> > > >> Zhanghao Chen > > > > >> > > > > >> > > >> ________________________________ > > > > >> > > > > >> > > >> From: Weihua Hu <huweihua....@gmail.com> > > > > >> > > > > >> > > >> Sent: Monday, February 27, 2023 20:40 > > > > >> > > > > >> > > >> To: dev <dev@flink.apache.org> > > > > >> > > > > >> > > >> Subject: [DISCUSS] FLIP-298: Unifying the > > > > >> Implementation of > > > > >> > > > > >> > SlotManager > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > >> Hi everyone, > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > >> I would like to begin a discussion on FLIP-298: > > > Unifying > > > > >> > the > > > > >> > > > > >> > > Implementation > > > > >> > > > > >> > > >> of SlotManager[1]. There are currently two types > > of > > > > >> > > SlotManager > > > > >> > > > > in > > > > >> > > > > >> > > Flink: > > > > >> > > > > >> > > >> DeclarativeSlotManager and > FineGrainedSlotManager. > > > > >> > > > > >> > > FineGrainedSlotManager > > > > >> > > > > >> > > >> should behave as DeclarativeSlotManager if the > > user > > > does > > > > >> > not > > > > >> > > > > >> configure > > > > >> > > > > >> > > the > > > > >> > > > > >> > > >> slot request profile. > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > >> Therefore, this FLIP aims to unify the > > > implementation of > > > > >> > > > > SlotManager > > > > >> > > > > >> > in > > > > >> > > > > >> > > >> order to reduce maintenance costs. > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > >> Looking forward to hearing from you. > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > >> [1] > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > > > > > >> > > > > >> > > > > > >> > > > > >> > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-298%3A+Unifying+the+Implementation+of+SlotManager > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > >> Best, > > > > >> > > > > >> > > >> Weihua > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > >> > > > > >> > > > > >> > > > > > > >> > > > > >> > > > > > >> > > > > >> > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > > > > > > > >