xintongsong opened a new pull request #11320: [FLINK-16437] Make SlotManager allocate resource from ResourceManager at the worker granularity.
URL: https://github.com/apache/flink/pull/11320

## What is the purpose of the change

This is the first step of FLINK-14106. It includes all the major changes inside SlotManager and the changes to the RM/SM interfaces, except the changes for metrics and status.

At the end of this step, SlotManager should allocate resources from ResourceManager with a WorkerResourceSpec, instead of a slot ResourceProfile.

At this step, the WorkerResourceSpec is not yet used, and the active RMs always use `ActiveResourceManager#taskExecutorProcessSpec` for requesting TMs. We will change that in subsequent steps.

## Brief change log

- e42d87754e15c3baed9258f29d91ff9719283759..82f9a631caadd9033f30cbc9229f1e6e0facfd72: Minor code clean-ups.
- 14881da7d062e986e20ac548249a42457dd4f7f0: Introduce `WorkerResourceSpec`.
- c3778a58f6d9440e3df540a19e69019734f38d9e: Create `SlotManagerImpl` with default `WorkerResourceSpec` in active resource manager setups.
- 66f4d52b8d767d053e72cbf25da51d5831e87b78: Create SlotManagerImpl with default numSlotsPerWorker.
- cd2b13e45b4cd84f7688a20b26142a382f56423a: Compute pending slot profiles inside SlotManager when allocating resource.
  - This also means that unfulfillable slot profiles are checked inside SlotManager.
- 484543c14aa9ce5a0d510e131ff6d354f4eacf25: ResourceManager retrieve a collection of pending workers from SlotManager, instead of number of pending slots.
- 76829127134c8f10fdf7a87097638915e2859b2e: Remove numSlotsPerTaskManager from ActiveResourceManager and ContaineredTaskManagerParameters.
  - Since pending slot profiles are now computed inside SlotManager, ResourceManagers no longer need to be aware of the number of slots per worker.
- cb785eb68e49927f24ada564772be2aa8e2a9cfd: SlotManager allocate resource from ResourceManager with WorkerRequest instead of ResourceProfile.
  - From this commit on, ResourceManager is no longer aware of slot ResourceProfiles.

## Verifying this change

This is a refactoring. Most of the behavior is already covered by existing test cases. Only a few new test cases were added, for the new data structures and logic.

- Add `WorkerResourceSpecTest`.
- Add test case in `TaskExecutorProcessUtilsTest`.
- Add test case in `SlotManagerImplTest`.

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
- The serializers: (no)
- The runtime per-record code paths (performance sensitive): (no)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes)
- The S3 file system connector: (no)

## Documentation

- Does this pull request introduce a new feature? (no)
- If yes, how is the feature documented? (not applicable)
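
## Illustrative sketch

The change-log items about computing pending slot profiles and checking unfulfillable slot profiles inside SlotManager can be illustrated roughly as follows. This is a minimal sketch and not the code in this PR: `WorkerSpec`, `SlotProfile`, `pendingSlotsFor`, and `isFulfillable` are hypothetical, simplified stand-ins for the Flink classes, and the even split of a worker's resources across its slots is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PendingSlotSketch {

    /** Hypothetical, simplified slot profile: CPU cores and memory in MB only. */
    static final class SlotProfile {
        final double cpuCores;
        final int memoryMb;

        SlotProfile(double cpuCores, int memoryMb) {
            this.cpuCores = cpuCores;
            this.memoryMb = memoryMb;
        }

        /** True if this slot has at least the resources that 'required' asks for. */
        boolean canFulfill(SlotProfile required) {
            return cpuCores >= required.cpuCores && memoryMb >= required.memoryMb;
        }
    }

    /** Hypothetical, simplified worker-level spec (not the real WorkerResourceSpec). */
    static final class WorkerSpec {
        final double cpuCores;
        final int memoryMb;

        WorkerSpec(double cpuCores, int memoryMb) {
            this.cpuCores = cpuCores;
            this.memoryMb = memoryMb;
        }
    }

    /**
     * Derives the slot profiles a not-yet-started worker would offer.
     * Assumption: resources are split evenly across the worker's slots.
     */
    static List<SlotProfile> pendingSlotsFor(WorkerSpec spec, int numSlotsPerWorker) {
        if (numSlotsPerWorker <= 0) {
            throw new IllegalArgumentException("numSlotsPerWorker must be positive");
        }
        SlotProfile perSlot = new SlotProfile(
                spec.cpuCores / numSlotsPerWorker,
                spec.memoryMb / numSlotsPerWorker);
        return new ArrayList<>(Collections.nCopies(numSlotsPerWorker, perSlot));
    }

    /**
     * A requested profile is unfulfillable if even a slot of a fresh worker
     * could not satisfy it; the slot manager side can decide this on its own.
     */
    static boolean isFulfillable(SlotProfile required, WorkerSpec spec, int numSlotsPerWorker) {
        return pendingSlotsFor(spec, numSlotsPerWorker).get(0).canFulfill(required);
    }

    public static void main(String[] args) {
        WorkerSpec spec = new WorkerSpec(4.0, 4096);
        System.out.println(pendingSlotsFor(spec, 4).size());
        System.out.println(isFulfillable(new SlotProfile(1.0, 1024), spec, 4));
        System.out.println(isFulfillable(new SlotProfile(2.0, 512), spec, 4));
    }
}
```

In this sketch the resource manager side only needs to be handed a `WorkerSpec`; the number of slots per worker and the per-slot profiles stay on the slot manager side, which mirrors the division of responsibility described in the change log.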