xintongsong commented on a change in pull request #11353: [FLINK-16438][yarn] Make YarnResourceManager starts workers using WorkerResourceSpec requested by SlotManager
URL: https://github.com/apache/flink/pull/11353#discussion_r403840142
##########
File path: flink-yarn/src/main/java/org/apache/flink/yarn/configuration/YarnConfigOptionsInternal.java
##########
@@ -34,4 +37,24 @@
 			.stringType()
 			.noDefaultValue()
 			.withDescription("**DO NOT USE** The location of the log config file, e.g. the path to your log4j.properties for log4j.");
+
+	/**
+	 * **DO NO USE** Whether {@link YarnResourceManager} should match the vcores of allocated containers with those requested.
+	 *
+	 * <p>By default, Yarn ignores vcores in the container requests, and always allocate 1 vcore for each container.
+	 * Iff 'yarn.scheduler.capacity.resource-calculator' is set to 'DominantResourceCalculator' for Yarn, will it
+	 * allocate container vcores as requested. Unfortunately, this configuration option is dedicated for Yarn Scheduler,
+	 * and is only accessible to applications in Hadoop 2.6+.
+	 *
+	 * <p>ATM, it should be fine to not match vcores, because with the current {@link SlotManagerImpl} all the TM
+	 * containers should have the same resources.
+	 *
+	 * <p>If later we add another {@link SlotManager} implementation that may have TMs with different resources, we can
+	 * switch this option on only for the new SM, and the new SM can also be available on Hadoop 2.6+ only.
+	 */
+	public static final ConfigOption<Boolean> MATCH_CONTAINER_VCORES =
+		key("$internal.yarn.resourcemanager.enable-vcore-matching")
+			.booleanType()
+			.defaultValue(false)
+			.withDescription("**DO NOT USE** Whether YarnResourceManager should match the container vcores.");

Review comment:
For the time being, yes. Starting from 2.6.x, Hadoop supports reading this configuration programmatically from `RegisterApplicationMasterResponse`, so on those versions we would not need users to configure it manually. But I believe the lowest Hadoop version Flink supports is 2.4.x, so we do not have a good way other than having users configure it. This option should only affect the cases with dynamic worker resources.
If the option is not set on a Yarn cluster that allocates vcores as requested, then workers with different CPU but the same memory may not be scheduled correctly. E.g., if Flink wants to start task executor `t1` in a container with resources `<1GB, 1 vcore>`, and `t2` in a container with resources `<1GB, 2 vcore>`, the actual resources available to `t1` might be `<1GB, 2 vcore>` and to `t2` might be `<1GB, 1 vcore>`.
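
To make the `t1`/`t2` case concrete, here is a minimal, self-contained sketch of the mix-up. It is not the actual `YarnResourceManager` code: the `Resource` class and the `assign` and `findMatchingRequest` helpers are hypothetical stand-ins, and the sketch simply assumes that allocated containers are paired with pending requests in arrival order, either by memory only or by memory plus vcores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class VcoreMatchingSketch {

    /** Hypothetical stand-in for a container resource: memory in MB plus vcores. */
    public static final class Resource {
        public final int memoryMb;
        public final int vcores;

        public Resource(int memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }

        @Override
        public String toString() {
            return "<" + memoryMb + "MB, " + vcores + " vcore>";
        }
    }

    /** Returns the index of the first pending request that the container satisfies, or -1. */
    static int findMatchingRequest(List<Resource> pending, Resource container, boolean matchVcores) {
        for (int i = 0; i < pending.size(); i++) {
            Resource request = pending.get(i);
            boolean memoryMatches = request.memoryMb == container.memoryMb;
            // With vcore matching disabled, any vcore count is accepted.
            boolean vcoresMatch = !matchVcores || request.vcores == container.vcores;
            if (memoryMatches && vcoresMatch) {
                return i;
            }
        }
        return -1;
    }

    /**
     * Pairs each allocated container (in arrival order) with a pending request.
     * Element i of the result is the request served by container i, or null if none matched.
     */
    public static List<Resource> assign(List<Resource> requests, List<Resource> containers, boolean matchVcores) {
        List<Resource> pending = new ArrayList<>(requests);
        List<Resource> served = new ArrayList<>();
        for (Resource container : containers) {
            int idx = findMatchingRequest(pending, container, matchVcores);
            served.add(idx >= 0 ? pending.remove(idx) : null);
        }
        return served;
    }

    public static void main(String[] args) {
        // t1 asks for <1GB, 1 vcore>, t2 asks for <1GB, 2 vcore>.
        List<Resource> requests = Arrays.asList(new Resource(1024, 1), new Resource(1024, 2));
        // Assumed arrival order: the 2-vcore container happens to come back first.
        List<Resource> containers = Arrays.asList(new Resource(1024, 2), new Resource(1024, 1));

        for (boolean matchVcores : new boolean[] {false, true}) {
            List<Resource> served = assign(requests, containers, matchVcores);
            for (int i = 0; i < containers.size(); i++) {
                System.out.println("matchVcores=" + matchVcores
                        + ": container " + containers.get(i) + " serves request " + served.get(i));
            }
        }
    }
}
```

With matching off, the 2-vcore container is handed to the request for 1 vcore (`t1`) and the 1-vcore container to the request for 2 vcores (`t2`), which is the swap described above. With matching on, each container goes to the request whose vcores it actually satisfies.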