HuangZhenQiu commented on a change in pull request #8952:
URL: https://github.com/apache/flink/pull/8952#discussion_r547734136
##########
File path: flink-core/src/main/java/org/apache/flink/configuration/ResourceManagerOptions.java
##########
@@ -67,6 +67,33 @@
             "for streaming workloads, which may fail if there are not enough slots. Note that this configuration option does not take " +
             "effect for standalone clusters, where how many slots are allocated is not controlled by Flink.");
 
+    /**
+     * Defines the maximum number of worker (YARN / Mesos) failures per minute before rejecting subsequent worker
+     * requests until the failure rate falls below the maximum. It is to quickly catch external dependency caused
+     * workers failure and wait for retry interval before sending new request. Be default, -1.0 is set to disable the feature.
+     */
+    public static final ConfigOption<Double> MAXIMUM_WORKERS_FAILURE_RATE = ConfigOptions
+        .key("resourcemanager.maximum-workers-failure-rate")
+        .doubleType()
+        .defaultValue(-1.0)

Review comment:
       It was the original value before combining the logic of the retry interval. I agree that a reasonable value such as 10/min should be the default value.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
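
For readers following the thread, the behaviour described in the Javadoc above (reject new worker requests while failures per minute exceed a maximum, with a negative value disabling the check) might be sketched roughly as follows. This is not the code from the pull request: the class `FailureRateSketch`, its method names, and the choice of a sliding 60-second window are all assumptions made for illustration.

```java
import java.util.ArrayDeque;

// Hypothetical illustration only; not the implementation in PR #8952.
class FailureRateSketch {
    // Assumed window length: "per minute" is read as a sliding 60-second window.
    private static final long WINDOW_MS = 60_000L;

    // A negative maximum (for example -1.0) disables the check.
    private final double maxFailuresPerMinute;

    // Timestamps (in milliseconds) of recorded failures, oldest first.
    private final ArrayDeque<Long> failureTimestampsMs = new ArrayDeque<>();

    FailureRateSketch(double maxFailuresPerMinute) {
        this.maxFailuresPerMinute = maxFailuresPerMinute;
    }

    // Records one worker failure observed at the given time.
    void recordFailure(long nowMs) {
        failureTimestampsMs.addLast(nowMs);
    }

    // Returns true if new worker requests should be rejected at the given time.
    boolean exceedsMaximum(long nowMs) {
        if (maxFailuresPerMinute < 0) {
            return false; // feature disabled
        }
        // Drop failures that are a full window old or older.
        while (!failureTimestampsMs.isEmpty()
                && nowMs - failureTimestampsMs.peekFirst() >= WINDOW_MS) {
            failureTimestampsMs.removeFirst();
        }
        return failureTimestampsMs.size() > maxFailuresPerMinute;
    }
}
```

With a maximum of 10.0 (the 10/min figure mentioned in the review comment), ten failures inside one window would still be accepted under this sketch, and the eleventh would cause requests to be rejected until enough old failures age out of the window.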