Hi devs,

Recently the community decided to exclude customization support for the new restart strategies [1], which prompted me to think about what kind of customization support a framework like Flink should provide.
The key idea is that pluggable is not the same as customizable. We might maintain a series of implementations of restart strategies, as well as of high-availability services, in our codebase. But that is a fixed set, which is definitely different from supporting arbitrary customization.

A service like high-availability services relies underneath on quite a lot of runtime implementation details. For example, JobGraphStore supports #releaseJobGraphStore originally because of the ZK lock strategy; getJobManagerRetriever requires a default address because StandaloneHighAvailabilityServices is non-HA and pre-configured. Interfaces of this kind, however, are likely to evolve with the Flink runtime implementation, such as cluster management and coordination details.

If we support customizing it, such an internal high-availability services interface becomes a public interface. If we keep it pluggable, we can extend it in reaction to runtime evolution while ensuring the implementations stay in a fixed set, and still introduce new implementations (such as etcd [2] or MapDB [3]) if they are a good fit.

We don't have customization support for ResourceManager, although it is pluggable, so that others can implement a Kubernetes resource manager [4]. Maybe this is a better way to handle high-availability services: pluggable, but not customizable.

Looking forward to your ideas. To be clear, I'm not trying to drop it now, but I'm a bit confused about this topic and would like to turn to the wisdom of our community.

Best,
tison.

[1] https://lists.apache.org/x/thread.html/6ed95eb6a91168dba09901e158bc1b6f4b08f1e176db4641f79de765@%3Cdev.flink.apache.org%3E
[2] https://issues.apache.org/jira/browse/FLINK-11105
[3] https://lists.apache.org/x/thread.html/eae4cbdf6dac466bc0247e3bc1a7a69fe7e1db7a512fcd607e9c081b@%3Cuser.flink.apache.org%3E
[4] https://github.com/tianchen92/flink/tree/k8s-master/flink-kubernete
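
P.S. To make the pluggable vs. customizable distinction concrete, here is a minimal Java sketch. It is only an illustration under my own assumptions: HaServices, HaMode, createPluggable and createCustomizable are hypothetical names invented for this mail, not Flink's actual classes or configuration handling. The pluggable factory chooses from a closed set that lives in the codebase, while the customizable factory accepts any class name, which is what would turn the interface into a public contract.

```java
import java.util.Locale;

public class PluggableVsCustomizable {

    /** Hypothetical stand-in for an internal service interface; not Flink's real API. */
    interface HaServices {
        String describe();
    }

    static final class StandaloneHaServices implements HaServices {
        @Override
        public String describe() {
            return "standalone";
        }
    }

    static final class ZooKeeperHaServices implements HaServices {
        @Override
        public String describe() {
            return "zookeeper";
        }
    }

    /** Pluggable: the set of implementations is closed and maintained in the codebase. */
    enum HaMode { NONE, ZOOKEEPER }

    static HaServices createPluggable(String configuredMode) {
        // valueOf throws IllegalArgumentException for anything outside the fixed set.
        HaMode mode = HaMode.valueOf(configuredMode.toUpperCase(Locale.ROOT));
        switch (mode) {
            case NONE:
                return new StandaloneHaServices();
            case ZOOKEEPER:
                return new ZooKeeperHaServices();
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }

    /** Customizable: any class on the classpath is accepted, so HaServices must stay stable. */
    static HaServices createCustomizable(String className) throws ReflectiveOperationException {
        Class<?> clazz = Class.forName(className);
        return (HaServices) clazz.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(createPluggable("zookeeper").describe());
        System.out.println(createCustomizable(ZooKeeperHaServices.class.getName()).describe());
    }
}
```

In the pluggable variant, adding something like an etcd-based implementation means adding an enum constant and a class inside the project, so the interface can still change together with the runtime. In the customizable variant, third-party classes compile against the interface, so every signature change becomes a compatibility question.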