I'd like to get your opinion on this idea. I found the related JIRA issue FLINK-2366, but it seems to be dead. To get your attention, I am copying my comment here.
As an experiment, I've implemented Flink HA on top of Consul. The implementation works fine in the "lab" but is not battle-tested yet. The source code is available at https://github.com/kbialek/flink/tree/feature/consul (flink-runtime, package org.apache.flink.runtime.consul).

Why? Generally, I'd like to keep as few moving parts as possible. We do not have Zookeeper running, but Consul is already in place. And in the end, freedom of choice is a good thing.

It would be great to see built-in Consul support in Flink someday. If that is not expected, I suggest a small refactoring that makes the HighAvailabilityServicesFactory configurable. As far as I can see, this should be enough to inject any HA implementation.

Regards,
Krzysztof
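
P.S. To make the refactoring suggestion more concrete, here is a rough sketch of what a configurable factory could look like. It is only an illustration under my own assumptions: the interfaces `HaServices` and `HaServicesFactory`, the loader class, and the config keys `high-availability.factory.class` and `consul.address` are all hypothetical stand-ins, not Flink's actual classes or options. The idea is simply that the factory class name is read from the configuration and instantiated by reflection.

```java
import java.util.Map;

// Hypothetical stand-in for the HA services, NOT Flink's real interface.
interface HaServices {
    String backendName();
}

// Hypothetical contract that a pluggable HA backend would implement.
interface HaServicesFactory {
    HaServices create(Map<String, String> config);
}

// Example backend. A real one would connect to Consul; this one only
// returns a label so the loading mechanism can be shown in isolation.
class ConsulHaServicesFactory implements HaServicesFactory {
    @Override
    public HaServices create(Map<String, String> config) {
        String address = config.getOrDefault("consul.address", "localhost:8500");
        return () -> "consul@" + address;
    }
}

final class HaServicesLoader {
    // Hypothetical config key naming the factory implementation class.
    static final String FACTORY_KEY = "high-availability.factory.class";

    private HaServicesLoader() {}

    // Reads the factory class name from the config, instantiates it through
    // its no-argument constructor, and asks it to build the HA services.
    static HaServices load(Map<String, String> config) {
        String className = config.get(FACTORY_KEY);
        if (className == null) {
            throw new IllegalArgumentException("Missing config key: " + FACTORY_KEY);
        }
        try {
            Class<? extends HaServicesFactory> factoryClass =
                    Class.forName(className).asSubclass(HaServicesFactory.class);
            return factoryClass.getDeclaredConstructor().newInstance().create(config);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate HA factory " + className, e);
        }
    }
}
```

With something like this in place, switching the HA backend would only require putting the implementation on the classpath and naming its factory class in the configuration.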