[ https://issues.apache.org/jira/browse/FLINK-17598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Canbin Zheng updated FLINK-17598:
---------------------------------
    Description:
At the moment we use ZooKeeper as a distributed coordinator for implementing JobManager high availability services. But in cloud-native environments, more and more users prefer *Kubernetes* as the underlying scheduler backend and *object storage* as the storage medium, and neither of these services requires a ZooKeeper deployment.

Even so, in K8s setups people still have to deploy and maintain ZooKeeper clusters solely to avoid a JobManager SPOF. This ticket proposes a simplified FileSystem HA implementation with leader election removed, which saves the effort of deploying ZooKeeper.

To achieve this, we plan to
# Introduce a {{FileSystemHaServices}} which implements {{HighAvailabilityServices}}.
# Replace the Deployment with a StatefulSet to ensure *at most one* semantics, preventing potential concurrent access to the underlying FileSystem.

  was:
At the moment we use Zookeeper as a distributed coordinator for implementing JobManager high availability services. But in the cloud-native environment, there is a trend that more and more users prefer to use *Kubernetes* as the underlying scheduler backend while *Storage Object* as the Storage medium, both of these two services don't require Zookeeper deployment.

As a result, in the K8s setups, people have to deploy and maintain additional Zookeeper clusters for solving JobManager SPOF. This ticket proposes to provide a simplified FileSystem HA implementation with the leader-election removed, it saves the efforts of Zookeeper deployment and maintenance.

To achieve this, we plan to
# Introduce the {{FileSystemHaServices}} which implements the {{HighAvailabilityServices}}.
# Replace Deployment with StatefulSet to ensure *at most one* semantics to avoid potential concurrent access to the underlying FileSystem.

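To make the first planned step more concrete, here is a minimal sketch of what a file-backed HA store without leader election could look like. It is not Flink's {{HighAvailabilityServices}} interface and not the proposed implementation: the class {{FileSystemHaSketch}}, its methods, and the fixed session ID are hypothetical names chosen for illustration, and {{java.nio.file}} stands in for whatever FileSystem abstraction the real service would use. The sketch assumes that at most one JobManager runs at a time (the guarantee the second step is meant to provide) and that job IDs are safe to use as file names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.UUID;
import java.util.stream.Stream;

/** Hypothetical sketch only; not Flink's HighAvailabilityServices. */
final class FileSystemHaSketch {

    // Assumption: with at most one JobManager there is nothing to elect,
    // so a constant session ID replaces the one a leader election would produce.
    static final UUID FIXED_LEADER_SESSION_ID = new UUID(0L, 0L);

    private final Path haRoot;
    private final Path jobsDir;

    FileSystemHaSketch(Path haRoot) throws IOException {
        this.haRoot = Files.createDirectories(haRoot);
        this.jobsDir = Files.createDirectories(haRoot.resolve("jobs"));
    }

    /** Persists a serialized job so that a restarted JobManager can recover it. */
    void putJob(String jobId, byte[] serializedJob) throws IOException {
        // Write to a temporary file first, then rename it into place, so a crash
        // mid-write never leaves a partial entry in the jobs directory.
        Path tmp = Files.createTempFile(haRoot, "job-", ".tmp");
        Files.write(tmp, serializedJob);
        Files.move(tmp, jobsDir.resolve(jobId), StandardCopyOption.ATOMIC_MOVE);
    }

    /** Reads back the serialized job stored under the given ID. */
    byte[] getJob(String jobId) throws IOException {
        return Files.readAllBytes(jobsDir.resolve(jobId));
    }

    /** Removes a job once it has reached a terminal state. */
    void removeJob(String jobId) throws IOException {
        Files.deleteIfExists(jobsDir.resolve(jobId));
    }

    /** Lists the IDs of all persisted jobs, sorted for deterministic recovery order. */
    List<String> recoverJobIds() throws IOException {
        List<String> ids = new ArrayList<>();
        try (Stream<Path> files = Files.list(jobsDir)) {
            files.forEach(p -> ids.add(p.getFileName().toString()));
        }
        Collections.sort(ids);
        return ids;
    }
}
```

Nothing in this sketch guards against two writers. That is the point of the second step: the store is only safe if the deployment itself guarantees a single JobManager. Note also that the rename trick relies on a file system with atomic rename, which many object stores do not offer, so a real implementation would need a different write protocol there.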
> Implement FileSystemHAServices for native K8s setups
> ----------------------------------------------------
>
>                 Key: FLINK-17598
>                 URL: https://issues.apache.org/jira/browse/FLINK-17598
>             Project: Flink
>          Issue Type: New Feature
>          Components: Deployment / Kubernetes, Runtime / Coordination
>            Reporter: Canbin Zheng
>            Priority: Major
>
> At the moment we use ZooKeeper as a distributed coordinator for implementing
> JobManager high availability services. But in cloud-native environments,
> more and more users prefer *Kubernetes* as the underlying scheduler backend
> and *object storage* as the storage medium, and neither of these services
> requires a ZooKeeper deployment.
> Even so, in K8s setups people still have to deploy and maintain ZooKeeper
> clusters solely to avoid a JobManager SPOF. This ticket proposes a
> simplified FileSystem HA implementation with leader election removed, which
> saves the effort of deploying ZooKeeper.
> To achieve this, we plan to
> # Introduce a {{FileSystemHaServices}} which implements
> {{HighAvailabilityServices}}.
> # Replace the Deployment with a StatefulSet to ensure *at most one*
> semantics, preventing potential concurrent access to the underlying
> FileSystem.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)