[ https://issues.apache.org/jira/browse/FLINK-16194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17041491#comment-17041491 ]
Canbin Zheng edited comment on FLINK-16194 at 2/21/20 3:27 AM:
---------------------------------------------------------------

This ticket does not introduce new features. We simply hope to bring in a better architecture that facilitates rapid development of this submodule in the long run. Personally, I prefer keeping up closely with the community instead of maintaining an internal version. Besides that, my colleague [~tison] would like to work together on this issue.

Hi [~fly_in_gis], if you already have a similar internal version with a better architecture, I could help review the design and the code; if not, we can work together on this issue.

Hi [~trohrmann], I have started a discussion thread [1] on the dev list for this issue and look forward to your feedback.

[1] [discussion thread|https://lists.apache.org/x/thread.html/r9076ab09076ab79cc010defcb61e13ad3e24409f62d6666f512bd873@%3Cdev.flink.apache.org%3E]

was (Author: felixzheng):

This ticket does not introduce new features. We simply hope to bring in a better architecture that facilitates rapid development of this submodule in the long run. Personally, I prefer keeping up closely with the community instead of maintaining an internal version. Besides that, my colleague [~tison] would like to work together on this issue.

Hi [~fly_in_gis], if you already have a similar internal version with a better architecture, I could help review the design and the code; if not, we can work together on this issue.

Hi [~trohrmann], I have started a discussion thread [1] on the dev list for this issue and look forward to your feedback.

[1] https://lists.apache.org/x/thread.html/r9076ab09076ab79cc010defcb61e13ad3e24409f62d6666f512bd873@%3Cdev.flink.apache.org%3E

> Refactor the Kubernetes architecture design
> -------------------------------------------
>
> Key: FLINK-16194
> URL: https://issues.apache.org/jira/browse/FLINK-16194
> Project: Flink
> Issue Type: Improvement
> Components: Deployment / Kubernetes
> Affects Versions: 1.10.0
> Reporter: Canbin Zheng
> Priority: Critical
> Fix For: 1.11.0
>
>
> So far, Flink has made efforts toward the native integration of Kubernetes.
> However, it is always essential to evaluate the existing design and consider
> alternatives that are better designed and easier to maintain in the long
> run. We have run into some problems while developing new features based on
> the current code. Here are some of them:
> # We don’t have a unified monadic-step based orchestrator architecture to
> construct all the Kubernetes resources.
> ** There are inconsistencies between the orchestrator architecture that the
> client uses to create the Kubernetes resources and the orchestrator
> architecture that the master uses to create Pods. This confuses new
> contributors, who face the cognitive burden of understanding two
> architectural philosophies instead of one. It also makes maintenance and new
> feature development quite challenging.
> ** Pod construction is done in one step. With the introduction of new
> features for the Pod, the construction process could become far more
> complicated, and the functionality of a single class could explode, which
> hurts code readability, writability, and testability.
> At the moment, we have encountered such challenges and realized that it is
> not easy to develop new features related to the Pod.
> ** The implementation of a specific feature is usually scattered across
> multiple decoration classes. For example, the current design uses a
> decoration class chain that contains five Decorator classes to mount a
> configuration file to the Pod. If people would like to add support for other
> configuration files, such as Hadoop configuration or Keytab files, they have
> no choice but to repeat the same tedious and scattered process.
> # We don’t have dedicated objects or tools for centrally parsing, verifying,
> and managing the Kubernetes parameters, which has raised some maintenance and
> inconsistency issues.
> ** There is a lot of duplicated parsing and validation code, including the
> settings of Image, ImagePullPolicy, ClusterID, ConfDir, Labels, etc. This not
> only harms readability and testability but is also error-prone. Refer to
> issue FLINK-16025 for inconsistent parsing of the same parameter.
> ** The parameters are scattered, so some of the method signatures have to
> declare many unnecessary input parameters, such as
> FlinkMasterDeploymentDecorator#createJobManagerContainer.
>
> To solve these issues, we propose to:
> # Introduce a unified monadic-step based orchestrator architecture that has
> a better, cleaner, and consistent abstraction for the Kubernetes resources
> construction process.
> # Add some dedicated tools for centrally parsing, verifying, and managing
> the Kubernetes parameters.
>
> Refer to the design doc for the details; any feedback is welcome.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
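
For illustration only, the two proposals in the issue description could be sketched roughly as follows. This is a minimal sketch and not the design from the linked design doc: the names ClusterParameters, PodDraft, Step, MainContainerStep, and MountConfigStep, the configuration keys, and the default values are all hypothetical, and PodDraft is a plain stand-in rather than a real Kubernetes client object. The idea shown is that every parameter is parsed and validated in one place, and each feature lives in one small step that is chained with the others.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StepOrchestratorSketch {

    /** Hypothetical central holder: each setting is parsed and validated in exactly one place. */
    public static final class ClusterParameters {
        private final Map<String, String> config;

        public ClusterParameters(Map<String, String> config) {
            this.config = new HashMap<>(config);
        }

        public String clusterId() {
            String id = config.get("cluster-id"); // hypothetical key
            if (id == null || id.trim().isEmpty()) {
                throw new IllegalArgumentException("cluster-id must be set");
            }
            return id.trim();
        }

        public String image() {
            return config.getOrDefault("image", "flink:latest"); // assumed default
        }

        public String confDir() {
            return config.getOrDefault("conf-dir", "/opt/flink/conf"); // assumed default
        }
    }

    /** Simplified stand-in for a Pod under construction. */
    public static final class PodDraft {
        public String name;
        public String image;
        public final List<String> volumeMounts = new ArrayList<>();
    }

    /** One step owns one feature and transforms the draft it receives. */
    public interface Step {
        PodDraft apply(PodDraft draft);
    }

    /** Sets the basic container properties from the central parameters. */
    public static final class MainContainerStep implements Step {
        private final ClusterParameters params;

        public MainContainerStep(ClusterParameters params) {
            this.params = params;
        }

        @Override
        public PodDraft apply(PodDraft draft) {
            draft.name = params.clusterId() + "-jobmanager";
            draft.image = params.image();
            return draft;
        }
    }

    /** Mounts the configuration directory; the whole feature lives in this one class. */
    public static final class MountConfigStep implements Step {
        private final ClusterParameters params;

        public MountConfigStep(ClusterParameters params) {
            this.params = params;
        }

        @Override
        public PodDraft apply(PodDraft draft) {
            draft.volumeMounts.add("flink-conf:" + params.confDir());
            return draft;
        }
    }

    /** The orchestrator applies the steps in order, starting from an empty draft. */
    public static PodDraft build(List<Step> steps) {
        PodDraft draft = new PodDraft();
        for (Step step : steps) {
            draft = step.apply(draft);
        }
        return draft;
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("cluster-id", "demo");
        ClusterParameters params = new ClusterParameters(config);

        PodDraft pod = build(Arrays.asList(
                new MainContainerStep(params),
                new MountConfigStep(params)));
        System.out.println(pod.name + " " + pod.image + " " + pod.volumeMounts);
    }
}
```

Under these assumptions, supporting another configuration file (for example a Hadoop configuration) would mean adding one more Step implementation to the list passed to build, and no step would need to re-parse raw configuration values or take them as extra method arguments.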