It seems we could benefit a lot from the "Parameters Parser". Let's start adding the dedicated jobmanager/taskmanager config parser classes.
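To make the idea concrete, a minimal sketch of what such a dedicated parser class might look like (all class names, option keys, and defaults here are hypothetical illustrations, not the actual classes later added to Flink; a plain Map stands in for Flink's Configuration):

```java
import java.util.Map;

// Hypothetical sketch: one class that parses and verifies all
// jobmanager-related Kubernetes settings in a single place, instead of
// each decorator reading the raw configuration on its own.
class KubernetesJobManagerParameters {

    private final Map<String, String> config;

    KubernetesJobManagerParameters(Map<String, String> config) {
        this.config = config;
    }

    String getContainerImage() {
        String image = config.get("kubernetes.container.image");
        if (image == null || image.trim().isEmpty()) {
            // Fail early with a Flink-side message instead of waiting for
            // the Kubernetes API server to reject the Deployment.
            throw new IllegalArgumentException(
                    "The 'kubernetes.container.image' option must be set.");
        }
        return image;
    }

    int getJobManagerMemoryMb() {
        // Illustrative option key and default, not Flink's real ones.
        String raw = config.getOrDefault("jobmanager.memory.process.size.mb", "1024");
        int mb = Integer.parseInt(raw);
        if (mb <= 0) {
            throw new IllegalArgumentException(
                    "JobManager memory must be positive, got: " + mb);
        }
        return mb;
    }
}
```

Every decorator would then take such a parameters object instead of the raw configuration, which keeps the verification logic in one place and shrinks the decorators' signatures.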
Best,
Yang

Canbin Zheng <felixzhen...@gmail.com> wrote on Mon, Feb 24, 2020, 10:39 AM:

Hi, Yang Wang,

Thanks for the feedback.

> Parameters Parser

I can think of many benefits of dedicated parameter parsing/verifying tools; here are some of them:
1. Reuse the parameter parsing and verifying code.
2. Ensure consistent handling logic for the same setting.
3. Simplify some method signatures, because we can avoid defining too many unnecessary input parameters.

> BTW, the fabric8 kubernetes-client and the Kubernetes API server already check the parameters before starting to create the resources. I think the exceptions are usually meaningful and enough for the users to get the root cause.

1. On the one hand, there are gaps between the declarative model used by Kubernetes and the configuration model used by Flink. Indeed, the Kubernetes client/server helps check the Kubernetes-side parameters and throws exceptions in case of failures; however, the exception messages are not always easily understood from the perspective of Flink users. Independent of the solution, we should make sure that the exception information in the Flink configuration model is meaningful enough for Flink users.
2. On the other hand, some prechecking is beneficial in some scenarios, since we can avoid wasted effort on Kubernetes resource creation when a failure would force us to clean up all the created resources.

Regards,
Canbin Zheng

Yang Wang <danrtsey...@gmail.com> wrote on Sun, Feb 23, 2020, 10:37 PM:

> The newly introduced decorator

After some offline discussion with Canbin and tison, I totally understand the evolved decorator design. Each decorator will be self-contained and responsible for just one thing. Currently, if we want to mount a new config file to the jobmanager/taskmanager, then `ConfigMapDecorator`, `JobManagerDecorator`, and `TaskManagerDecorator` all need to be updated.
It is not convenient for new contributors to do this. In the new design, by leveraging the accompanying Kubernetes resources in a decorator, we could finish creating and mounting the config map in one decorator.

Since we currently have just a basic implementation of the native Kubernetes integration, lots of features still need to be developed, and many users want to participate in and contribute to the integration. So I agree to refactor the current decorator implementation and make it easier for new contributors.

For the detailed divergences (naming, etc.), I think we could discuss them in the PR.

> Parameters Parser

Currently, the decorators directly use the Flink configuration to get the parameters to build the Kubernetes resources. That is straightforward; however, we cannot have a unified parameter check. So I am not sure whether you will introduce a tool to check the parameters or simply have our own basic check.

BTW, the fabric8 kubernetes-client and the Kubernetes API server already check the parameters before starting to create the resources. I think the exceptions are usually meaningful and enough for the users to get the root cause.

Best,
Yang

felixzheng zheng <felixzhen...@gmail.com> wrote on Sat, Feb 22, 2020, 10:54 AM:

Great thanks for the quick feedback, Till. You are right; it is not a fundamentally different approach compared to what we have right now, and all the Kubernetes resources created are the same. We aim to evolve the existing decorator approach so that:
1. The decorators are monadic and smaller in size and functionality.
2. The new decorator design allows reusing the decorators between the client and the cluster as much as possible.
3. All the decorators are independent of each other and could have arbitrary order in the chain; they share the same APIs and follow a unified orchestrator architecture, so that new developers can quickly understand what should be done to introduce a new feature.

Besides that, the new approach allows us to add tests for every decorator alone, instead of doing a final test of all the decorators in Fabric8ClientTest.java.

Cheers,
Canbin Zheng

Till Rohrmann <trohrm...@apache.org> wrote on Sat, Feb 22, 2020, 12:28 AM:

Thanks for starting this discussion Canbin. If I understand your proposal correctly, then you would like to evolve the existing decorator approach so that decorators are monadic and smaller in size and functionality. The latter aspect will allow reusing them between the client and the cluster. Just to make sure, it is not a fundamentally different approach compared to what we have right now, is it?

If this is the case, then I think it makes sense to reuse code as much as possible and to create small code units which are easier to test.

Cheers,
Till

On Fri, Feb 21, 2020 at 4:41 PM felixzheng zheng <felixzhen...@gmail.com> wrote:

Thanks for the feedback @Yang Wang. I would like to discuss some of the details in depth about why I am confused about the existing design.

Question 1: How do we mount a configuration file?

For the existing design, we need several classes to finish it:
1. InitializerDecorator
2. OwnerReferenceDecorator
3. ConfigMapDecorator
4. KubernetesUtils: provides the getConfigMapVolume method shared by the FlinkMasterDeploymentDecorator and the TaskManagerPodDecorator.
5. FlinkMasterDeploymentDecorator: mounts the ConfigMap volume.
6. TaskManagerPodDecorator: mounts the ConfigMap volume.
7. If, in the future, someone would like to introduce an init Container, the InitContainerDecorator would have to mount the ConfigMap volume too.

I am confused about the current solution to mounting a configuration file:
1. Actually, we do not need so many decorators for mounting a file.
2. If we would like to mount a new file, we have no choice but to repeat the same tedious and scattered routine.
3. There's no easy way to test the file-mounting functionality alone; we have to construct the ConfigMap and the Deployment or the TaskManager Pod first and then do a final test.

The reason it is so complex to mount a configuration file is that the existing design does not fully consider the internal connections among those resources.

The new abstraction we proposed could solve this kind of problem; the new decorator interface is as follows:

    public interface KubernetesStepDecorator {

        /**
         * Apply transformations to the given FlinkPod in accordance with this feature.
         * This can include adding labels/annotations, mounting volumes, and setting
         * startup command or parameters, etc.
         */
        FlinkPod decorateFlinkPod(FlinkPod flinkPod);

        /**
         * Build the accompanying Kubernetes resources that should be introduced to
         * support this feature. This may only be applicable to the client-side
         * submission process.
         */
        List<HasMetadata> buildAccompanyingKubernetesResources() throws IOException;

    }

The FlinkPod is a composition of the Pod, the main Container, the init Container, and the sidecar Container.

Next, we introduce a KubernetesStepDecorator implementation whose buildAccompanyingKubernetesResources method creates the corresponding ConfigMap, and whose decorateFlinkPod method configures the Volume for the Pod and all the Containers.

So, for the scenario of mounting a configuration file, the advantages of this new architecture are as follows:
1. One dedicated KubernetesStepDecorator implementation is enough to finish all the mounting work; meanwhile, this class would be shared between the client side and the master side. Besides that, the number of lines of code in that class will not exceed 300, which facilitates code readability and maintenance.
2. Testing becomes easy, as we can add a dedicated test class for only this class; the test would never rely on the construction of other components such as the Deployment.
3. It's quite convenient to mount a new configuration file via a new dedicated KubernetesStepDecorator implementation.

Question 2: How do we construct the Pod?

For the existing design:
1. The FlinkMasterDeploymentDecorator is responsible for building the Deployment and configuring the Pod and the Containers, while the TaskManagerPodDecorator is responsible for building the TaskManager Pod. Take FlinkMasterDeploymentDecorator as an example; let's see what it has to do:
   1. Configure the main Container, including the name, command, args, image, image pull policy, resource requirements, ports, all kinds of environment variables, all kinds of volume mounts, etc.
   2. Configure the Pod, including the service account and all kinds of volumes, and attach all kinds of Containers, including the main Container, the init Container, and the sidecar Container.
   3. Configure the Deployment.
2. The InitializerDecorator and the OwnerReferenceDecorator have basic logic, so the most complex work is completed in the FlinkMasterDeploymentDecorator and the TaskManagerPodDecorator. With the introduction of new features for the Pod, such as customized volume mounts, Hadoop configuration support, Kerberized HDFS support, secret mounts, Python support, etc., the construction process could become far more complicated, and the functionality of a single class could explode, which hurts code readability, writability, and testability. Besides that, the client side and the master side share much of the Pod construction logic.

So the problems are as follows:
1. We don't have a consistent abstraction for Pod construction (including the Containers) that is applicable to both the client side and the master side, so we have to share some of the code via tool classes, which is a hacky solution; even so, we can't share as much of the code as we would like.
2. We don't have a step-by-step orchestrator architecture to help with Pod construction.

For the proposed design, the problems above are all solved: we have a consistent abstraction that leads to a monadic-step orchestrator architecture to construct the Pod step by step. One step is responsible for exactly one thing; it follows that every step is well testable, because each unit can be considerably small and the number of branches to test in any given step is limited. Moreover, many of the steps could be shared between the client side and the master side, such as configuration file mounts, etc.

Question 3: Could all the resources share the same orchestrator architecture?

For the existing design, we don't have a unified orchestrator architecture for the construction of all the Kubernetes resources; therefore, we need a decorator chain for every Kubernetes resource.

For the proposed design, following Questions 1-2 and the design doc [1], we have introduced a monadic-step orchestrator architecture to construct the Pod and all the Containers. Besides that, this architecture also works for the other resources, such as the Services and the ConfigMaps.
For example, if we need a new Service or a ConfigMap, we just introduce a new step.

Question 4: How do we introduce the init Container or the sidecar Container?

For the existing design, both the client side and the master side introduce some new decorators. The client-side chain could be:

InitializerDecorator -> OwnerReferenceDecorator -> FlinkMasterDeploymentDecorator -> InitContainerDecorator -> SidecarDecorator -> etc.

and the master-side chain could be:

InitializerDecorator -> OwnerReferenceDecorator -> TaskManagerPodDecorator -> InitContainerDecorator -> SidecarDecorator -> etc.

As we can see, the FlinkMasterDeploymentDecorator or the TaskManagerPodDecorator is designed for the Pod construction, including the Containers, so we shouldn't need to treat the init Container and the sidecar Container as special cases different from the main Container by introducing a dedicated decorator for every one of them. Such a hacky solution confuses me, and maybe other developers too.

For the proposed design, following Questions 1-4 and the design doc [1], the central premise of the proposed orchestrator architecture is that the Pod, the main Container, the init Container, the sidecar Container, the Services, and the ConfigMaps all sit on an equal footing.
For every one of the Pod, the main Container, the init Container, and the sidecar Container, people could introduce any number of steps to finish its construction; we attach all the Containers to the Pod in the Builder tool classes after the orchestrator architecture constructs them.

Question 5: What is the relation between the decorators?

For the existing design:
1. The decorators are not independent; most of them have a strict order in the chain.
2. The decorators do not share common APIs in essence, as their input types can differ, so we can't finish the construction of a Kubernetes resource such as the Deployment or the TaskManager Pod in a one-time traversal; there are boxing and unboxing overheads between some of the neighboring decorators.

For the proposed design:
1. The steps are completely independent, so they could have arbitrary order in the chain; the only rule is that the Hadoop configuration decorator should be the last node in the chain.
2. All the steps share common APIs, with the same input type for all the methods, so we can construct a Kubernetes resource via a one-time traversal.

Question 6: What is the number of decorator chains?

For the existing design, people have to introduce a new chain once they introduce a new Kubernetes resource; for example, a new ConfigMap decorator chain for mounting a new configuration file. At this moment, we already have five decorator chains.

For the proposed design, there are always two chains: one for constructing all the Kubernetes resources on the client side, including the JobManager Pod, the Containers, the ConfigMap(s), and the Service(s); the other for constructing the TaskManager Pod and its Containers.

Yang Wang <danrtsey...@gmail.com> wrote on Fri, Feb 21, 2020, 2:05 PM:

Hi Canbin,

Thanks a lot for sharing your thoughts on improving the Flink on K8s native integration. Frankly speaking, your discussion title confused me, and I was wondering whether you wanted to refactor the whole design. However, after diving into the details and the provided document, I realized that you mostly want to improve the implementation.

Regarding your two main points:

> Introduce a unified monadic-step based orchestrator architecture that has a better, cleaner and consistent abstraction for the Kubernetes resources construction process, both applicable to the client side and the master side.

When I introduced the decorator for K8s in Flink, there was always a guideline in my mind that it should be easy to extend and to add new features. Just as you say, we have lots of functions to support, and K8s is also evolving very fast. The current `ConfigMapDecorator`, `FlinkMasterDeploymentDecorator`, and `TaskManagerPodDecorator` are a basic implementation with some prerequisite parameters. Of course we could chain more decorators to construct the K8s resources.
For example: InitializerDecorator -> OwnerReferenceDecorator -> FlinkMasterDeploymentDecorator -> InitContainerDecorator -> SidecarDecorator -> etc.

I am a little sceptical about splitting every parameter into a single decorator, since it does not bring much benefit. But I agree with moving some common parameters into separate decorators (e.g. volume mounts). Also, introducing the `~Builder` classes and spinning off the chained decorator calls from `Fabric8FlinkKubeClient` makes sense to me.

> Add some dedicated tools for centrally parsing, verifying, and managing the Kubernetes parameters.

Currently, we always get the parameters directly from the Flink configuration (e.g. `flinkConfig.getString(KubernetesConfigOptions.CONTAINER_IMAGE)`). I think it could be improved by introducing some dedicated conf parser classes. It is a good idea.

Best,
Yang

felixzheng zheng <felixzhen...@gmail.com> wrote on Fri, Feb 21, 2020, 9:31 AM:

Hi everyone,

I would like to kick off a discussion on refactoring the existing Kubernetes resources construction architecture design.

I created a design document [1] that clarifies our motivation to do this and some improvement proposals for the new design.

Briefly, we would like to:
1. Introduce a unified monadic-step based orchestrator architecture that has a better, cleaner, and consistent abstraction for the Kubernetes resources construction process, applicable to both the client side and the master side.
2. Add some dedicated tools for centrally parsing, verifying, and managing the Kubernetes parameters.

It would be great to start these efforts before adding big features to the native Kubernetes submodule; Tison and I plan to work together on this issue.

Please let me know your thoughts.

Regards,
Canbin Zheng

[1] https://docs.google.com/document/d/1dFBjqho8IRyNWxKVhFnTf0qGCsgp72aKON4wUdHY5Pg/edit?usp=sharing
[2] https://issues.apache.org/jira/browse/FLINK-16194
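To make the orchestrator idea from this thread concrete, here is a minimal, self-contained sketch of the step-decorator pattern being proposed: each step is order-independent, does exactly one thing, and may contribute accompanying resources, and a builder applies the whole chain in a one-time traversal. All class names and the string-based resource model are simplifications for illustration, not the actual Flink implementation (which builds fabric8 Pod/Container objects):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for the FlinkPod composition described above.
class FlinkPod {
    final Map<String, String> podLabels = new HashMap<>();
    final List<String> mainContainerVolumeMounts = new ArrayList<>();
}

interface KubernetesStepDecorator {
    // Transform the pod; each step is responsible for exactly one thing.
    FlinkPod decorateFlinkPod(FlinkPod flinkPod);

    // Accompanying resources (e.g. a ConfigMap) created by this step;
    // only used on the client side. Resources are plain strings here.
    default List<String> buildAccompanyingKubernetesResources() {
        return new ArrayList<>();
    }
}

// One self-contained step: creates the flink-conf ConfigMap AND mounts it,
// instead of spreading that work across several decorators.
class FlinkConfConfigMapDecorator implements KubernetesStepDecorator {
    @Override
    public FlinkPod decorateFlinkPod(FlinkPod flinkPod) {
        flinkPod.mainContainerVolumeMounts.add("flink-conf-volume:/opt/flink/conf");
        return flinkPod;
    }

    @Override
    public List<String> buildAccompanyingKubernetesResources() {
        List<String> resources = new ArrayList<>();
        resources.add("ConfigMap/flink-conf");
        return resources;
    }
}

class LabelDecorator implements KubernetesStepDecorator {
    @Override
    public FlinkPod decorateFlinkPod(FlinkPod flinkPod) {
        flinkPod.podLabels.put("app", "flink");
        return flinkPod;
    }
}

// The orchestrator: a single one-time traversal over independent steps.
class KubernetesJobManagerBuilder {
    static FlinkPod buildPod(List<KubernetesStepDecorator> steps,
                             List<String> accompanyingResources) {
        FlinkPod pod = new FlinkPod();
        for (KubernetesStepDecorator step : steps) {
            pod = step.decorateFlinkPod(pod);
            accompanyingResources.addAll(step.buildAccompanyingKubernetesResources());
        }
        return pod;
    }
}
```

Because every step shares the same input and output type, testing a step in isolation needs no Deployment or ConfigMap scaffolding, and reordering the chain (except for a constrained step like the Hadoop configuration decorator) does not change the result.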