Hi Biao,

# 1 Flink Native vs Standalone integration
I think this discussion[1] has converged on a direction: the newly introduced Flink K8s operator will start with the native K8s integration first. Do you have any concerns about this?

# 2 K8S StatefulSet vs. K8S Deployment
IIUC, FlinkDeployment is just the name of a custom resource. It does not mean that we need to create a corresponding K8s Deployment for the JobManager or the TaskManagers. With the native K8s integration, the JobManager is started as a K8s Deployment, while the TaskManagers are naked pods managed by FlinkResourceManager. Actually, I think "FlinkDeployment" is easier to understand than "FlinkStatefulSet" :)

[1]. https://lists.apache.org/thread/l1dkp8v4bhlcyb4tdts99g7w4wdglfy4

Best,
Yang

Biao Geng <biaoge...@gmail.com> wrote on Wed, Jan 26, 2022 at 18:00:

> Hi Thomas,
> Thanks a lot for the great effort in this well-organized FLIP! After
> reading the FLIP carefully, I think Yang has given some great feedback, and
> I just want to share some of my concerns:
> # 1 Flink Native vs Standalone integration
> I believe it is reasonable to support both modes in the long run, but in the
> FLIP and the previous thread[1] it seems that we have not decided which one
> to implement initially. The FLIP mentions "Maybe start with support for
> Flink Native" in order to reuse the code in [2]. Is that the final choice?
> # 2 K8S StatefulSet vs. K8S Deployment
> In the CR example, I notice that the kind we use is FlinkDeployment. I
> would like to check whether we have decided to use the K8S Deployment
> workload resource. As the name implies, StatefulSet is for stateful apps
> while Deployment is usually for stateless apps. I think it is worthwhile to
> consider the choice more carefully because of a use case in the GCP
> operator[3], which may influence our other design choices (like the Flink
> application deletion strategy).
>
> Again, thanks for the work. I believe this FLIP is pretty useful for
> many customers, and I hope I can make some contributions to its
> implementation!
>
> Best regards,
> Biao Geng
>
> [1] https://lists.apache.org/thread/l1dkp8v4bhlcyb4tdts99g7w4wdglfy4
> [2] https://github.com/wangyang0918/flink-native-k8s-operator
> [3] https://github.com/GoogleCloudPlatform/flink-on-k8s-operator/pull/354
>
> Yang Wang <danrtsey...@gmail.com> wrote on Wed, Jan 26, 2022 at 15:25:
>
> > Thanks Thomas for creating FLIP-212 to introduce the Flink Kubernetes
> > Operator.
> >
> > The proposal already looks very good to me and has integrated all the
> > input from the previous discussion (e.g. native K8s vs. standalone,
> > Go vs. Java).
> >
> > I read the FLIP carefully and have some questions that need to be
> > clarified.
> >
> > # How do we run a Flink job from a CR?
> > 1. Start a session cluster and then submit the Flink job via the REST API
> > 2. Start a Flink application cluster which bundles one or more Flink jobs
> > It is not clear to me which way we will choose. It seems that the
> > existing Google/Lyft K8s operators use #1, but I lean toward #2 for the
> > newly introduced K8s operator.
> > If #2 is the case, how could we get the job status once the job has
> > finished or failed? Maybe FLINK-24113[1] and FLINK-25715[2] could help.
> > Or we may need to enable the Flink history server[3].
> >
> >
> > # ApplicationDeployer Interface or "flink run-application" /
> > "kubernetes-session.sh"
> > How do we start the Flink application or session cluster?
> > It would be great to have public and stable interfaces for deployment
> > in Flink. But currently we only have an internal interface,
> > *ApplicationDeployer*, to deploy the application cluster, and
> > no interface for deploying a session cluster.
> > Of course, we could also use the CLI commands for submission. However,
> > that will perform poorly when launching multiple applications.
> >
> >
> > # Pod Template
> > Is the pod template in the CR the same as what Flink already supports[4]?
> > If so, I am afraid that not every field (e.g. cpu/memory resources)
> > would take effect.
> >
> >
> > [1]. https://issues.apache.org/jira/browse/FLINK-24113
> > [2]. https://issues.apache.org/jira/browse/FLINK-25715
> > [3]. https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/advanced/historyserver/
> > [4]. https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/resource-providers/native_kubernetes/#pod-template
> >
> >
> > Best,
> > Yang
> >
> >
> > Thomas Weise <t...@apache.org> wrote on Tue, Jan 25, 2022 at 13:08:
> >
> > > Hi,
> > >
> > > As promised in [1] we would like to start the discussion on the
> > > addition of a Kubernetes operator to the Flink project as FLIP-212:
> > >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-212%3A+Introduce+Flink+Kubernetes+Operator
> > >
> > > Please note that the FLIP is currently focussed on the overall
> > > direction; the intention is to fill in more details once we converge
> > > on the high level plan.
> > >
> > > Thanks and looking forward to a lively discussion!
> > >
> > > Thomas
> > >
> > > [1] https://lists.apache.org/thread/l1dkp8v4bhlcyb4tdts99g7w4wdglfy4