Hi,

I was recently looking at the Flink native Kubernetes integration [1] to get an idea of how it relates to existing operator-based solutions [2], [3].
Part of the motivation for the native integration was simplicity (no extra component to install), but arguably that is also a shortcoming. The k8s operator model can offer support for application lifecycle operations such as upgrade and rescaling, as well as job submission without a Flink client.

When using the Flink native integration, it would still be necessary to provide that controller functionality. Is the idea to use the native integration for task manager resource allocation in tandem with an operator that provides the external controller functionality? If anyone using the Flink native integration can share their experience, I would be curious to learn more about the specific setup and whether there are plans to expand the k8s native integration capabilities.

For example:

* Application upgrade with features such as [4]. Since the job manager is part of the deployment, it cannot orchestrate the deployment; that needs to be the responsibility of an external process. Has anyone contemplated adding such a component to Flink itself?

* Rescaling: Theoretically, a parallelism change could be performed without restarting the job manager pod. Hence, building blocks to trigger and apply rescaling could be part of Flink itself. Has this been explored further?

Yang kindly pointed me to [5]. Is the recommendation/conclusion that when a k8s operator is already in use, it should be in charge of the task manager resource allocation? If so, what scenario was the native k8s integration originally intended for?

Thanks,
Thomas

[1] https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/resource-providers/native_kubernetes/#deployment-modes
[2] https://github.com/lyft/flinkk8soperator
[3] https://github.com/spotify/flink-on-k8s-operator
[4] https://github.com/lyft/flinkk8soperator/blob/master/docs/state_machine.md
[5] https://lists.apache.org/thread/8cn99f6n8nhr07n5vqfo880tpm624s5d