Thanks Thomas for drafting this FLIP, and everyone for the discussion.

I also have a few questions and comments.

## Job Submission
Deploying a Flink session cluster via kubectl & CR and then submitting
jobs to the cluster via the Flink CLI / REST API is probably the
approach that requires the least effort. However, I'd like to point out
two weaknesses.
1. Many users run Flink in per-job/application mode. For these users,
having to run a job in two steps (deploy the cluster, then submit the
job) is not that convenient.
2. One of our motivations is being able to manage Flink applications'
lifecycles with kubectl. Submitting jobs from the CLI is not aligned
with this motivation.
I think it's probably worth supporting job submission via kubectl & CR
in the first version, both together with deploying the cluster (as in
per-job/application mode) and after the cluster is deployed (as in
session mode).
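
For illustration, a hypothetical application-mode CR could embed the
job spec so that "kubectl apply" is the only step a user runs; the
group, kind, and field names below are made up for this sketch, not
taken from the FLIP:

    apiVersion: flink.apache.org/v1alpha1   # hypothetical group/version
    kind: FlinkDeployment                   # hypothetical kind
    metadata:
      name: wordcount
    spec:
      image: flink:1.14
      flinkConfiguration:
        taskmanager.numberOfTaskSlots: "2"
      job:                          # embedded job -> one-step submission
        jarURI: local:///opt/flink/examples/streaming/WordCount.jar
        parallelism: 2
    # omitting "job" could yield a session cluster; a separate job CR
    # referencing it would then cover two-step (session) submission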

## Versioning
Which Flink versions does the operator plan to support?
1. The native K8s deployment was first introduced in Flink 1.10.
2. Native K8s HA was introduced in Flink 1.12.
3. Pod template support was introduced in Flink 1.13.
4. There were some changes to the Flink docker image entrypoint script
in, IIRC, Flink 1.13.

## Compatibility
What kind of API compatibility can we commit to? For the first version
it's probably fine to have alpha / beta APIs that allow incompatible
future changes. But eventually we would need to guarantee backwards
compatibility, so that a CR written for an early version keeps working
with a newer version of the operator.
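
For reference, Kubernetes CRD versioning already has machinery for this
kind of evolution. A minimal sketch, assuming a placeholder group and
kind (the served / storage flags and per-version schemas are standard
apiextensions.k8s.io/v1 fields):

    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      name: flinkdeployments.flink.apache.org   # placeholder
    spec:
      group: flink.apache.org                   # placeholder
      names:
        kind: FlinkDeployment
        plural: flinkdeployments
      scope: Namespaced
      versions:
        - name: v1alpha1     # early version, free to change incompatibly
          served: true
          storage: false
          schema:
            openAPIV3Schema:
              type: object
              x-kubernetes-preserve-unknown-fields: true
        - name: v1beta1      # later version, stored in etcd; a conversion
          served: true       # webhook can upgrade persisted v1alpha1 objects
          storage: true
          schema:
            openAPIV3Schema:
              type: object
              x-kubernetes-preserve-unknown-fields: true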

Thank you~

Xintong Song



On Fri, Jan 28, 2022 at 1:18 PM Thomas Weise <t...@apache.org> wrote:

> Thanks for the feedback!
>
> >
> > # 1 Flink Native vs Standalone integration
> > Maybe we should make this more clear in the FLIP but we agreed to do the
> > first version of the operator based on the native integration.
> > While this clearly does not cover all use-cases and requirements, it
> seems
> > this would lead to a much smaller initial effort and a nicer first
> version.
> >
>
> I'm also leaning towards the native integration, as long as it reduces the
> MVP effort. Ultimately the operator will need to also support the
> standalone mode. I would like to gain more confidence that native
> integration reduces the effort. While it cuts the effort to handle the TM
> pod creation, some mapping code from the CR to the native integration
> client and config needs to be created. As mentioned in the FLIP, native
> integration requires the Flink job manager to have access to the k8s API to
> create pods, which in some scenarios may be seen as unfavorable.
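>
> As a rough illustration (the names here are placeholders, not from the
> FLIP), the grant the JM's service account needs could be expressed as
> a namespaced Role:
>
>     apiVersion: rbac.authorization.k8s.io/v1
>     kind: Role
>     metadata:
>       name: flink-jobmanager      # placeholder name
>       namespace: flink-jobs       # placeholder namespace
>     rules:
>       - apiGroups: [""]
>         resources: ["pods"]       # JM creates/deletes TM pods
>         verbs: ["get", "list", "watch", "create", "delete"]
>       - apiGroups: [""]
>         resources: ["configmaps"] # used by the native K8s HA services
>         verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]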
>
> > > > # Pod Template
> > > > Is the pod template in the CR the same as what Flink already
> > > > supports[4]? Then I am afraid that arbitrary fields (e.g.
> > > > cpu/memory resources) might not take effect.
>
> Yes, pod template would look almost identical. There are a few settings
> that the operator will control (and that may need to be blacklisted), but
> in general we would not want to place restrictions. I think a mechanism
> where a pod template is merged from multiple layers would also be
> interesting to make this more flexible.
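>
> To make the layering idea concrete, a rough sketch (the podTemplate
> field and the merge behavior are hypothetical; "flink-main-container"
> is the container name Flink's pod template support expects):
>
>     # Layer 1: default template, e.g. set operator-wide by an admin
>     podTemplate:
>       spec:
>         containers:
>           - name: flink-main-container
>             env:
>               - name: LOG_LEVEL
>                 value: INFO
>
>     # Layer 2: per-deployment template, merged on top of layer 1 so
>     # the pod gets both the env var and the resource requests
>     podTemplate:
>       spec:
>         containers:
>           - name: flink-main-container
>             resources:
>               requests:
>                 cpu: "2"
>                 memory: 4Gi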
>
> Cheers,
> Thomas
>
