Good catch, Yang Wang and Gyula, on the Java version. I personally prefer
that we simply do not support Java 8 for the operator: since it is a net-new
project, we are better off starting support at Java 11 right away.

As Gyula outlined above, it is important to note that it only affects the
operator (and the operator container image), not existing or new Flink jobs.

On Tue, Feb 15, 2022 at 1:50 PM Gyula Fóra <gyula.f...@gmail.com> wrote:

> Hi Devs,
>
> Yang Wang discovered that the current prototype is not compatible with Java
> 8 but only 11 and upwards.
>
> The reason for this is that the java operator SDK itself is not java 8
> compatible unfortunately.
>
> Given that Java 8 is on the road to deprecation and that the operator runs
> as a containerized deployment, are there any concerns regarding making the
> target java version 11?
> This should not affect deployed flink clusters and jobs, those should still
> work with Java 8, but only the kubernetes operator itself.
>
> Cheers,
> Gyula
>
>
> On Tue, Feb 15, 2022 at 1:06 PM Yang Wang <danrtsey...@gmail.com> wrote:
>
> > I also lean towards not introducing the savepoint/checkpoint-related fields
> > to the job spec, especially at the very beginning of flink-kubernetes-operator.
> >
> >
> > Best,
> > Yang
> >
> > On Tue, Feb 15, 2022 at 19:02, Gyula Fóra <gyula.f...@gmail.com> wrote:
> >
> > > Hi Peng Yuan!
> > >
> > > While I do agree that savepoint path is a very important production
> > > configuration there are a lot of other things that come to my mind:
> > >  - savepoint dir
> > >  - checkpoint dir
> > >  - checkpoint interval/timeout
> > >  - high availability settings (provider/storagedir etc)
> > >
> > > just to name a few...
> > >
> > > While these are all production critical, they have nice clean Flink config
> > > settings to go with them. If we start introducing these to jobspec we only
> > > get confusion about priority order etc. and it is going to be hard to change
> > > or remove them in the future. In any case we should validate that these
> > > configs exist in cases where users use a stateful upgrade mode, for example.
> > > This is something we need to add for sure.
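
To make the validation point concrete, here is a minimal sketch in Java. It
assumes the spec exposes a boolean for "stateful upgrade" and a plain key/value
flinkConfiguration map; the class and method names are hypothetical and not
taken from the prototype. `state.savepoints.dir` is the existing Flink option
for the default savepoint directory.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of the prototype PR.
final class UpgradeConfigValidator {
    // Existing Flink option for the default savepoint directory.
    static final String SAVEPOINT_DIR_KEY = "state.savepoints.dir";

    private UpgradeConfigValidator() {}

    /**
     * Returns an error message if a stateful upgrade is requested without a
     * savepoint directory in the Flink configuration, otherwise empty.
     */
    static Optional<String> validate(
            boolean statefulUpgrade, Map<String, String> flinkConfiguration) {
        if (!statefulUpgrade) {
            // Stateless upgrades do not need any savepoint settings.
            return Optional.empty();
        }
        String dir = flinkConfiguration.get(SAVEPOINT_DIR_KEY);
        if (dir == null || dir.trim().isEmpty()) {
            return Optional.of(
                    "Stateful upgrade requires '" + SAVEPOINT_DIR_KEY
                            + "' to be set in flinkConfiguration");
        }
        return Optional.empty();
    }
}
```

The same check could be extended to the checkpoint directory and the high
availability settings listed above.
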
> > >
> > > As for the other options you mentioned, like automatic savepoint generation
> > > for instance, those deserve an independent discussion of their own, I
> > > believe :)
> > >
> > > Cheers,
> > > Gyula
> > >
> > > On Tue, Feb 15, 2022 at 11:23 AM K Fred <yuanpengf...@gmail.com>
> wrote:
> > >
> > > > Hi Matyas!
> > > >
> > > > Thanks for your reply!
> > > > For scenarios 1 and 3, I couldn't agree more with the podTemplate
> > > > solution; I missed this part.
> > > > For savepoint-related configuration, I think it's very important that it
> > > > can be specified in JobSpec, because the savepoint is a very common
> > > > configuration for upgrading a job; if it is placed in JobSpec, it is
> > > > obvious to the user how to configure it. In addition, other advanced
> > > > properties can be put into flinkConfiguration, customized by expert users.
> > > > A set of savepoint configuration options could look as follows:
> > > >
> > > > fromSavepoint: Job restart from
> > > >
> > > > autoSavepointSecond: Automatically take a savepoint to the
> > > > `savepointsDir` every n seconds.
> > > >
> > > > savepointsDir: Directory in which to store automatically taken savepoints.
> > > >
> > > > savepointGeneration: Update the savepoint generation of the job status
> > > > for a running job (should be defined in JobStatus).
> > > >
> > > >
> > > > Best wishes,
> > > > Peng Yuan.
> > > >
> > > > On Tue, Feb 15, 2022 at 4:41 PM Őrhidi Mátyás <
> matyas.orh...@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > Hi Peng,
> > > > >
> > > > > Thanks for your feedback. Regarding 1. and 3. scenarios, the
> > > podTemplate
> > > > > functionality in the operator could cover both. We also need to be
> > > > careful
> > > > > about introducing proxy parameters in the CRD spec. The savepoint
> > path
> > > is
> > > > > usually accompanied with a bunch of other configurations for
> example,
> > > so
> > > > > users need to use configuration params anyway. What do you think?
> > > > >
> > > > > Best,
> > > > > Matyas
> > > > >
> > > > > On Tue, Feb 15, 2022 at 8:58 AM K Fred <yuanpengf...@gmail.com>
> > wrote:
> > > > >
> > > > > > Hi Gyula!
> > > > > >
> > > > > > I have reviewed the prototype design of flink-kubernetes-operator
> > you
> > > > > > submitted, and I have the following questions:
> > > > > >
> > > > > > 1. Can the JobSpec support a Flink jar that is pulled in by a sidecar
> > > > > > container? Something like this:
> > > > > >
> > > > > > > initContainers:
> > > > > > >       - name: downloader
> > > > > > >         image: curlimages/curl
> > > > > > >         env:
> > > > > > >           - name: JAR_URL
> > > > > > >             value: https://repo1.maven.org/maven2/org/apache/flink/flink-examples-streaming_2.12/1.14.3/flink-examples-streaming_2.12-1.14.3-WordCount.jar
> > > > > > >           - name: DEST_PATH
> > > > > > >             value: /cache/flink-app.jar
> > > > > > >         command: ['sh', '-c', 'curl -o ${DEST_PATH} ${JAR_URL}']
> > > > > >
> > > > > > 2. Can we add a savepoint path property to the job specification?
> > > > > > 3. Can we add extra ports to the JobManagerSpec and TaskManagerSpec to
> > > > > > expose services such as Prometheus? The property could look like this:
> > > > > > > extraPorts:
> > > > > > >       - name: prom
> > > > > > >         containerPort: 9249
> > > > > >
> > > > > >
> > > > > >
> > > > > > Best wishes,
> > > > > > Peng Yuan
> > > > > >
> > > > > > On Tue, Feb 15, 2022 at 12:23 AM Gyula Fóra <gyf...@apache.org>
> > > wrote:
> > > > > >
> > > > > > > Hi Flink Devs!
> > > > > > >
> > > > > > > We would like to present to you the first prototype of the
> > > > > > > flink-kubernetes-operator that was built based on the FLIP and
> > the
> > > > > > > discussion on this mail thread. We would also like to call out
> > some
> > > > > > design
> > > > > > > decisions that we have made regarding architecture components
> > that
> > > > were
> > > > > > not
> > > > > > > explicitly mentioned in the FLIP document/thread and give you
> the
> > > > > > > opportunity to raise any concerns here.
> > > > > > >
> > > > > > > You can find the initial prototype here:
> > > > > > > https://github.com/apache/flink-kubernetes-operator/pull/1
> > > > > > >
> > > > > > > We will leave the PR open for 1-2 days before merging to let
> > people
> > > > > > comment
> > > > > > > on it, but please be mindful that this is an initial prototype
> > with
> > > > > many
> > > > > > > rough edges. It is not intended to be a complete implementation
> > of
> > > > the
> > > > > > FLIP
> > > > > > > specs as that will take some more work from all of us :)
> > > > > > >
> > > > > > >
> > > > > > > *Prototype feature set:*
> > > > > > > The prototype contains a basic working version of the
> > > > > > > flink-kubernetes-operator that supports deployment and lifecycle
> > > > > > > management of a stateful native flink application. We have basic
> > > > > > > support for stateful and stateless upgrades, UI ingress, pod
> > > > > > > templates etc. Error handling at this point is largely missing.
> > > > > > >
> > > > > > >
> > > > > > > *Features / design decisions that were not explicitly discussed in this
> > > > > > > thread*
> > > > > > >
> > > > > > > *Basic Admission control using a Webhook*
> > > > > > > Standard resource admission control in Kubernetes to validate and
> > > > > > > potentially reject resources is done through Webhooks.
> > > > > > > https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/
> > > > > > > This is a necessary mechanism to give the user an upfront error
> > > when
> > > > an
> > > > > > > incorrect resource was submitted. In the Flink operator's case
> we
> > > > need
> > > > > to
> > > > > > > validate that the FlinkDeployment yaml actually makes sense and
> > > does
> > > > > not
> > > > > > > contain erroneous config options that would inevitably lead to
> > > > > > > deployment/job failures.
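
For illustration, this is roughly what a rejection looks like on the wire. The
sketch below builds the `AdmissionReview` (admission.k8s.io/v1) response body
that a validating webhook returns to the API server. The helper class is
hypothetical and not taken from the flink-kubernetes-webhook module, and the
hand-rolled escaping only covers backslashes, quotes and newlines (a real
implementation would use a JSON library).

```java
// Hypothetical helper, not taken from the flink-kubernetes-webhook module.
final class AdmissionResponses {
    private AdmissionResponses() {}

    /** Minimal JSON string escaping: backslash, double quote and newline only. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    /**
     * Builds an AdmissionReview response. The uid must be copied from the
     * incoming request; a null error means the resource is allowed.
     */
    static String review(String uid, String errorOrNull) {
        boolean allowed = errorOrNull == null;
        StringBuilder json = new StringBuilder();
        json.append("{\"apiVersion\":\"admission.k8s.io/v1\",")
                .append("\"kind\":\"AdmissionReview\",")
                .append("\"response\":{\"uid\":\"").append(escape(uid)).append("\",")
                .append("\"allowed\":").append(allowed);
        if (!allowed) {
            // The message is what the user sees as the upfront error from kubectl.
            json.append(",\"status\":{\"message\":\"")
                    .append(escape(errorOrNull))
                    .append("\"}");
        }
        return json.append("}}").toString();
    }
}
```

Returning `allowed: false` with a message here is what gives the user the
upfront error at submission time, instead of a failed deployment later on.
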
> > > > > > >
> > > > > > > We have implemented a simple webhook that we can use for this type of
> > > > > > > validation, as a separate maven module (flink-kubernetes-webhook). The
> > > > > > > webhook is an optional component and can be enabled or disabled during
> > > > > > > deployment. To avoid pulling in new external dependencies we have used
> > > > > > > the Flink Shaded Netty module to build the simple rest endpoint
> > > > > > > required. If the community feels that Netty adds unnecessary
> > > > > > > complexity to the webhook implementation we are open to alternative
> > > > > > > backends such as Spring Boot, for instance, which would practically
> > > > > > > eliminate all the boilerplate.
> > > > > > >
> > > > > > >
> > > > > > > *Helm Chart for deployment*
> > > > > > > Helm charts provide an industry standard way of managing kubernetes
> > > > > > > deployments. We have created a helm chart prototype that can be used
> > > > > > > to deploy the operator together with all required resources. The helm
> > > > > > > chart allows easy configuration for things like images, namespaces
> > > > > > > etc. and flags to control specific parts of the deployment such as
> > > > > > > RBAC or the webhook.
> > > > > > >
> > > > > > > The helm chart provided is intended to be a first version that
> > > worked
> > > > > for
> > > > > > > us during development but we expect to have a lot of iterations
> > on
> > > it
> > > > > > based
> > > > > > > on the feedback from the community.
> > > > > > >
> > > > > > > *Acknowledgment*
> > > > > > > We would like to thank everyone who has provided support and
> > > valuable
> > > > > > > feedback on this FLIP.
> > > > > > > We would also like to thank Yang Wang & Alexis Sarda-Espinosa
> > > > > > specifically
> > > > > > > for making their operators open source and available to us
> which
> > > had
> > > > a
> > > > > > big
> > > > > > > impact on the FLIP and the prototype.
> > > > > > >
> > > > > > > We are looking forward to continuing development on the
> operator
> > > > > together
> > > > > > > with the broader community.
> > > > > > > All work will be tracked using the ASF Jira from now on.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Gyula
> > > > > > >
> > > > > > > On Mon, Feb 14, 2022 at 9:21 AM K Fred <yuanpengf...@gmail.com
> >
> > > > wrote:
> > > > > > >
> > > > > > > > Hi Gyula,
> > > > > > > >
> > > > > > > > Thanks!
> > > > > > > > It's great to see the project getting started and I can't
> wait
> > to
> > > > see
> > > > > > the
> > > > > > > > PR and start contributing code.😄😄😄
> > > > > > > >
> > > > > > > > Best Wishes!
> > > > > > > > Peng Yuan
> > > > > > > >
> > > > > > > > On Mon, Feb 14, 2022 at 4:14 PM Gyula Fóra <
> > gyula.f...@gmail.com
> > > >
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Peng Yuan!
> > > > > > > > >
> > > > > > > > > The repo is already created:
> > > > > > > > > https://github.com/apache/flink-kubernetes-operator
> > > > > > > > >
> > > > > > > > > We will open the PR with the initial prototype later today,
> > > stay
> > > > > > tuned
> > > > > > > in
> > > > > > > > > this thread! :)
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > > Gyula
> > > > > > > > >
> > > > > > > > > On Mon, Feb 14, 2022 at 9:09 AM K Fred <
> > yuanpengf...@gmail.com
> > > >
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi All,
> > > > > > > > > >
> > > > > > > > > > Has the project of flink-kubernetes-operator been created
> > in
> > > > > > github?
> > > > > > > > > >
> > > > > > > > > > Peng Yuan
> > > > > > > > > >
> > > > > > > > > > On Wed, Feb 9, 2022 at 1:23 AM Gyula Fóra <
> > > > gyula.f...@gmail.com>
> > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > I agree with flink-kubernetes-operator as the repo name
> > :)
> > > > > > > > > > > Don't have any better idea
> > > > > > > > > > >
> > > > > > > > > > > Gyula
> > > > > > > > > > >
> > > > > > > > > > > On Sat, Feb 5, 2022 at 2:41 AM Thomas Weise <
> > > t...@apache.org>
> > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi,
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for the continued feedback and discussion.
> Looks
> > > > like
> > > > > we
> > > > > > > are
> > > > > > > > > > > > ready to start a VOTE, I will initiate it shortly.
> > > > > > > > > > > >
> > > > > > > > > > > > In parallel it would be good to find the repository
> > name.
> > > > > > > > > > > >
> > > > > > > > > > > > My suggestion would be: flink-kubernetes-operator
> > > > > > > > > > > >
> > > > > > > > > > > > I thought "flink-operator" could be a bit misleading
> > > since
> > > > > the
> > > > > > > term
> > > > > > > > > > > > operator already has a meaning in Flink.
> > > > > > > > > > > >
> > > > > > > > > > > > I also considered "flink-k8s-operator" but that would
> > be
> > > > > almost
> > > > > > > > > > > > identical to existing operator implementations and
> > could
> > > > lead
> > > > > > to
> > > > > > > > > > > > confusion in the future.
> > > > > > > > > > > >
> > > > > > > > > > > > Thoughts?
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > Thomas
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Feb 4, 2022 at 5:15 AM Gyula Fóra <
> > > > > > gyula.f...@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hi Danny,
> > > > > > > > > > > > >
> > > > > > > > > > > > > So far we have been focusing our dev efforts on the
> > > > initial
> > > > > > > > native
> > > > > > > > > > > > > implementation with the team.
> > > > > > > > > > > > > If the discussion and vote goes well for this FLIP
> we
> > > are
> > > > > > > looking
> > > > > > > > > > > forward
> > > > > > > > > > > > > to contributing the initial version sometime next
> > week
> > > > > > (fingers
> > > > > > > > > > > crossed).
> > > > > > > > > > > > >
> > > > > > > > > > > > > At that point I think we can already start the dev
> > work
> > > > to
> > > > > > > > support
> > > > > > > > > > the
> > > > > > > > > > > > > standalone mode as well, especially if you can
> > dedicate
> > > > > some
> > > > > > > > effort
> > > > > > > > > > to
> > > > > > > > > > > > > pushing that side.
> > > > > > > > > > > > > Working together on this sounds like a great idea
> and
> > > we
> > > > > > should
> > > > > > > > > start
> > > > > > > > > > > as
> > > > > > > > > > > > > soon as possible! :)
> > > > > > > > > > > > >
> > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > Gyula
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Feb 4, 2022 at 2:07 PM Danny Cranmer <
> > > > > > > > > > dannycran...@apache.org>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > I have been discussing this one with my team. We
> > are
> > > > > > > interested
> > > > > > > > > in
> > > > > > > > > > > the
> > > > > > > > > > > > > > Standalone mode, and are willing to contribute
> > > towards
> > > > > the
> > > > > > > > > > > > implementation.
> > > > > > > > > > > > > > Potentially we can work together to support both
> > > modes
> > > > in
> > > > > > > > > parallel?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Feb 2, 2022 at 4:02 PM Gyula Fóra <
> > > > > > > > gyula.f...@gmail.com>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi Danny!
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for the feedback :)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Versioning:
> > > > > > > > > > > > > > > Versioning will be independent from Flink and
> the
> > > > > > operator
> > > > > > > > will
> > > > > > > > > > > > depend
> > > > > > > > > > > > > > on a
> > > > > > > > > > > > > > > fixed flink version (in every given operator
> > > > version).
> > > > > > > > > > > > > > > This should be the exact same setup as with
> > > Stateful
> > > > > > > > Functions
> > > > > > > > > (
> > > > > > > > > > > > > > > https://github.com/apache/flink-statefun). So
> > > > > > independent
> > > > > > > > > > release
> > > > > > > > > > > > cycle
> > > > > > > > > > > > > > > but
> > > > > > > > > > > > > > > still within the Flink umbrella.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Deployment error handling:
> > > > > > > > > > > > > > > I think that's a very good point, as general
> > > > exception
> > > > > > > > handling
> > > > > > > > > > for
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > different failure scenarios is a tricky
> problem.
> > I
> > > > > think
> > > > > > > the
> > > > > > > > > > > > exception
> > > > > > > > > > > > > > > classifiers and retry strategies could avoid a
> > lot
> > > of
> > > > > > > manual
> > > > > > > > > > > > intervention
> > > > > > > > > > > > > > > from the user. We will definitely need to add
> > > > something
> > > > > > > like
> > > > > > > > > > this.
> > > > > > > > > > > > Once
> > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > have the repo created with the initial operator
> > > code
> > > > we
> > > > > > > > should
> > > > > > > > > > open
> > > > > > > > > > > > some
> > > > > > > > > > > > > > > tickets for this and put it on the short term
> > > > roadmap!
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > Gyula
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, Feb 2, 2022 at 4:50 PM Danny Cranmer <
> > > > > > > > > > > > dannycran...@apache.org>
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hey team,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Great work on the FLIP, I am looking forward
> to
> > > > this
> > > > > > > one. I
> > > > > > > > > > agree
> > > > > > > > > > > > that
> > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > can move forward to the voting stage.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I have general feedback around how we will
> > handle
> > > > job
> > > > > > > > > > submission
> > > > > > > > > > > > > > failure
> > > > > > > > > > > > > > > > and retry. As discussed in the Rejected
> > > > Alternatives
> > > > > > > > section,
> > > > > > > > > > we
> > > > > > > > > > > > can
> > > > > > > > > > > > > > use
> > > > > > > > > > > > > > > > Java to handle job submission failures from
> the
> > > > Flink
> > > > > > > > client.
> > > > > > > > > > It
> > > > > > > > > > > > would
> > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > useful to have the ability to configure
> > exception
> > > > > > > > classifiers
> > > > > > > > > > and
> > > > > > > > > > > > retry
> > > > > > > > > > > > > > > > strategy as part of operator configuration.
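
A rough sketch of what a configurable classifier plus retry strategy could look
like, with hypothetical names (nothing here is defined in the FLIP): a predicate
decides whether a failure is retryable, and the submitter retries with linear
backoff up to a maximum number of attempts.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

// Hypothetical names; a sketch of the idea, not an API from the FLIP.
final class RetryingSubmitter {
    private final Predicate<Throwable> retryable; // the "exception classifier"
    private final int maxAttempts;
    private final long backoffMillis;

    RetryingSubmitter(Predicate<Throwable> retryable, int maxAttempts, long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.retryable = retryable;
        this.maxAttempts = maxAttempts;
        this.backoffMillis = backoffMillis;
    }

    /** Runs the action, retrying only failures the classifier marks retryable. */
    <T> T submit(Callable<T> action) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                if (!retryable.test(e)) {
                    throw e; // classified as fatal: surface immediately
                }
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis * attempt); // linear backoff
                }
            }
        }
        throw last; // all attempts failed with retryable errors
    }
}
```

Both the predicate and the attempt/backoff values could then be populated from
the operator configuration, so that users can tune them per environment.
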
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Given this will be in a separate GitHub repository, I am curious how the
> > > > > > > > > > > > > > > > versioning strategy will work in relation to the Flink version. Do we
> > > > > > > > > > > > > > > > have any other components with a similar setup I can look at? Will the
> > > > > > > > > > > > > > > > operator version track Flink, or will it use its own versioning strategy
> > > > > > > > > > > > > > > > with a Flink version support matrix, or similar?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Tue, Feb 1, 2022 at 2:33 PM Márton
> Balassi <
> > > > > > > > > > > > > > balassi.mar...@gmail.com>
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi team,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thank you for the great feedback, Thomas
> has
> > > > > updated
> > > > > > > the
> > > > > > > > > FLIP
> > > > > > > > > > > > page
> > > > > > > > > > > > > > > > > accordingly. If you are comfortable with
> the
> > > > > > currently
> > > > > > > > > > existing
> > > > > > > > > > > > > > design
> > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > depth in the FLIP [1] I suggest moving
> > forward
> > > to
> > > > > the
> > > > > > > > > voting
> > > > > > > > > > > > stage -
> > > > > > > > > > > > > > > once
> > > > > > > > > > > > > > > > > that reaches a positive conclusion it lets
> us
> > > > > create
> > > > > > > the
> > > > > > > > > > > separate
> > > > > > > > > > > > > > code
> > > > > > > > > > > > > > > > > repository under the flink project for the
> > > > > operator.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I encourage everyone to keep improving the
> > > > details
> > > > > in
> > > > > > > the
> > > > > > > > > > > > meantime,
> > > > > > > > > > > > > > > > however
> > > > > > > > > > > > > > > > > I believe given the existing design and the
> > > > general
> > > > > > > > > sentiment
> > > > > > > > > > > on
> > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > thread that the most efficient path from
> here
> > > is
> > > > > > > starting
> > > > > > > > > the
> > > > > > > > > > > > > > > > > implementation so that we can collectively
> > > > iterate
> > > > > > over
> > > > > > > > it.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-212%3A+Introduce+Flink+Kubernetes+Operator
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Mon, Jan 31, 2022 at 10:15 PM Thomas
> > Weise <
> > > > > > > > > > t...@apache.org>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Hi Xintong,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks for the feedback and please see
> > > > responses
> > > > > > > below
> > > > > > > > > -->
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Fri, Jan 28, 2022 at 12:21 AM Xintong
> > > Song <
> > > > > > > > > > > > > > tonysong...@gmail.com
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thanks Thomas for drafting this FLIP,
> and
> > > > > > everyone
> > > > > > > > for
> > > > > > > > > > the
> > > > > > > > > > > > > > > > discussion.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I also have a few questions and
> comments.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > ## Job Submission
> > > > > > > > > > > > > > > > > > > Deploying a Flink session cluster via
> > > > kubectl &
> > > > > > CR
> > > > > > > > and
> > > > > > > > > > then
> > > > > > > > > > > > > > > > submitting
> > > > > > > > > > > > > > > > > > jobs
> > > > > > > > > > > > > > > > > > > to the cluster via Flink cli / REST is
> > > > probably
> > > > > > the
> > > > > > > > > > > approach
> > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > requires
> > > > > > > > > > > > > > > > > > > the least effort. However, I'd like to
> > > point
> > > > > out
> > > > > > 2
> > > > > > > > > > > > weaknesses.
> > > > > > > > > > > > > > > > > > > 1. A lot of users use Flink in
> > > > > perjob/application
> > > > > > > > > modes.
> > > > > > > > > > > For
> > > > > > > > > > > > > > these
> > > > > > > > > > > > > > > > > users,
> > > > > > > > > > > > > > > > > > > having to run the job in two steps
> > (deploy
> > > > the
> > > > > > > > cluster,
> > > > > > > > > > and
> > > > > > > > > > > > > > submit
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > job)
> > > > > > > > > > > > > > > > > > > is not that convenient.
> > > > > > > > > > > > > > > > > > > 2. One of our motivations is being able
> > to
> > > > > manage
> > > > > > > > Flink
> > > > > > > > > > > > > > > applications'
> > > > > > > > > > > > > > > > > > > lifecycles with kubectl. Submitting
> jobs
> > > from
> > > > > cli
> > > > > > > > > sounds
> > > > > > > > > > > not
> > > > > > > > > > > > > > > aligned
> > > > > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > > > this motivation.
> > > > > > > > > > > > > > > > > > > I think it's probably worth it to
> support
> > > > > > > submitting
> > > > > > > > > jobs
> > > > > > > > > > > via
> > > > > > > > > > > > > > > > kubectl &
> > > > > > > > > > > > > > > > > > CR
> > > > > > > > > > > > > > > > > > > in the first version, both together
> with
> > > > > > deploying
> > > > > > > > the
> > > > > > > > > > > > cluster
> > > > > > > > > > > > > > like
> > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > perjob/application mode and after
> > deploying
> > > > the
> > > > > > > > cluster
> > > > > > > > > > > like
> > > > > > > > > > > > in
> > > > > > > > > > > > > > > > session
> > > > > > > > > > > > > > > > > > > mode.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > The intention is to support application
> > > > > management
> > > > > > > > > through
> > > > > > > > > > > > operator
> > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > CR,
> > > > > > > > > > > > > > > > > > which means there won't be any 2 step
> > > > submission
> > > > > > > > process,
> > > > > > > > > > > > which as
> > > > > > > > > > > > > > > you
> > > > > > > > > > > > > > > > > > allude to would defeat the purpose of
> this
> > > > > project.
> > > > > > > The
> > > > > > > > > CR
> > > > > > > > > > > > example
> > > > > > > > > > > > > > > > shows
> > > > > > > > > > > > > > > > > > the application part. Please note that
> the
> > > bare
> > > > > > > cluster
> > > > > > > > > > > > support is
> > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > *additional* feature for scenarios that
> > > require
> > > > > > > > external
> > > > > > > > > > job
> > > > > > > > > > > > > > > > management.
> > > > > > > > > > > > > > > > > Is
> > > > > > > > > > > > > > > > > > there anything on the FLIP page that
> > creates
> > > a
> > > > > > > > different
> > > > > > > > > > > > > > impression?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > ## Versioning
> > > > > > > > > > > > > > > > > > > Which Flink versions does the operator
> > plan
> > > > to
> > > > > > > > support?
> > > > > > > > > > > > > > > > > > > 1. Native K8s deployment was first introduced in Flink 1.10
> > > > > > > > > > > > > > > > > > > 2. Native K8s HA was introduced in Flink 1.12
> > > > > > > > > > > > > > > > > > > 3. The Pod template support was introduced in Flink 1.13
> > > > > > > > > > > > > > > > > > > 4. There were some changes to the Flink docker image entrypoint script
> > > > > > > > > > > > > > > > > > > in, IIRC, Flink 1.13
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Great, thanks for providing this. It is
> > > > important
> > > > > > for
> > > > > > > > the
> > > > > > > > > > > > > > > compatibility
> > > > > > > > > > > > > > > > > > going forward also. We are targeting
> Flink
> > > > 1.14.x
> > > > > > > > > upwards.
> > > > > > > > > > > > Before
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > operator is ready there will be another
> > Flink
> > > > > > > release.
> > > > > > > > > > Let's
> > > > > > > > > > > > see if
> > > > > > > > > > > > > > > > > anyone
> > > > > > > > > > > > > > > > > > is interested in earlier versions?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > ## Compatibility
> > > > > > > > > > > > > > > > > > > What kind of API compatibility can we commit to? It's probably fine to
> > > > > > > > > > > > > > > > > > > have alpha / beta version APIs that allow incompatible future changes for
> > > > > > > > > > > > > > > > > > > the first version. But eventually we would need to guarantee backwards
> > > > > > > > > > > > > > > > > > > compatibility, so that an early version CR can work with a new version
> > > > > > > > > > > > > > > > > > > operator.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Another great point and please let me
> > include
> > > > > that
> > > > > > on
> > > > > > > > the
> > > > > > > > > > > FLIP
> > > > > > > > > > > > > > page.
> > > > > > > > > > > > > > > > ;-)
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I think we should allow incompatible
> > changes
> > > > for
> > > > > > the
> > > > > > > > > first
> > > > > > > > > > > one
> > > > > > > > > > > > or
> > > > > > > > > > > > > > two
> > > > > > > > > > > > > > > > > > versions, similar to how other major
> > features
> > > > > have
> > > > > > > > > evolved
> > > > > > > > > > > > > > recently,
> > > > > > > > > > > > > > > > such
> > > > > > > > > > > > > > > > > > as FLIP-27.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Would be great to get broader feedback on
> > > this
> > > > > one.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > > > Thomas
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Fri, Jan 28, 2022 at 1:18 PM Thomas
> > > Weise
> > > > <
> > > > > > > > > > > t...@apache.org
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Thanks for the feedback!
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > # 1 Flink Native vs Standalone integration
> > > > > > > > > > > > > > > > > > > > > > > Maybe we should make this more clear in the FLIP but we agreed to do
> > > > > > > > > > > > > > > > > > > > > > > the first version of the operator based on the native integration.
> > > > > > > > > > > > > > > > > > > > > > > While this clearly does not cover all use-cases and requirements, it
> > > > > > > > > > > > > > > > > > > > > > > seems this would lead to a much smaller initial effort and a nicer
> > > > > > > > > > > > > > > > > > > > > > > first version.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > I'm also leaning towards the native integration, as long as it reduces
> > > > > > > > > > > > > > > > > > > > > > the MVP effort. Ultimately the operator will need to also support the
> > > > > > > > > > > > > > > > > > > > > > standalone mode. I would like to gain more confidence that native
> > > > > > > > > > > > > > > > > > > > > > integration reduces the effort. While it cuts the effort to handle the
> > > > > > > > > > > > > > > > > > > > > > TM pod creation, some mapping code from the CR to the native
> > > > > > > > > > > > > > > > > > > > > > integration client and config needs to be created. As mentioned in the
> > > > > > > > > > > > > > > > > > > > > > FLIP, native integration requires the Flink job manager to have access
> > > > > > > > > > > > > > > > > > > > > > to the k8s API to create pods, which in some scenarios may be seen as
> > > > > > > > > > > > > > > > > > > > > > unfavorable.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > # Pod Template
> > > > > > > > > > > > > > > > > > > > > > > > > Is the pod template in the CR the same as what Flink already
> > > > > > > > > > > > > > > > > > > > > > > > > supports [4]? Then I am afraid that arbitrary fields (e.g.
> > > > > > > > > > > > > > > > > > > > > > > > > cpu/memory resources) could not take effect.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Yes, pod template would look almost identical. There are a few settings
> > > > > > > > > > > > > > > > > > > > > > that the operator will control (and that may need to be blacklisted),
> > > > > > > > > > > > > > > > > > > > > > but in general we would not want to place restrictions. I think a
> > > > > > > > > > > > > > > > > > > > > > mechanism where a pod template is merged from multiple layers would
> > > > > > > > > > > > > > > > > > > > > > also be interesting to make this more flexible.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > > > > > Thomas
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
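
On the layered pod template idea quoted above, here is a minimal sketch of
what such a merge could look like. It is only an illustration, not what the
operator does: the layer names, the field values, and the rule that the
operator-controlled layer is applied last are all assumptions, and lists are
simply replaced by the later layer rather than merged element by element.

```python
from copy import deepcopy


def merge_layers(*layers):
    """Merge dict-shaped pod templates; later layers win on conflicts."""
    merged = {}
    for layer in layers:
        # Work on a copy so the caller's layers are never mutated.
        _merge_into(merged, deepcopy(layer))
    return merged


def _merge_into(base, override):
    # Nested dicts are merged key by key; any other value (including lists)
    # is replaced wholesale by the later layer.
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            _merge_into(base[key], value)
        else:
            base[key] = value


# Hypothetical layers, lowest to highest precedence (all values are made up).
operator_defaults = {
    "spec": {"serviceAccountName": "flink", "nodeSelector": {"pool": "default"}}
}
user_template = {
    "spec": {"nodeSelector": {"pool": "streaming"}, "priorityClassName": "high"}
}
# Assumed stand-in for "settings that the operator will control".
operator_enforced = {"spec": {"restartPolicy": "Always"}}

pod_template = merge_layers(operator_defaults, user_template, operator_enforced)
print(pod_template)
```

Applying the operator-controlled layer last is one way to keep users from
overriding those settings without rejecting the rest of their template; a
validating blacklist would be an alternative.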
