Hi Peng Yuan!

While I do agree that the savepoint path is a very important production
configuration, there are a lot of other things that come to my mind:
 - savepoint dir
 - checkpoint dir
 - checkpoint interval/timeout
 - high availability settings (provider/storagedir etc)

just to name a few...

While these are all production critical, they have nice clean Flink config
settings to go with them. If we start introducing these to the JobSpec, we
only get confusion about priority order etc., and it is going to be hard to
change or remove them in the future. In any case, we should validate that
these configs exist in cases where users use a stateful upgrade mode, for
example. This is something we need to add for sure.
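
As a rough illustration only (not taken from the prototype): the settings
listed above already map onto standard Flink options, so they could be passed
through a generic config map instead of dedicated spec fields. The
`flinkConfiguration` nesting and all paths and values below are placeholders
and assumptions.

```yaml
# Sketch under assumptions: the `flinkConfiguration` key and every value
# shown here are placeholders, not the prototype's actual CRD layout.
flinkConfiguration:
  state.savepoints.dir: s3://my-bucket/savepoints     # savepoint dir
  state.checkpoints.dir: s3://my-bucket/checkpoints   # checkpoint dir
  execution.checkpointing.interval: 60s               # checkpoint interval
  execution.checkpointing.timeout: 10min              # checkpoint timeout
  # high availability settings (provider / storage dir)
  high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
  high-availability.storageDir: s3://my-bucket/ha
```

With a layout like this, the validation mentioned above would amount to
checking that keys such as `state.savepoints.dir` are present whenever a
stateful upgrade mode is requested.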

As for the other options you mentioned like automatic savepoint generation
for instance, those deserve an independent discussion of their own I
believe :)

Cheers,
Gyula

On Tue, Feb 15, 2022 at 11:23 AM K Fred <yuanpengf...@gmail.com> wrote:

> Hi Matyas!
>
> Thanks for your reply!
> For scenarios 1 and 3, I couldn't agree more with the podTemplate solution;
> I missed this part.
> For savepoint-related configuration, I think it's very important that it
> can be specified in the JobSpec, because a savepoint is a very common
> setting when upgrading a job; if it is placed in the JobSpec, it is obvious
> to the user how to configure it. In addition, other advanced properties can
> be put into flinkConfiguration, to be customized by expert users.
> A set of savepoint-related properties could be as follows:
>
> > fromSavepoint: Job restart from
> >
> > autoSavepointSecond: Automatically take a savepoint to the `savepointsDir`
> > every n seconds.
> >
> > savepointsDir: Savepoints dir where to store automatically taken
> > savepoints
> >
> > savepointGeneration: Update savepoint generation of job status for a
> > running job (should be defined in JobStatus)
>
>
> Best wishes,
> Peng Yuan.
>
> On Tue, Feb 15, 2022 at 4:41 PM Őrhidi Mátyás <matyas.orh...@gmail.com>
> wrote:
>
> > Hi Peng,
> >
> > Thanks for your feedback. Regarding scenarios 1 and 3, the podTemplate
> > functionality in the operator could cover both. We also need to be careful
> > about introducing proxy parameters in the CRD spec. The savepoint path, for
> > example, is usually accompanied by a bunch of other configurations, so
> > users need to use configuration params anyway. What do you think?
> >
> > Best,
> > Matyas
> >
> > On Tue, Feb 15, 2022 at 8:58 AM K Fred <yuanpengf...@gmail.com> wrote:
> >
> > > Hi Gyula!
> > >
> > > I have reviewed the prototype design of flink-kubernetes-operator you
> > > submitted, and I have the following questions:
> > >
> > > 1. Can support for pulling a Flink JAR package from a sidecar be added
> > > to the JobSpec? Just like this:
> > >
> > > > initContainers:
> > > >       - name: downloader
> > > >         image: curlimages/curl
> > > >         env:
> > > >           - name: JAR_URL
> > > >             value: https://repo1.maven.org/maven2/org/apache/flink/flink-examples-streaming_2.12/1.14.3/flink-examples-streaming_2.12-1.14.3-WordCount.jar
> > > >           - name: DEST_PATH
> > > >             value: /cache/flink-app.jar
> > > >         command: ['sh', '-c', 'curl -o ${DEST_PATH} ${JAR_URL}']
> > >
> > > 2. Can we add a savepoint path property to the job specification?
> > > 3. Can we add extra ports to the JobManagerSpec and TaskManagerSpec to
> > > expose services such as Prometheus? The property could look like this:
> > >
> > > > extraPorts:
> > > >       - name: prom
> > > >         containerPort: 9249
> > >
> > >
> > >
> > > Best wishes,
> > > Peng Yuan
> > >
> > > On Tue, Feb 15, 2022 at 12:23 AM Gyula Fóra <gyf...@apache.org> wrote:
> > >
> > > > Hi Flink Devs!
> > > >
> > > > We would like to present to you the first prototype of the
> > > > flink-kubernetes-operator that was built based on the FLIP and the
> > > > discussion on this mail thread. We would also like to call out some
> > > > design decisions that we have made regarding architecture components
> > > > that were not explicitly mentioned in the FLIP document/thread and give
> > > > you the opportunity to raise any concerns here.
> > > >
> > > > You can find the initial prototype here:
> > > > https://github.com/apache/flink-kubernetes-operator/pull/1
> > > >
> > > > We will leave the PR open for 1-2 days before merging to let people
> > > > comment on it, but please be mindful that this is an initial prototype
> > > > with many rough edges. It is not intended to be a complete
> > > > implementation of the FLIP specs as that will take some more work from
> > > > all of us :)
> > > >
> > > >
> > > > *Prototype feature set:*
> > > > The prototype contains a basic working version of the
> > > > flink-kubernetes-operator that supports deployment and lifecycle
> > > > management of a stateful native Flink application. We have basic
> > > > support for stateful and stateless upgrades, UI ingress, pod templates
> > > > etc. Error handling at this point is largely missing.
> > > >
> > > >
> > > > *Features / design decisions that were not explicitly discussed in
> > > > this thread*
> > > >
> > > > *Basic Admission control using a Webhook*
> > > > Standard resource admission control in Kubernetes to validate and
> > > > potentially reject resources is done through Webhooks.
> > > >
> > > > https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/
> > > >
> > > > This is a necessary mechanism to give the user an upfront error when an
> > > > incorrect resource was submitted. In the Flink operator's case we need
> > > > to validate that the FlinkDeployment yaml actually makes sense and does
> > > > not contain erroneous config options that would inevitably lead to
> > > > deployment/job failures.
> > > >
> > > > We have implemented a simple webhook that we can use for this type of
> > > > validation, as a separate Maven module (flink-kubernetes-webhook). The
> > > > webhook is an optional component and can be enabled or disabled during
> > > > deployment. To avoid pulling in new external dependencies we have used
> > > > the Flink Shaded Netty module to build the simple REST endpoint
> > > > required. If the community feels that Netty adds unnecessary complexity
> > > > to the webhook implementation we are open to alternative backends such
> > > > as Spring Boot, for instance, which would practically eliminate all the
> > > > boilerplate.
> > > >
> > > >
> > > > *Helm Chart for deployment*
> > > > Helm charts provide an industry standard way of managing Kubernetes
> > > > deployments. We have created a helm chart prototype that can be used to
> > > > deploy the operator together with all required resources. The helm
> > > > chart allows easy configuration for things like images, namespaces etc
> > > > and flags to control specific parts of the deployment such as RBAC or
> > > > the webhook.
> > > >
> > > > The helm chart provided is intended to be a first version that worked
> > > > for us during development but we expect to have a lot of iterations on
> > > > it based on the feedback from the community.
> > > >
> > > > *Acknowledgment*
> > > > We would like to thank everyone who has provided support and valuable
> > > > feedback on this FLIP.
> > > > We would also like to thank Yang Wang & Alexis Sarda-Espinosa
> > > > specifically for making their operators open source and available to
> > > > us which had a big impact on the FLIP and the prototype.
> > > >
> > > > We are looking forward to continuing development on the operator
> > > > together with the broader community.
> > > > All work will be tracked using the ASF Jira from now on.
> > > >
> > > > Cheers,
> > > > Gyula
> > > >
> > > > > On Mon, Feb 14, 2022 at 9:21 AM K Fred <yuanpengf...@gmail.com>
> > > > > wrote:
> > > >
> > > > > Hi Gyula,
> > > > >
> > > > > Thanks!
> > > > > > It's great to see the project getting started and I can't wait to
> > > > > > see the PR and start contributing code.😄😄😄
> > > > >
> > > > > Best Wishes!
> > > > > Peng Yuan
> > > > >
> > > > > > On Mon, Feb 14, 2022 at 4:14 PM Gyula Fóra <gyula.f...@gmail.com>
> > > > > > wrote:
> > > > >
> > > > > > Hi Peng Yuan!
> > > > > >
> > > > > > The repo is already created:
> > > > > > https://github.com/apache/flink-kubernetes-operator
> > > > > >
> > > > > > > We will open the PR with the initial prototype later today, stay
> > > > > > > tuned in this thread! :)
> > > > > >
> > > > > > Cheers,
> > > > > > Gyula
> > > > > >
> > > > > > > On Mon, Feb 14, 2022 at 9:09 AM K Fred <yuanpengf...@gmail.com>
> > > > > > > wrote:
> > > > > >
> > > > > > > Hi All,
> > > > > > >
> > > > > > > > Has the flink-kubernetes-operator project been created on
> > > > > > > > GitHub?
> > > > > > >
> > > > > > > Peng Yuan
> > > > > > >
> > > > > > > > On Wed, Feb 9, 2022 at 1:23 AM Gyula Fóra <gyula.f...@gmail.com>
> > > > > > > > wrote:
> > > > > > >
> > > > > > > > I agree with flink-kubernetes-operator as the repo name :)
> > > > > > > > Don't have any better idea
> > > > > > > >
> > > > > > > > Gyula
> > > > > > > >
> > > > > > > > > On Sat, Feb 5, 2022 at 2:41 AM Thomas Weise <t...@apache.org>
> > > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > > Thanks for the continued feedback and discussion. Looks like
> > > > > > > > > > we are ready to start a VOTE, I will initiate it shortly.
> > > > > > > > > >
> > > > > > > > > > In parallel it would be good to find the repository name.
> > > > > > > > > >
> > > > > > > > > > My suggestion would be: flink-kubernetes-operator
> > > > > > > > > >
> > > > > > > > > > I thought "flink-operator" could be a bit misleading since
> > > > > > > > > > the term operator already has a meaning in Flink.
> > > > > > > > > >
> > > > > > > > > > I also considered "flink-k8s-operator" but that would be
> > > > > > > > > > almost identical to existing operator implementations and
> > > > > > > > > > could lead to confusion in the future.
> > > > > > > > >
> > > > > > > > > Thoughts?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Thomas
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > On Fri, Feb 4, 2022 at 5:15 AM Gyula Fóra
> > > > > > > > > > <gyula.f...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > Hi Danny,
> > > > > > > > > >
> > > > > > > > > > > So far we have been focusing our dev efforts on the
> > > > > > > > > > > initial native implementation with the team.
> > > > > > > > > > > If the discussion and vote goes well for this FLIP we are
> > > > > > > > > > > looking forward to contributing the initial version
> > > > > > > > > > > sometime next week (fingers crossed).
> > > > > > > > > > >
> > > > > > > > > > > At that point I think we can already start the dev work
> > > > > > > > > > > to support the standalone mode as well, especially if you
> > > > > > > > > > > can dedicate some effort to pushing that side.
> > > > > > > > > > > Working together on this sounds like a great idea and we
> > > > > > > > > > > should start as soon as possible! :)
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > > Gyula
> > > > > > > > > >
> > > > > > > > > > > On Fri, Feb 4, 2022 at 2:07 PM Danny Cranmer
> > > > > > > > > > > <dannycran...@apache.org> wrote:
> > > > > > > > > >
> > > > > > > > > > > > I have been discussing this one with my team. We are
> > > > > > > > > > > > interested in the Standalone mode, and are willing to
> > > > > > > > > > > > contribute towards the implementation.
> > > > > > > > > > > > Potentially we can work together to support both modes
> > > > > > > > > > > > in parallel?
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > >
> > > > > > > > > > > > On Wed, Feb 2, 2022 at 4:02 PM Gyula Fóra
> > > > > > > > > > > > <gyula.f...@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Danny!
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for the feedback :)
> > > > > > > > > > > >
> > > > > > > > > > > > > Versioning:
> > > > > > > > > > > > > Versioning will be independent from Flink and the
> > > > > > > > > > > > > operator will depend on a fixed Flink version (in
> > > > > > > > > > > > > every given operator version).
> > > > > > > > > > > > > This should be the exact same setup as with Stateful
> > > > > > > > > > > > > Functions (https://github.com/apache/flink-statefun).
> > > > > > > > > > > > > So independent release cycle but still within the
> > > > > > > > > > > > > Flink umbrella.
> > > > > > > > > > > >
> > > > > > > > > > > > > Deployment error handling:
> > > > > > > > > > > > > I think that's a very good point, as general
> > > > > > > > > > > > > exception handling for the different failure
> > > > > > > > > > > > > scenarios is a tricky problem. I think the exception
> > > > > > > > > > > > > classifiers and retry strategies could avoid a lot of
> > > > > > > > > > > > > manual intervention from the user. We will definitely
> > > > > > > > > > > > > need to add something like this. Once we have the
> > > > > > > > > > > > > repo created with the initial operator code we should
> > > > > > > > > > > > > open some tickets for this and put it on the short
> > > > > > > > > > > > > term roadmap!
> > > > > > > > > > > >
> > > > > > > > > > > > Cheers,
> > > > > > > > > > > > Gyula
> > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Feb 2, 2022 at 4:50 PM Danny Cranmer
> > > > > > > > > > > > > <dannycran...@apache.org> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hey team,
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Great work on the FLIP, I am looking forward to this
> > > > > > > > > > > > > > one. I agree that we can move forward to the voting
> > > > > > > > > > > > > > stage.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I have general feedback around how we will handle
> > > > > > > > > > > > > > job submission failure and retry. As discussed in
> > > > > > > > > > > > > > the Rejected Alternatives section, we can use Java
> > > > > > > > > > > > > > to handle job submission failures from the Flink
> > > > > > > > > > > > > > client. It would be useful to have the ability to
> > > > > > > > > > > > > > configure exception classifiers and retry strategy
> > > > > > > > > > > > > > as part of operator configuration.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Given this will be in a separate GitHub repository
> > > > > > > > > > > > > > I am curious how the versioning strategy will work
> > > > > > > > > > > > > > in relation to the Flink version? Do we have any
> > > > > > > > > > > > > > other components with a similar setup I can look
> > > > > > > > > > > > > > at? Will the operator version track Flink or will
> > > > > > > > > > > > > > it use its own versioning strategy with a Flink
> > > > > > > > > > > > > > version support matrix, or similar?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, Feb 1, 2022 at 2:33 PM Márton Balassi
> > > > > > > > > > > > > > <balassi.mar...@gmail.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi team,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thank you for the great feedback, Thomas has
> > > > > > > > > > > > > > > updated the FLIP page accordingly. If you are
> > > > > > > > > > > > > > > comfortable with the currently existing design
> > > > > > > > > > > > > > > and depth in the FLIP [1] I suggest moving
> > > > > > > > > > > > > > > forward to the voting stage - once that reaches a
> > > > > > > > > > > > > > > positive conclusion it lets us create the
> > > > > > > > > > > > > > > separate code repository under the flink project
> > > > > > > > > > > > > > > for the operator.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I encourage everyone to keep improving the
> > > > > > > > > > > > > > > details in the meantime, however I believe given
> > > > > > > > > > > > > > > the existing design and the general sentiment on
> > > > > > > > > > > > > > > this thread that the most efficient path from
> > > > > > > > > > > > > > > here is starting the implementation so that we
> > > > > > > > > > > > > > > can collectively iterate over it.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-212%3A+Introduce+Flink+Kubernetes+Operator
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Mon, Jan 31, 2022 at 10:15 PM Thomas Weise
> > > > > > > > > > > > > > > <t...@apache.org> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi Xintong,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks for the feedback and please see
> > > > > > > > > > > > > > > > responses below -->
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Fri, Jan 28, 2022 at 12:21 AM Xintong Song
> > > > > > > > > > > > > > > > <tonysong...@gmail.com> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks Thomas for drafting this FLIP, and
> > > > > > > > > > > > > > > > > everyone for the discussion.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I also have a few questions and comments.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > ## Job Submission
> > > > > > > > > > > > > > > > > Deploying a Flink session cluster via kubectl
> > > > > > > > > > > > > > > > > & CR and then submitting jobs to the cluster
> > > > > > > > > > > > > > > > > via Flink cli / REST is probably the approach
> > > > > > > > > > > > > > > > > that requires the least effort. However, I'd
> > > > > > > > > > > > > > > > > like to point out 2 weaknesses.
> > > > > > > > > > > > > > > > > 1. A lot of users use Flink in
> > > > > > > > > > > > > > > > > perjob/application modes. For these users,
> > > > > > > > > > > > > > > > > having to run the job in two steps (deploy
> > > > > > > > > > > > > > > > > the cluster, and submit the job) is not that
> > > > > > > > > > > > > > > > > convenient.
> > > > > > > > > > > > > > > > > 2. One of our motivations is being able to
> > > > > > > > > > > > > > > > > manage Flink applications' lifecycles with
> > > > > > > > > > > > > > > > > kubectl. Submitting jobs from cli sounds not
> > > > > > > > > > > > > > > > > aligned with this motivation.
> > > > > > > > > > > > > > > > > I think it's probably worth it to support
> > > > > > > > > > > > > > > > > submitting jobs via kubectl & CR in the first
> > > > > > > > > > > > > > > > > version, both together with deploying the
> > > > > > > > > > > > > > > > > cluster like in perjob/application mode and
> > > > > > > > > > > > > > > > > after deploying the cluster like in session
> > > > > > > > > > > > > > > > > mode.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > The intention is to support application
> > > > > > > > > > > > > > > > management through operator and CR, which means
> > > > > > > > > > > > > > > > there won't be any 2 step submission process,
> > > > > > > > > > > > > > > > which as you allude to would defeat the purpose
> > > > > > > > > > > > > > > > of this project. The CR example shows the
> > > > > > > > > > > > > > > > application part. Please note that the bare
> > > > > > > > > > > > > > > > cluster support is an *additional* feature for
> > > > > > > > > > > > > > > > scenarios that require external job management.
> > > > > > > > > > > > > > > > Is there anything on the FLIP page that creates
> > > > > > > > > > > > > > > > a different impression?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > ## Versioning
> > > > > > > > > > > > > > > > > Which Flink versions does the operator plan
> > > > > > > > > > > > > > > > > to support?
> > > > > > > > > > > > > > > > > 1. Native K8s deployment was first introduced
> > > > > > > > > > > > > > > > > in Flink 1.10
> > > > > > > > > > > > > > > > > 2. Native K8s HA was introduced in Flink 1.12
> > > > > > > > > > > > > > > > > 3. The Pod template support was introduced in
> > > > > > > > > > > > > > > > > Flink 1.13
> > > > > > > > > > > > > > > > > 4. There were some changes to the Flink
> > > > > > > > > > > > > > > > > docker image entrypoint script in, IIRC,
> > > > > > > > > > > > > > > > > Flink 1.13
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Great, thanks for providing this. It is
> > > > > > > > > > > > > > > > important for the compatibility going forward
> > > > > > > > > > > > > > > > also. We are targeting Flink 1.14.x upwards.
> > > > > > > > > > > > > > > > Before the operator is ready there will be
> > > > > > > > > > > > > > > > another Flink release. Let's see if anyone is
> > > > > > > > > > > > > > > > interested in earlier versions?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > ## Compatibility
> > > > > > > > > > > > > > > > > What kind of API compatibility can we commit
> > > > > > > > > > > > > > > > > to? It's probably fine to have alpha / beta
> > > > > > > > > > > > > > > > > version APIs that allow incompatible future
> > > > > > > > > > > > > > > > > changes for the first version. But eventually
> > > > > > > > > > > > > > > > > we would need to guarantee backwards
> > > > > > > > > > > > > > > > > compatibility, so that an early version CR
> > > > > > > > > > > > > > > > > can work with a new version operator.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Another great point and please let me include
> > > > > > > > > > > > > > > > that on the FLIP page. ;-)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I think we should allow incompatible changes
> > > > > > > > > > > > > > > > for the first one or two versions, similar to
> > > > > > > > > > > > > > > > how other major features have evolved recently,
> > > > > > > > > > > > > > > > such as FLIP-27.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Would be great to get broader feedback on this
> > > > > > > > > > > > > > > > one.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > Thomas
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Fri, Jan 28, 2022 at 1:18 PM Thomas Weise
> > > > > > > > > > > > > > > > > <t...@apache.org> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks for the feedback!
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > # 1 Flink Native vs Standalone integration
> > > > > > > > > > > > > > > > > > > Maybe we should make this more clear in
> > > > > > > > > > > > > > > > > > > the FLIP but we agreed to do the first
> > > > > > > > > > > > > > > > > > > version of the operator based on the
> > > > > > > > > > > > > > > > > > > native integration.
> > > > > > > > > > > > > > > > > > > While this clearly does not cover all
> > > > > > > > > > > > > > > > > > > use-cases and requirements, it seems this
> > > > > > > > > > > > > > > > > > > would lead to a much smaller initial
> > > > > > > > > > > > > > > > > > > effort and a nicer first version.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I'm also leaning towards the native
> > > > > > > > > > > > > > > > > > integration, as long as it reduces the MVP
> > > > > > > > > > > > > > > > > > effort. Ultimately the operator will need
> > > > > > > > > > > > > > > > > > to also support the standalone mode. I
> > > > > > > > > > > > > > > > > > would like to gain more confidence that
> > > > > > > > > > > > > > > > > > native integration reduces the effort.
> > > > > > > > > > > > > > > > > > While it cuts the effort to handle the TM
> > > > > > > > > > > > > > > > > > pod creation, some mapping code from the CR
> > > > > > > > > > > > > > > > > > to the native integration client and config
> > > > > > > > > > > > > > > > > > needs to be created. As mentioned in the
> > > > > > > > > > > > > > > > > > FLIP, native integration requires the Flink
> > > > > > > > > > > > > > > > > > job manager to have access to the k8s API
> > > > > > > > > > > > > > > > > > to create pods, which in some scenarios may
> > > > > > > > > > > > > > > > > > be seen as unfavorable.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > # Pod Template
> > > > > > > > > > > > > > > > > > > > > Is the pod template in the CR the same
> > > > > > > > > > > > > > > > > > > > > as what Flink already supports[4]?
> > > > > > > > > > > > > > > > > > > > > If so, I am afraid that not every
> > > > > > > > > > > > > > > > > > > > > arbitrary field (e.g. cpu/memory
> > > > > > > > > > > > > > > > > > > > > resources) could take effect.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Yes, pod template would look almost
> > > > > > > > > > > > > > > > > > identical. There are a few settings that
> > > > > > > > > > > > > > > > > > the operator will control (and that may
> > > > > > > > > > > > > > > > > > need to be blacklisted), but in general we
> > > > > > > > > > > > > > > > > > would not want to place restrictions. I
> > > > > > > > > > > > > > > > > > think a mechanism where a pod template is
> > > > > > > > > > > > > > > > > > merged from multiple layers would also be
> > > > > > > > > > > > > > > > > > interesting to make this more flexible.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > > Thomas
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
