Re: [DISCUSS] Flink Kubernetes Operator controller flow

Yang Wang Tue, 01 Mar 2022 00:07:39 -0800

The state machine is going to replace the annoying if-else in the
reconciler.
It seems to have no conflicts with modular mechanism(aka observer,
reconciler, etc.).
And we could make them happen step by step.



Best,
Yang


Gyula Fóra <gyula.f...@gmail.com> 于2022年3月1日周二 13:53写道：

> And just one more thing I forgot to add:
>
> The observer does not have to be a dummy monolithic logic. Different job
> types will have different state machines with different observe
> requirements and different observer implementations. I think this is
> complexly normal. The observer can be aware of the expected state and what
> it should observe on the happy path.
>
> Gyula
>
> On Tue, 1 Mar 2022 at 06:42, Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> > @Thomas:
> >
> > Thanks for the input! I completely agree with a well principled state
> > machine based approach in the operator would be the best.
> >
> > You are right that the code contains mostly if then else based logic at
> > the moment as it evolved after initial prototype to cover more scenarios.
> >
> > However I think the self contained observer - reconciler approach is very
> > capable of implementing the suggested state machine based view in a very
> > elegant way.
> >
> > Simply put, the observer is responsible for determining and recording the
> > current state and any extra information that comes with that state.
> >
> > The reconciler then look at the current vs desired state and execute a
> > state transition.
> >
> > Gyula
> >
> > On Tue, 1 Mar 2022 at 02:16, Thomas Weise <t...@apache.org> wrote:
> >
> >> Thanks for bringing the discussion. It's a good time to revisit this
> >> area as the operator implementation becomes more complex.
> >>
> >> I think the most important part of the design are well defined states
> >> and transitions. Unless that can be reasoned about, there will be
> >> confusion. For example:
> >>
> >> 1) For job upgrade, it wasn't clear in which state reconciliation
> >> should happen and as a result the implementation ended up incomplete.
> >> 2) "JobStatusObserver" would attempt to list jobs before the REST API
> >> is ready. And it would be used in session mode, even though the
> >> session mode needs a different state machine.
> >>
> >> The implementation currently contains a lot of if-then-else logic,
> >> which is hard to follow and difficult to maintain. It will make it
> >> harder to introduce new states that would be necessary to implement
> >> more advanced upgrade strategies, amongst others.
> >>
> >> Did you consider a state machine centric approach? See [1] for example.
> >>
> >> As Biao mentioned, observe -> reconcile may repeat and different
> >> states will require different checks. Once a job (in job mode) is
> >> running, there is no need to check the JM deployment etc. A monolithic
> >> observer may not work so well for that. Rather, I think different
> >> states have different monitoring needs that inform transitions,
> >> including the actual state and changes made to the CR.
> >>
> >> It would also be good if the current state is directly reflected in
> >> the CR status so that it is easier to check where the deployment is
> >> at.
> >>
> >> Cheers,
> >> Thomas
> >>
> >>
> >> [1]
> >>
> https://github.com/lyft/flinkk8soperator/blob/master/docs/state_machine.md
> >>
> >> On Mon, Feb 28, 2022 at 8:16 AM Gyula Fóra <gyula.f...@gmail.com>
> wrote:
> >> >
> >> > Hi All!
> >> >
> >> > Thank you for the feedback, I agree with what has been proposed to
> >> include
> >> > as much as possible in the actual resource status and make the
> >> reconciler
> >> > completely independent from the observer.
> >> >
> >> > @Biao Geng:
> >> > Validation could depend on the current status (depending on how we
> >> > implement the validation logic) so it might always be necessary (and
> it
> >> is
> >> > also cheap).
> >> > What you are saying with multiple observe -> reconcile cycles ties
> back
> >> to
> >> > what Matyas said, that we should probably have an Observe loop until
> we
> >> > have a stable state ready for reconciliation, then reconcile once.
> >> >
> >> > So probably validate -> observe until stable -> reconcile -> observe
> >> until
> >> > stable
> >> >
> >> > Cheers,
> >> > Gyula
> >> >
> >> > On Mon, Feb 28, 2022 at 4:49 PM Biao Geng <biaoge...@gmail.com>
> wrote:
> >> >
> >> > > Hi Gyula,
> >> > > Thanks for the discussion. It also makes senses to me on the
> >> separation of
> >> > > 3 components and Yang's proposal.
> >> > > Just 1 follow-up thought after checking your PR: in the reconcile()
> >> > > method of controller, IIUC, the real flow could be
> >> > > `validate->observe->reconcile->validate->observe->reconcile...". The
> >> > > validation phase seems to be required only when the creation of the
> >> job
> >> > > cluster and the upgrade of config. For phases like waiting the JM
> from
> >> > > deploying to ready, it is not mandatory and thus the flow can look
> >> like
> >> > > `validate->observe->reconcile->optional validate due to current
> >> > > state->observe->reconcile...`
> >> > >
> >> > > Őrhidi Mátyás <matyas.orh...@gmail.com> 于2022年2月28日周一 21:26写道：
> >> > >
> >> > > > It is worth looking at the controller code in the spotify operator
> >> too:
> >> > > >
> >> > > >
> >> > >
> >>
> https://github.com/spotify/flink-on-k8s-operator/blob/master/controllers/flinkcluster/flinkcluster_controller.go
> >> > > >
> >> > > > It is looping in the 'observer phase' until it reaches a stable
> >> state,
> >> > > then
> >> > > > it performs the necessary changes.
> >> > > >
> >> > > > Based on this I also suggest keeping the logic in separate
> >> > > > modules(Validate->Observe->Reconcile). The three components might
> >> not
> >> > > > even be enough as we add more and more complexity to the code.
> >> > > >
> >> > > > Cheers,
> >> > > > Matyas
> >> > > >
> >> > > >
> >> > > > On Mon, Feb 28, 2022 at 2:03 PM Aitozi <gjying1...@gmail.com>
> >> wrote:
> >> > > >
> >> > > > > Hi, Gyula
> >> > > > >       Thanks for driving this discussion. I second Yang Wang's
> >> idea
> >> > > that
> >> > > > > it's better to make the `validator`, `observer` and `reconciler`
> >> > > > > self-contained. I also prefer to define the `Observer` as an
> >> interface
> >> > > > and
> >> > > > > we could define the statuses that `Observer` will expose. It
> acts
> >> like
> >> > > > the
> >> > > > > observer protocol between the `Observer` and `Reconciler`.
> >> > > > >
> >> > > > > Best,
> >> > > > > Aitozi.
> >> > > > >
> >> > > > > Yang Wang <danrtsey...@gmail.com> 于2022年2月28日周一 16:28写道：
> >> > > > >
> >> > > > > > Thanks for posting the discussion here.
> >> > > > > >
> >> > > > > >
> >> > > > > > Having the components `validator` `observer` `reconciler`
> makes
> >> lots
> >> > > of
> >> > > > > > sense. And the "Validate -> Observe -> Reconcile"
> >> > > > > > flow seems natural to me.
> >> > > > > >
> >> > > > > > Regarding the implementation in the PR, instead of directly
> >> using the
> >> > > > > > observer in the reconciler, I lean to let the observer
> >> > > > > > exports the results to the status(e.g. jobmanager deployment
> >> status,
> >> > > > rest
> >> > > > > > port readiness, flink jobs status, etc.) and
> >> > > > > > the reconciler reads it from the status. Then each component
> is
> >> more
> >> > > > > > self-contained and the boundary will be clearer.
> >> > > > > >
> >> > > > > >
> >> > > > > > Best,
> >> > > > > > Yang
> >> > > > > >
> >> > > > > > Gyula Fóra <gyf...@apache.org> 于2022年2月28日周一 16:01写道：
> >> > > > > >
> >> > > > > > > Hi All!
> >> > > > > > >
> >> > > > > > > I would like to start a discussion thread regarding the
> >> structure
> >> > > of
> >> > > > > > > the Kubernetes
> >> > > > > > > Operator
> >> > > > > > > <
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >>
> https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/controller/FlinkDeploymentController.java
> >> > > > > > > >
> >> > > > > > > controller
> >> > > > > > > flow. Based on some recent PR discussions we have no clear
> >> > > consensus
> >> > > > on
> >> > > > > > the
> >> > > > > > > structure and the expectations which can potentially lead to
> >> back
> >> > > and
> >> > > > > > forth
> >> > > > > > > changes and unnecessary complexity.
> >> > > > > > >
> >> > > > > > > *Background*
> >> > > > > > > In the initial prototype we had a very basic flow:
> >> > > > > > >  1. Observe flink job status
> >> > > > > > >  2. (if observation successful) reconcile changes
> >> > > > > > >  3. Reschedule reconcile with success/error
> >> > > > > > >
> >> > > > > > > This basic prototype flow could not cover all requirements
> >> and did
> >> > > > not
> >> > > > > > > allow for things like waiting until Jobmanager deployment is
> >> ready
> >> > > > etc.
> >> > > > > > >
> >> > > > > > > To solve these shortcomings, some changes were introduced
> >> recently
> >> > > > here
> >> > > > > > > <
> https://github.com/apache/flink-kubernetes-operator/pull/21
> >> >.
> >> > > While
> >> > > > > > this
> >> > > > > > > change introduced many improvements and safeguards it also
> >> > > completely
> >> > > > > > > changed the original controller flow. Now the reconciler is
> >> > > > responsible
> >> > > > > > for
> >> > > > > > > ensuring that it can actually reconcile by checking the
> >> deployment
> >> > > > and
> >> > > > > > > ports. The job status observation logic has also been moved
> >> into
> >> > > the
> >> > > > > > actual
> >> > > > > > > reconcile logic.
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > *Discussion Question*What controller flow would we like to
> >> have? Do
> >> > > > we
> >> > > > > > want
> >> > > > > > > to separate the observer from the reconciler or keep them
> >> together?
> >> > > > > > >
> >> > > > > > > In my personal view, we should try to adopt a very simple
> >> flow to
> >> > > > make
> >> > > > > > the
> >> > > > > > > operator clean and modular. If possible I would like to
> >> restore the
> >> > > > > > > original flow with some modifications:
> >> > > > > > >
> >> > > > > > >  1. Validate deployment object
> >> > > > > > >  2. Observe deployment and flink job status -> Return
> >> comprehensive
> >> > > > > > status
> >> > > > > > > info
> >> > > > > > >  3. Reconcile deployment based on observed status and
> resource
> >> > > > changes
> >> > > > > > >  (Both 2/3 should be able to reschedule immediately if
> >> necessary)
> >> > > > > > >
> >> > > > > > > I think the Observer component should be able to describe
> the
> >> > > current
> >> > > > > > > status of the deployment objects and the flink job to the
> >> extent
> >> > > that
> >> > > > > the
> >> > > > > > > reconciler can work with that information alone. If we do it
> >> this
> >> > > > way,
> >> > > > > we
> >> > > > > > > can also use the status information that the observer
> >> provides to
> >> > > > > produce
> >> > > > > > > other events and aid operations like shutdown which depend
> on
> >> the
> >> > > > > current
> >> > > > > > > deployment status.
> >> > > > > > >
> >> > > > > > > I think this would satisfy our needs, but I might be missing
> >> > > > something
> >> > > > > > that
> >> > > > > > > cannot be done if we structure the code this way.
> >> > > > > > >
> >> > > > > > > I have a PR open
> >> > > > > > > <
> >> > > https://github.com/apache/flink-kubernetes-operator/pull/26/commits
> >> > > > >
> >> > > > > > > which
> >> > > > > > > includes some of these proposed changes (as the optional
> >> second
> >> > > > commit)
> >> > > > > > so
> >> > > > > > > that you can easily compare with the current state of the
> >> operator.
> >> > > > > > >
> >> > > > > > > Please let us know what we think!
> >> > > > > > >
> >> > > > > > > Cheers,
> >> > > > > > > Gyula
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >>
> >
>

Re: [DISCUSS] Flink Kubernetes Operator controller flow

Reply via email to