The state machine is going to replace the annoying if-else in the reconciler. It seems to have no conflicts with modular mechanism(aka observer, reconciler, etc.). And we could make them happen step by step.
Best, Yang Gyula Fóra <gyula.f...@gmail.com> 于2022年3月1日周二 13:53写道: > And just one more thing I forgot to add: > > The observer does not have to be a dummy monolithic logic. Different job > types will have different state machines with different observe > requirements and different observer implementations. I think this is > complexly normal. The observer can be aware of the expected state and what > it should observe on the happy path. > > Gyula > > On Tue, 1 Mar 2022 at 06:42, Gyula Fóra <gyula.f...@gmail.com> wrote: > > > @Thomas: > > > > Thanks for the input! I completely agree with a well principled state > > machine based approach in the operator would be the best. > > > > You are right that the code contains mostly if then else based logic at > > the moment as it evolved after initial prototype to cover more scenarios. > > > > However I think the self contained observer - reconciler approach is very > > capable of implementing the suggested state machine based view in a very > > elegant way. > > > > Simply put, the observer is responsible for determining and recording the > > current state and any extra information that comes with that state. > > > > The reconciler then look at the current vs desired state and execute a > > state transition. > > > > Gyula > > > > On Tue, 1 Mar 2022 at 02:16, Thomas Weise <t...@apache.org> wrote: > > > >> Thanks for bringing the discussion. It's a good time to revisit this > >> area as the operator implementation becomes more complex. > >> > >> I think the most important part of the design are well defined states > >> and transitions. Unless that can be reasoned about, there will be > >> confusion. For example: > >> > >> 1) For job upgrade, it wasn't clear in which state reconciliation > >> should happen and as a result the implementation ended up incomplete. > >> 2) "JobStatusObserver" would attempt to list jobs before the REST API > >> is ready. And it would be used in session mode, even though the > >> session mode needs a different state machine. > >> > >> The implementation currently contains a lot of if-then-else logic, > >> which is hard to follow and difficult to maintain. It will make it > >> harder to introduce new states that would be necessary to implement > >> more advanced upgrade strategies, amongst others. > >> > >> Did you consider a state machine centric approach? See [1] for example. > >> > >> As Biao mentioned, observe -> reconcile may repeat and different > >> states will require different checks. Once a job (in job mode) is > >> running, there is no need to check the JM deployment etc. A monolithic > >> observer may not work so well for that. Rather, I think different > >> states have different monitoring needs that inform transitions, > >> including the actual state and changes made to the CR. > >> > >> It would also be good if the current state is directly reflected in > >> the CR status so that it is easier to check where the deployment is > >> at. > >> > >> Cheers, > >> Thomas > >> > >> > >> [1] > >> > https://github.com/lyft/flinkk8soperator/blob/master/docs/state_machine.md > >> > >> On Mon, Feb 28, 2022 at 8:16 AM Gyula Fóra <gyula.f...@gmail.com> > wrote: > >> > > >> > Hi All! > >> > > >> > Thank you for the feedback, I agree with what has been proposed to > >> include > >> > as much as possible in the actual resource status and make the > >> reconciler > >> > completely independent from the observer. > >> > > >> > @Biao Geng: > >> > Validation could depend on the current status (depending on how we > >> > implement the validation logic) so it might always be necessary (and > it > >> is > >> > also cheap). > >> > What you are saying with multiple observe -> reconcile cycles ties > back > >> to > >> > what Matyas said, that we should probably have an Observe loop until > we > >> > have a stable state ready for reconciliation, then reconcile once. > >> > > >> > So probably validate -> observe until stable -> reconcile -> observe > >> until > >> > stable > >> > > >> > Cheers, > >> > Gyula > >> > > >> > On Mon, Feb 28, 2022 at 4:49 PM Biao Geng <biaoge...@gmail.com> > wrote: > >> > > >> > > Hi Gyula, > >> > > Thanks for the discussion. It also makes senses to me on the > >> separation of > >> > > 3 components and Yang's proposal. > >> > > Just 1 follow-up thought after checking your PR: in the reconcile() > >> > > method of controller, IIUC, the real flow could be > >> > > `validate->observe->reconcile->validate->observe->reconcile...". The > >> > > validation phase seems to be required only when the creation of the > >> job > >> > > cluster and the upgrade of config. For phases like waiting the JM > from > >> > > deploying to ready, it is not mandatory and thus the flow can look > >> like > >> > > `validate->observe->reconcile->optional validate due to current > >> > > state->observe->reconcile...` > >> > > > >> > > Őrhidi Mátyás <matyas.orh...@gmail.com> 于2022年2月28日周一 21:26写道: > >> > > > >> > > > It is worth looking at the controller code in the spotify operator > >> too: > >> > > > > >> > > > > >> > > > >> > https://github.com/spotify/flink-on-k8s-operator/blob/master/controllers/flinkcluster/flinkcluster_controller.go > >> > > > > >> > > > It is looping in the 'observer phase' until it reaches a stable > >> state, > >> > > then > >> > > > it performs the necessary changes. > >> > > > > >> > > > Based on this I also suggest keeping the logic in separate > >> > > > modules(Validate->Observe->Reconcile). The three components might > >> not > >> > > > even be enough as we add more and more complexity to the code. > >> > > > > >> > > > Cheers, > >> > > > Matyas > >> > > > > >> > > > > >> > > > On Mon, Feb 28, 2022 at 2:03 PM Aitozi <gjying1...@gmail.com> > >> wrote: > >> > > > > >> > > > > Hi, Gyula > >> > > > > Thanks for driving this discussion. I second Yang Wang's > >> idea > >> > > that > >> > > > > it's better to make the `validator`, `observer` and `reconciler` > >> > > > > self-contained. I also prefer to define the `Observer` as an > >> interface > >> > > > and > >> > > > > we could define the statuses that `Observer` will expose. It > acts > >> like > >> > > > the > >> > > > > observer protocol between the `Observer` and `Reconciler`. > >> > > > > > >> > > > > Best, > >> > > > > Aitozi. > >> > > > > > >> > > > > Yang Wang <danrtsey...@gmail.com> 于2022年2月28日周一 16:28写道: > >> > > > > > >> > > > > > Thanks for posting the discussion here. > >> > > > > > > >> > > > > > > >> > > > > > Having the components `validator` `observer` `reconciler` > makes > >> lots > >> > > of > >> > > > > > sense. And the "Validate -> Observe -> Reconcile" > >> > > > > > flow seems natural to me. > >> > > > > > > >> > > > > > Regarding the implementation in the PR, instead of directly > >> using the > >> > > > > > observer in the reconciler, I lean to let the observer > >> > > > > > exports the results to the status(e.g. jobmanager deployment > >> status, > >> > > > rest > >> > > > > > port readiness, flink jobs status, etc.) and > >> > > > > > the reconciler reads it from the status. Then each component > is > >> more > >> > > > > > self-contained and the boundary will be clearer. > >> > > > > > > >> > > > > > > >> > > > > > Best, > >> > > > > > Yang > >> > > > > > > >> > > > > > Gyula Fóra <gyf...@apache.org> 于2022年2月28日周一 16:01写道: > >> > > > > > > >> > > > > > > Hi All! > >> > > > > > > > >> > > > > > > I would like to start a discussion thread regarding the > >> structure > >> > > of > >> > > > > > > the Kubernetes > >> > > > > > > Operator > >> > > > > > > < > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/controller/FlinkDeploymentController.java > >> > > > > > > > > >> > > > > > > controller > >> > > > > > > flow. Based on some recent PR discussions we have no clear > >> > > consensus > >> > > > on > >> > > > > > the > >> > > > > > > structure and the expectations which can potentially lead to > >> back > >> > > and > >> > > > > > forth > >> > > > > > > changes and unnecessary complexity. > >> > > > > > > > >> > > > > > > *Background* > >> > > > > > > In the initial prototype we had a very basic flow: > >> > > > > > > 1. Observe flink job status > >> > > > > > > 2. (if observation successful) reconcile changes > >> > > > > > > 3. Reschedule reconcile with success/error > >> > > > > > > > >> > > > > > > This basic prototype flow could not cover all requirements > >> and did > >> > > > not > >> > > > > > > allow for things like waiting until Jobmanager deployment is > >> ready > >> > > > etc. > >> > > > > > > > >> > > > > > > To solve these shortcomings, some changes were introduced > >> recently > >> > > > here > >> > > > > > > < > https://github.com/apache/flink-kubernetes-operator/pull/21 > >> >. > >> > > While > >> > > > > > this > >> > > > > > > change introduced many improvements and safeguards it also > >> > > completely > >> > > > > > > changed the original controller flow. Now the reconciler is > >> > > > responsible > >> > > > > > for > >> > > > > > > ensuring that it can actually reconcile by checking the > >> deployment > >> > > > and > >> > > > > > > ports. The job status observation logic has also been moved > >> into > >> > > the > >> > > > > > actual > >> > > > > > > reconcile logic. > >> > > > > > > > >> > > > > > > > >> > > > > > > *Discussion Question*What controller flow would we like to > >> have? Do > >> > > > we > >> > > > > > want > >> > > > > > > to separate the observer from the reconciler or keep them > >> together? > >> > > > > > > > >> > > > > > > In my personal view, we should try to adopt a very simple > >> flow to > >> > > > make > >> > > > > > the > >> > > > > > > operator clean and modular. If possible I would like to > >> restore the > >> > > > > > > original flow with some modifications: > >> > > > > > > > >> > > > > > > 1. Validate deployment object > >> > > > > > > 2. Observe deployment and flink job status -> Return > >> comprehensive > >> > > > > > status > >> > > > > > > info > >> > > > > > > 3. Reconcile deployment based on observed status and > resource > >> > > > changes > >> > > > > > > (Both 2/3 should be able to reschedule immediately if > >> necessary) > >> > > > > > > > >> > > > > > > I think the Observer component should be able to describe > the > >> > > current > >> > > > > > > status of the deployment objects and the flink job to the > >> extent > >> > > that > >> > > > > the > >> > > > > > > reconciler can work with that information alone. If we do it > >> this > >> > > > way, > >> > > > > we > >> > > > > > > can also use the status information that the observer > >> provides to > >> > > > > produce > >> > > > > > > other events and aid operations like shutdown which depend > on > >> the > >> > > > > current > >> > > > > > > deployment status. > >> > > > > > > > >> > > > > > > I think this would satisfy our needs, but I might be missing > >> > > > something > >> > > > > > that > >> > > > > > > cannot be done if we structure the code this way. > >> > > > > > > > >> > > > > > > I have a PR open > >> > > > > > > < > >> > > https://github.com/apache/flink-kubernetes-operator/pull/26/commits > >> > > > > > >> > > > > > > which > >> > > > > > > includes some of these proposed changes (as the optional > >> second > >> > > > commit) > >> > > > > > so > >> > > > > > > that you can easily compare with the current state of the > >> operator. > >> > > > > > > > >> > > > > > > Please let us know what we think! > >> > > > > > > > >> > > > > > > Cheers, > >> > > > > > > Gyula > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >