Hi Chesney, Just to double check with you, OperatorCoordinatorMetricGroup (annotated as @PublicEvolving) has already been introduced into Flink by FLIP-179 <https://cwiki.apache.org/confluence/display/FLINK/FLIP-179%3A+Expose+Standardized+Operator+Metrics>. And that FLIP has got you +1.. Do you mean we should remove this OperatorCoordinatorMetricGroup?
Regards, Dong On Sat, Dec 10, 2022 at 1:33 AM Chesnay Schepler <ches...@apache.org> wrote: > As a whole I feel like this FLIP is overly complicated. A dedicated > coordinator MG implementation is overkill; it could just re-use the > existing Task/OperatorMGs to create the same structure we have on TMs, > similar to what we did with the Job MG. > > However, I'm not convinced that this is required anyway, because all the > example metrics you listed can be implemented on the TM side + > aggregating them in the external metrics backend. > > Since I'm on holidays soon, just so no one tries to pull a fast one on > me, if this were to go to a vote as-is I'd be against it. > > > On 09/12/2022 15:30, Dong Lin wrote: > > Hi Hang, > > > > Thanks for the FLIP! The FLIP looks good and it is pretty informative. > > > > I have just two minor comments regarding names: > > - Would it be useful to rename the config key as > > *metrics.scope.jm.job.operator-coordinator* for consistency with > > *metrics.scope.jm.job > > *(which is not named as *jm-job)? > > - Maybe rename the variable as SCOPE_NAMING_OPERATOR_COORDINATOR for > > simplicity and consistency with SCOPE_NAMING_OPERATOR (which is not named > > as SCOPE_NAMING_TM_JOB_OPERATOR)? > > > > Cheers, > > Dong > > > > > > > > On Thu, Dec 8, 2022 at 3:28 PM Hang Ruan <ruanhang1...@gmail.com> wrote: > > > >> Hi all, > >> > >> MengYue and I created FLIP-274[1] Introduce metric group for > >> OperatorCoordinator. OperatorCoordinator is the coordinator for runtime > >> operators and running on Job Manager. The coordination mechanism is > >> operator events between OperatorCoordinator and its all operators, the > >> coordination is more and more using in Flink, for example many Sources > and > >> Sinks depend on the mechanism to assign splits and coordinate commits to > >> external systems. The OperatorCoordinator is widely using in flink kafka > >> connector, flink pulsar connector, flink cdc connector, flink hudi > >> connector and so on. > >> > >> But there is not a suitable metric group scope for the > OperatorCoordinator > >> and not an implementation for the interface > OperatorCoordinatorMetricGroup. > >> These metrics in OperatorCoordinator could be how many splits/partitions > >> have been assigned to source readers, how many files have been written > out > >> by sink writers, these metrics not only help users to know the job > progress > >> but also make big job maintaining easier. Thus we propose the FLIP-274 > to > >> introduce a new metric group scope for OperatorCoordinator and provide > an > >> internal implementation for OperatorCoordinatorMetricGroup. > >> > >> Could you help review this FLIP when you get time? Any feedback is > >> appreciated! > >> > >> Best, > >> Hang > >> > >> [1] > >> > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-274%3A+Introduce+metric+group+for+OperatorCoordinator > >> > >