Thanks Mason for your feedback and update! The sources you listed look good to me, +1 for this proposal!
Best, Rui On Sat, Nov 18, 2023 at 3:38 AM Mason Chen <mas.chen6...@gmail.com> wrote: > Also, it looks like externalizing the Hive connector is unblocked based on > the past email thread. https://issues.apache.org/jira/browse/FLINK-30064 > seems to have some progress and perhaps we shouldn't touch it for now. > > On Fri, Nov 17, 2023 at 11:00 AM Mason Chen <mas.chen6...@gmail.com> > wrote: > > > Hi Rui and Max, > > > > Thanks for the feedback! > > > > If yes, I suggest this FLIP includes registering metric part, otherwise > >> these metrics still cannot work. > > > > Yup, you understood it correctly. I'll add that to the required list of > > work. Note that I'll include only FLIP-27 sources in the Flink repo: > > FileSource, HybridSource, NumberSequenceSource, DataGeneratorSource, and > > HiveSource. > > > > I think externalized sources could be implemented outside of this FLIP, > as > > other sources would have to wait for a Flink minor release and it > wouldn't > > make sense to track something like IcebergSource here since it isn't part > > of the Flink project. KafkaSource is a special case since there is a lot > of > > usage and people use it also to verify Flink releases, so I'm open > > to tracking that here too. > > > > In addition, after some further thought on the `setAssignedSplitsGauge` > > metric, I think we need to track this in the SourceReaders. The reason is > > that splits can be "completed" and this is only tracked by readers. There > > are some poll based sources that only take 1 split and poll for another > > when completed, but we cannot make that assumption in general (i.e. > request > > split when a split is completed). So, this needs to be tracked in the > > reader. > > > > Best, > > Mason > > > > On Fri, Nov 17, 2023 at 2:39 AM Maximilian Michels <m...@apache.org> > wrote: > > > >> Hi Mason, > >> > >> Thank you for the proposal. This is a highly requested feature to make > >> the source scaling of Flink Autoscaling generic across all sources. > >> The current implementation handles every source individually, and if > >> we don't find any backlog metrics, we default to using busy time only. > >> At this point Kafka is the only supported source. We collect the > >> backlog size (pending metrics), as well as the number of available > >> splits / partitions. > >> > >> For Kafka, we always read from all splits but I like how for the > >> generic interface we take note of both assigned and unassigned splits. > >> This allows for more flexible integration with other sources where we > >> might have additional splits we read from at a later point in time. > >> > >> Considering Rui's point, I agree it makes sense to outline the > >> integration with existing sources. Other than that, +1 from my side > >> for the proposal. > >> > >> Thanks, > >> Max > >> > >> On Fri, Nov 17, 2023 at 4:06 AM Rui Fan <1996fan...@gmail.com> wrote: > >> > > >> > Hi Mason, > >> > > >> > Thank you for driving this proposal! > >> > > >> > Currently, Autoscaler only supports the maximum source parallelism > >> > of KafkaSource. Introducing the generic metric to support it is good > >> > to me, +1 for this proposal. > >> > > >> > I have a question: > >> > You added the metric in the flink repo, and Autoscaler will fetch this > >> > metric. But I didn't see any connector to register this metric. > >> Currently, > >> > only IteratorSourceEnumerator setUnassignedSplitsGauge, > >> > and KafkaSource didn't register it. IIUC, if we don't do it, > autoscaler > >> > still cannot fetch this metric, right? > >> > > >> > If yes, I suggest this FLIP includes registering metric part, > otherwise > >> > these metrics still cannot work. > >> > > >> > Please correct me if I misunderstood anything, thanks~ > >> > > >> > Best, > >> > Rui > >> > > >> > On Fri, Nov 17, 2023 at 6:53 AM Mason Chen <mas.chen6...@gmail.com> > >> wrote: > >> > > >> > > Hi all, > >> > > > >> > > I would like to start a discussion on FLIP-394: Add Metrics for > >> Connector > >> > > Agnostic Autoscaling [1]. > >> > > > >> > > This FLIP recommends adding two metrics to make autoscaling work for > >> > > bounded split source implementations like IcebergSource. These > >> metrics are > >> > > required by the Flink Kubernetes Operator autoscaler algorithm [2] > to > >> > > retrieve information for the backlog and the maximum source > >> parallelism. > >> > > The changes would affect the `@PublicEvolving` > >> `SplitEnumeratorMetricGroup` > >> > > API of the source connector framework. > >> > > > >> > > Best, > >> > > Mason > >> > > > >> > > [1] > >> > > > >> > > > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-394%3A+Add+Metrics+for+Connector+Agnostic+Autoscaling > >> > > [2] > >> > > > >> > > > >> > https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/autoscaler/#limitations > >> > > > >> > > >