Hi Mason,

Thank you for the proposal. This is a highly requested feature to make
the source scaling of Flink Autoscaling generic across all sources.
The current implementation handles every source individually, and if
we don't find any backlog metrics, we default to using busy time only.
At this point Kafka is the only supported source. We collect the
backlog size (pending metrics), as well as the number of available
splits / partitions.

For Kafka, we always read from all splits but I like how for the
generic interface we take note of both assigned and unassigned splits.
This allows for more flexible integration with other sources where we
might have additional splits we read from at a later point in time.

Considering Rui's point, I agree it makes sense to outline the
integration with existing sources. Other than that, +1 from my side
for the proposal.

Thanks,
Max

On Fri, Nov 17, 2023 at 4:06 AM Rui Fan <1996fan...@gmail.com> wrote:
>
> Hi Mason,
>
> Thank you for driving this proposal!
>
> Currently, Autoscaler only supports the maximum source parallelism
> of KafkaSource. Introducing the generic metric to support it is good
> to me, +1 for this proposal.
>
> I have a question:
> You added the metric in the flink repo, and Autoscaler will fetch this
> metric. But I didn't see any connector to register this metric. Currently,
> only IteratorSourceEnumerator setUnassignedSplitsGauge,
> and KafkaSource didn't register it. IIUC, if we don't do it, autoscaler
> still cannot fetch this metric, right?
>
> If yes, I suggest this FLIP includes registering metric part, otherwise
> these metrics still cannot work.
>
> Please correct me if I misunderstood anything, thanks~
>
> Best,
> Rui
>
> On Fri, Nov 17, 2023 at 6:53 AM Mason Chen <mas.chen6...@gmail.com> wrote:
>
> > Hi all,
> >
> > I would like to start a discussion on FLIP-394: Add Metrics for Connector
> > Agnostic Autoscaling [1].
> >
> > This FLIP recommends adding two metrics to make autoscaling work for
> > bounded split source implementations like IcebergSource. These metrics are
> > required by the Flink Kubernetes Operator autoscaler algorithm [2] to
> > retrieve information for the backlog and the maximum source parallelism.
> > The changes would affect the `@PublicEvolving` `SplitEnumeratorMetricGroup`
> > API of the source connector framework.
> >
> > Best,
> > Mason
> >
> > [1]
> >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-394%3A+Add+Metrics+for+Connector+Agnostic+Autoscaling
> > [2]
> >
> > https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/autoscaler/#limitations
> >

Reply via email to