Just to give more context: my setup uses Apache Flink 1.18 with the
adaptive scheduler enabled, and the issues appear at random, particularly
in post-restart behavior.
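
For reference, the adaptive scheduler can also be enabled programmatically;
a minimal sketch of the equivalent configuration (using a local execution
environment purely for illustration, since in my deployment the scheduler
is configured cluster-side) would be:

    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.configuration.JobManagerOptions;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class AdaptiveSchedulerSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // Same effect as "jobmanager.scheduler: adaptive" in flink-conf.yaml
            conf.set(JobManagerOptions.SCHEDULER,
                    JobManagerOptions.SchedulerType.Adaptive);
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment(conf);
        }
    }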

After each restart, the logs show "Adding split(s) to reader:", indicating
that partitions are being reassigned across TaskManagers. The anomaly
concerns specific partitions, for example partition-10: it does not appear
in the logs immediately after the restart and remains absent for several
hours, during which no data is consumed from it. Only when the logs
eventually show "Discovered new partitions:" does consumption from
partition-10 resume.
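
In case it helps to reproduce, the source is built roughly along these
lines (a minimal sketch; the topic, group id, broker address, and the
explicit discovery interval are placeholders rather than my exact values):

    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.kafka.source.KafkaSource;
    import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class KafkaSourceSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            KafkaSource<String> source = KafkaSource.<String>builder()
                    .setBootstrapServers("broker:9092")   // placeholder
                    .setTopics("my-topic")                // placeholder
                    .setGroupId("my-group")               // placeholder
                    .setStartingOffsets(OffsetsInitializer.earliest())
                    // Periodic partition discovery; as far as I can tell, the
                    // "Discovered new partitions:" log line comes from this
                    // mechanism in the source enumerator.
                    .setProperty("partition.discovery.interval.ms", "60000")
                    .setValueOnlyDeserializer(new SimpleStringSchema())
                    .build();

            env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source")
                    .print();
            env.execute("kafka-source-sketch");
        }
    }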

Could you provide any insights or hypotheses regarding the underlying cause
of this delayed recognition and processing of certain partitions?

Best regards,
Yang

On Mon, 8 Jan 2024 at 16:24, Yang LI <yang.hunter...@gmail.com> wrote:

> Dear Flink Community,
>
> I've encountered an issue while testing my Flink autoscaler. It appears
> that Flink loses track of specific Kafka partitions, leading to a
> persistent increase in lag on those partitions. The logs indicate a
> 'kafka connector metricGroup name collision exception'. Notably,
> consumption from these partitions returns to normal after restarting the
> Kafka broker. For context, I have enabled in-place rescaling support
> with 'jobmanager.scheduler: Adaptive'.
>
> I suspect the problem may stem from:
>
> 1. The in-place rescaling support triggering restarts of some
> TaskManagers. This process might not be correctly re-registering the
> metric groups created by the Kafka source connector, leading to the name
> collision exception and preventing Flink from accurately reporting
> metrics related to Kafka consumption.
> 2. A potential edge case in the pending-records metric, especially when
> different partitions exhibit varying lags; this discrepancy might be
> causing the pending-records metric to malfunction.
> I would appreciate your insights on these observations.
>
> Best regards,
> Yang LI
>
