[ 
https://issues.apache.org/jira/browse/FLINK-32119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18089131#comment-18089131
 ] 

Dennis-Mircea Ciupitu commented on FLINK-32119:
-----------------------------------------------

This will be partially covered by FLINK-39938 as it adds 
{{scaling.alignment.mode}} ({{BALANCED}} / {{EVENLY_SPREAD}} / {{OFF}} + plugin 
SPI), so the "make alignment configurable" part is handled.

The core multi-topic skew problem is still not fixed for sources that subscribe 
to multiple topics. The partition count is still collapsed to a single aggregate 
per source vertex in 
{{ScalingMetricCollector.updateKafkaPulsarSourceNumPartitions}} 
({{...collect(toSet()).size()}}), and a vertex can span multiple topics, so 
{{AlignmentContext.numSourcePartitions}} is one number with no per-topic 
breakdown that any mode could use.
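
To make the difference concrete, here is a minimal sketch of the aggregate 
count next to a per-topic count. This is not the operator's code: the class, 
the method names, and the metric-name pattern are assumptions chosen only for 
illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class PartitionCountSketch {
    // Assumed metric-name shape (hypothetical): ...topic.<topic>.partition.<n>.currentOffset
    static final Pattern NAME =
            Pattern.compile("topic\\.(.+)\\.partition\\.(\\d+)\\.currentOffset$");

    /** Aggregate view: one number per source vertex, topics are not distinguished. */
    static int aggregateCount(List<String> metricNames) {
        return metricNames.stream()
                .map(NAME::matcher)
                .filter(m -> m.find())
                .map(m -> m.group(1) + "-" + m.group(2))
                .collect(Collectors.toSet())
                .size();
    }

    /** Per-topic view: the same names, grouped by the topic captured from the name. */
    static Map<String, Long> perTopicCount(List<String> metricNames) {
        return metricNames.stream()
                .map(NAME::matcher)
                .filter(m -> m.find())
                .map(m -> Map.entry(m.group(1), m.group(2)))
                .distinct()
                .collect(Collectors.groupingBy(
                        Map.Entry::getKey, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> names = List.of(
                "src.KafkaSourceReader.topic.orders.partition.0.currentOffset",
                "src.KafkaSourceReader.topic.orders.partition.1.currentOffset",
                "src.KafkaSourceReader.topic.orders.partition.2.currentOffset",
                "src.KafkaSourceReader.topic.clicks.partition.0.currentOffset",
                "src.numRecordsIn"); // unrelated metric, ignored by the pattern
        System.out.println("aggregate = " + aggregateCount(names)); // aggregate = 4
        System.out.println("per topic = " + perTopicCount(names));  // per topic = {clicks=1, orders=3}
    }
}
```

Both methods read the same metric names; only the grouping step differs, 
which is why the per-topic variant would be a small change.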

The topic is already part of the metric names, so per-topic plumbing is cheap. 
Whether it helps depends on the connector's assignment strategy: the default 
Kafka assignment distributes the union of partitions round-robin, so the 
aggregate is already right for count skew. Data skew is a separate axis.
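
A small worked example of the round-robin case, as a sketch only. It assumes 
the union of all partitions is dealt out to subtasks in order; the class and 
helper names are hypothetical, and the real connector's assignment may differ.

```java
import java.util.Arrays;

class RoundRobinSkewSketch {
    /** Deals the union of all topic partitions round-robin; result is [subtask][topic]. */
    static int[][] assign(int[] partitionsPerTopic, int parallelism) {
        int[][] counts = new int[parallelism][partitionsPerTopic.length];
        int next = 0;
        for (int topic = 0; topic < partitionsPerTopic.length; topic++) {
            for (int p = 0; p < partitionsPerTopic[topic]; p++) {
                counts[next % parallelism][topic]++;
                next++;
            }
        }
        return counts;
    }

    /** Total number of partitions each subtask received, across all topics. */
    static int[] totalsPerSubtask(int[][] counts) {
        return Arrays.stream(counts).mapToInt(row -> Arrays.stream(row).sum()).toArray();
    }

    /** Number of partitions of one topic that each subtask received. */
    static int[] topicPerSubtask(int[][] counts, int topic) {
        return Arrays.stream(counts).mapToInt(row -> row[topic]).toArray();
    }

    /** Difference between the most and least loaded subtask. */
    static int spread(int[] values) {
        return Arrays.stream(values).max().orElse(0) - Arrays.stream(values).min().orElse(0);
    }

    public static void main(String[] args) {
        // Two topics with 3 and 5 partitions (8 in total), source parallelism 4.
        int[][] counts = assign(new int[] {3, 5}, 4);
        System.out.println("total   " + Arrays.toString(totalsPerSubtask(counts)));
        System.out.println("topic 0 " + Arrays.toString(topicPerSubtask(counts, 0)));
        System.out.println("topic 1 " + Arrays.toString(topicPerSubtask(counts, 1)));
    }
}
```

With 3 + 5 partitions and parallelism 4, every subtask ends up with 2 
partitions in total, so the aggregate shows no count skew, while each 
individual topic is spread unevenly (one subtask has none of the first topic 
and two of the second). Whether that matters depends on per-topic traffic, 
which is the data-skew axis.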

> Revise source partition skew logic 
> -----------------------------------
>
>                 Key: FLINK-32119
>                 URL: https://issues.apache.org/jira/browse/FLINK-32119
>             Project: Flink
>          Issue Type: Bug
>          Components: Autoscaler, Kubernetes Operator
>            Reporter: Maximilian Michels
>            Priority: Major
>
> After choosing the target parallelism for a vertex, we choose a higher 
> parallelism if that parallelism leads to evenly spreading the number of key 
> groups (=max parallelism).
> Sources don't have keyed state, so this adjustment does not make sense for 
> key groups. However, we internally limit the max parallelism of sources to 
> the number of partitions discovered. This prevents partition skew. 
> The partition skew logic currently doesn’t work correctly when there are 
> multiple topics because we use the total number of partitions discovered. 
> Using a single max parallelism doesn’t yield skew free partition distribution 
> then. However, this is also true for a single topic when the number of 
> partitions is a prime number or a not easily divisible number. 
> Hence, we should add an option to guarantee skew free partition distribution 
> which means using the total number of partitions when another configuration 
> is not possible. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
