Flink sets the source parallelism independently of the number of partitions
in the topic(s) being read. It comes from the configured parallelism (a
job- or operator-level setting if there is one, otherwise the cluster
default, parallelism.default), without any regard for the source's native
parallelism. If there are fewer Kafka partitions than Flink source
subtasks, the extra source subtasks will be idle with nothing to do. If
the Flink source parallelism is lower than the number of Kafka partitions,
each source subtask will consume from multiple partitions, as needed.
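
To make the two cases concrete, here is a small standalone sketch. It
is only a simplified round-robin model, not Flink's actual assignment
code, and the class and method names (PartitionSpread, assign) are made
up for illustration. It does show the same outcome: empty subtasks when
partitions < parallelism, and several partitions per subtask when
partitions > parallelism.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionSpread {

    // Hypothetical helper (not a Flink API): spreads partition ids
    // 0..partitions-1 over `parallelism` subtasks in round-robin order.
    static List<List<Integer>> assign(int partitions, int parallelism) {
        List<List<Integer>> perSubtask = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            perSubtask.add(new ArrayList<>());
        }
        for (int p = 0; p < partitions; p++) {
            perSubtask.get(p % parallelism).add(p);
        }
        return perSubtask;
    }

    public static void main(String[] args) {
        // 3 partitions, parallelism 5: the last two subtasks get nothing.
        System.out.println(assign(3, 5));
        // 8 partitions, parallelism 3: each subtask reads several partitions.
        System.out.println(assign(8, 3));
    }
}
```

With hundreds of partitions, the point is the second case: a modest
source parallelism still covers every partition, each subtask just
reads more of them.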

On Fri, Dec 20, 2024 at 3:32 AM Guillermo Ortiz Fernández <
guillermo.ortiz.f...@gmail.com> wrote:

> I'm looking for how Flink defines parallelism for a Kafka source (
> https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/kafka/).
> How is it determined by default? Is it based on the number of partitions in
> the topic? I have some topics with hundreds of partitions, and such a high
> level of parallelism seems excessive.
>
