Hi,

Here is a question related to parallelism of keyed-process-function that is
applied to the KeyedStream. For some code that looks like this

myStream.keyBy(...)
    .process(new MyKeyedProcessFunction())

    .process(<someOtherProcessFunction>).setParallelism(10)


On a Flink cluster with 5 TM nodes each with 10 task slots, and Job
parallelism = 5, the 5 subtasks of MyKeyedProcessFunction() do not get
distributed across all the 5 TM nodes evenly. Those typically get assigned
to one single TM node. However the 50 subtasks of <someOtherProcessFunction>
are always spread evenly across the 5 TMs.

Am I missing something? How can I get those MyKeyedProcessFunction()
subtasks distributed across all TM nodes evenly?

Thanks
Arti

Reply via email to