Well, it depends on your program.

The reason for the current task creating strategy are joins: If you have
two input topic that you want to join, the join happens on a
per-partition basis, ie, topic-p0 is joined to topicB-p0 etc and thus
both partitions must be assigned to the same task (to get co-partitioned
data processed together).

Note, that the following program would create independent tasks as it
consist of two independent sub-topologies:

builder.stream("topicA").filter().to();
builder.stream("topicB").filter().to();

However, the next program would be one sub-topology and thus we apply
the "join" rule (as we don't really know if you actually execute a join
or not when we create tasks):

KStream s1 = builder.stream("topicA");
builser.stream("topicB").merge(s1).filter().to();


Having said that, I agree that it would be a nice improvement to be more
clever about it. However, it not easy to do. There is actually a related
ticket: https://issues.apache.org/jira/browse/KAFKA-9282


Hope this helps.
  -Matthias

On 9/2/20 11:09 PM, Pushkar Deole wrote:
> Hi,
> 
> I came across articles where it is explained how parallelism is handled in
> kafka streams. This is what I collected:
> When the streams application is reading from multiple topics, the topic
> with maximum number of partitions is considered for instantiating stream
> tasks so 1 task is instantiated per partition.
> Now, if the stream task is reading from multiple topics then the partitions
> of multiple topics are shared among those stream tasks.
> 
> For example, Topic A and B has 5 partitions each then 5 tasks are
> instantiated and assigned to 5 stream threads where each task is assigned 1
> partition from Topic A and Topic B.
> 
> The question here is : if I would want 1 task to be created for each
> partition from the input topic then is this possible? e.g. I would want to
> have 5 tasks for topic A and 5 for B and then would want 10 threads to
> handle those. How can this be achieved?
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to