[ 
https://issues.apache.org/jira/browse/KAFKA-13136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17507968#comment-17507968
 ] 

Chris Egerton commented on KAFKA-13136:
---------------------------------------

This is not a bug in Kafka Connect per se, but rather a side effect of the 
default consumer partition assignor, which at the moment is the 
[RangeAssignor|https://github.com/apache/kafka/blob/fbe7fb941173c0907792a8b48e8e9122aabecbd8/clients/src/main/java/org/apache/kafka/clients/consumer/RangeAssignor.java].
That assignor handles each topic independently and always starts from the 
first consumer, so when a topic has fewer partitions than there are consumers, 
its first partition goes to the first consumer, its second partition to the 
second consumer, etc., and the remaining consumers get nothing from that topic. 
This makes it particularly ill-suited for consumer groups (or multi-task sink 
connectors) that read from a large number of small topics, or more generally, 
any situation where the number of consumers (or sink tasks) is greater than the 
maximum number of partitions in any single topic being consumed: the surplus 
consumers (or sink tasks) are left without any partitions.
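
To make the effect concrete, here is a rough Python sketch of range-style 
assignment applied to the numbers from the issue description. It is a 
simplified model, not the actual {{RangeAssignor}} code: it assumes every 
member subscribes to every topic, and the function names, task ids, and topic 
names are made up for illustration.

```python
def range_assign(members, topics):
    """Simplified range assignment: each topic is split on its own into
    contiguous ranges over the sorted members, starting from the first one.

    members: list of member ids (hypothetical task names)
    topics:  dict of topic name -> partition count
    """
    ordered = sorted(members)
    assignment = {m: [] for m in ordered}
    for topic, num_partitions in sorted(topics.items()):
        per_member, extra = divmod(num_partitions, len(ordered))
        start = 0
        for i, member in enumerate(ordered):
            # The first `extra` members get one additional partition.
            count = per_member + (1 if i < extra else 0)
            assignment[member].extend((topic, p) for p in range(start, start + count))
            start += count
    return assignment


def active_members(assignment):
    """Number of members that received at least one partition."""
    return sum(1 for partitions in assignment.values() if partitions)


def tasks(n):
    """Hypothetical, zero-padded task ids so that sorting is stable."""
    return [f"task-{i:02d}" for i in range(n)]


two_topics = {"topic-a": 10, "topic-b": 10}
three_topics = {"topic-a": 10, "topic-b": 10, "topic-c": 15}

print(active_members(range_assign(tasks(12), two_topics)))    # 10
print(active_members(range_assign(tasks(12), three_topics)))  # 12
print(active_members(range_assign(tasks(17), three_topics)))  # 15
```

Under these assumptions the number of active tasks is capped by the largest 
per-topic partition count, which matches the behavior reported in the ticket.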

This should be automatically addressed once 
[KIP-726|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=177048248]/KAFKA-12473
 get merged, as that will cause the default partition assignor to be updated to 
the 
[CooperativeStickyAssignor|https://github.com/apache/kafka/blob/fbe7fb941173c0907792a8b48e8e9122aabecbd8/clients/src/main/java/org/apache/kafka/clients/consumer/CooperativeStickyAssignor.java],
 which has more intelligent assignment logic.

Until then, or if running on older versions of Kafka Connect that in turn use 
older versions of the Kafka clients library which still use the 
{{RangeAssignor}} as the default, there are a few options to work around this 
problem:
 # You can change the default partition assignor for every connector on the 
worker by setting the {{consumer.partition.assignment.strategy}} property in 
your Kafka Connect worker config file.
 # You can configure the partition assignor on a per-connector basis by setting 
the {{consumer.override.partition.assignment.strategy}} property in your 
connector config (as long as the worker is configured with a connector client 
override policy that permits this, which should be possible with the default 
override policy as of 3.0.0).
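
As a rough illustration of the two options above, the following Python 
snippet just prints what each setting might look like. The connector name, 
connector class, topic names, and {{tasks.max}} value are placeholders I made 
up; only the two property names and the assignor class name come from Kafka 
itself.

```python
import json

# Fully qualified class name of the assignor to switch to.
ROUND_ROBIN = "org.apache.kafka.clients.consumer.RoundRobinAssignor"

# Option 1: worker-wide default, as a line for the worker config file.
worker_property = f"consumer.partition.assignment.strategy={ROUND_ROBIN}"

# Option 2: per-connector override, as part of the connector config.
# Everything except the override key and its value is a hypothetical placeholder.
connector_config = {
    "name": "example-sink",
    "config": {
        "connector.class": "com.example.ExampleSinkConnector",
        "tasks.max": "12",
        "topics": "topic-a,topic-b",
        "consumer.override.partition.assignment.strategy": ROUND_ROBIN,
    },
}

print(worker_property)
print(json.dumps(connector_config, indent=2))
```

Only one of the two is needed. The per-connector form is the less invasive 
choice if other connectors on the same worker should keep their current 
assignor.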

As for which assignor to use: if all you need is a guarantee that the spread 
of partitions across sink tasks is as even as possible, then you can use 
the 
[RoundRobinAssignor|https://github.com/apache/kafka/blob/fbe7fb941173c0907792a8b48e8e9122aabecbd8/clients/src/main/java/org/apache/kafka/clients/consumer/RoundRobinAssignor.java].
 Further recommendations are probably outside the scope of this ticket, but 
you can do some research of your own starting with the 
[docs for the consumer partition.assignment.strategy 
property|https://kafka.apache.org/31/documentation.html#consumerconfigs_partition.assignment.strategy].
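
For comparison, here is a minimal Python sketch of round-robin assignment on 
the same numbers as in the issue description. Again, this is a simplified 
model that assumes every member subscribes to every topic, not the actual 
{{RoundRobinAssignor}} code, and the task ids and topic names are made up.

```python
from itertools import cycle


def round_robin_assign(members, topics):
    """Simplified round-robin assignment: lay out every partition of every
    topic in one list and deal them out to the sorted members in turn.

    members: list of member ids (hypothetical task names)
    topics:  dict of topic name -> partition count
    """
    ordered = sorted(members)
    assignment = {m: [] for m in ordered}
    all_partitions = [
        (topic, p) for topic, count in sorted(topics.items()) for p in range(count)
    ]
    for member, topic_partition in zip(cycle(ordered), all_partitions):
        assignment[member].append(topic_partition)
    return assignment


tasks = [f"task-{i:02d}" for i in range(12)]   # hypothetical task ids
topics = {"topic-a": 10, "topic-b": 10}        # 20 partitions in total

assignment = round_robin_assign(tasks, topics)
sizes = [len(partitions) for partitions in assignment.values()]

print(min(sizes), max(sizes))  # 1 2
```

With 12 tasks and 20 partitions, every task gets at least one partition and no 
task gets more than one partition above any other, whereas the range-style 
split leaves 2 of the 12 tasks idle.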

Given that there is a known workaround for this issue and an approved, 
permanent fix in the works for an upcoming release, this may be safe to close. 
[~raphaelauv] thoughts?

> kafka-connect task.max : active task in consumer group is limited by the 
> bigger topic to consume
> ------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-13136
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13136
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: raphaelauv
>            Priority: Major
>
> In kafka-connect 2.7
> *The maximum number of active tasks for a sink connector is equal to the 
> partition count of the topic with the most partitions to consume*
> An active task is a task that has partitions assigned to it in the consumer 
> group of the sink connector
> Example:
> With 2 topics that each have 10 partitions (20 partitions in total), the 
> maximum number of active tasks is 10 (if I set tasks.max to 12, there are 
> 10 members of the consumer group consuming partitions and 2 members of the 
> consumer group that have no partitions to consume).
> If I add a third topic with 15 partitions to the connector config, then all 12 
> members of the consumer group consume partitions, and if I then set 
> tasks.max to 17, only 15 members are active in the consumer group.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
