Why do you want to use a consumer group? If consumers in other jobs join
the same group, Kafka will split the topic's partitions among them, and
your Beam job will no longer receive all of the messages for the topic.
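To illustrate why this matters, here is a hedged sketch (plain Python, not Beam or Kafka client code; the function name and assignor details are illustrative) of how Kafka's default range assignor divides a topic's partitions among the members of one consumer group. Two independent jobs sharing a group.id each end up with only a subset of partitions:

```python
# Sketch: how a range-style assignor splits a 6-partition topic between
# two consumers that share one consumer group. This mimics the behavior
# of Kafka's default RangeAssignor; it is not real Kafka client API.

def range_assign(partitions, consumers):
    """Divide partitions contiguously among consumers, sorted by name."""
    consumers = sorted(consumers)
    per, extra = divmod(len(partitions), len(consumers))
    assignment, start = {}, 0
    for i, consumer in enumerate(consumers):
        count = per + (1 if i < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment

partitions = list(range(6))  # topic with 6 partitions
print(range_assign(partitions, ["beam-job", "other-job"]))
# {'beam-job': [0, 1, 2], 'other-job': [3, 4, 5]}
```

With two group members, the Beam job is handed only three of the six partitions; the messages on the other three go to the other job.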

> It seems the code attempts to circumvent the partition assignment
mechanism provided by Kafka to use its own.

All Beam I/Os for partitioned sources do this. They use direct access to
the partitioning structure of the underlying system to track their
progress through each partition, to provide feedback for scaling, and to
track and enforce exactly-once processing semantics. In fact, most
connectors for streaming data processing systems work this way; the
documentation for the Flink Kafka connector (
https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/kafka/#behind-the-scene)
notes that it likewise does not use or respect the group's partition
assignments.
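The core idea can be sketched in a few lines (illustrative Python only; the class and method names are made up, not the real KafkaIO or Flink internals): the runner, not Kafka's group coordinator, owns the per-partition offsets, stores them in checkpoint state, and on restart resumes from the exact position, so records are neither skipped nor double-processed.

```python
# Sketch: a source reader that tracks its own per-partition offset and
# exposes it as checkpoint state, instead of committing offsets through
# Kafka's group coordinator. Names are illustrative, not real Beam API.

class PartitionReader:
    def __init__(self, partition, start_offset=0):
        self.partition = partition
        self.offset = start_offset  # next offset to read

    def poll(self, log):
        """Read the next record, advancing the locally tracked offset."""
        if self.offset < len(log):
            record = log[self.offset]
            self.offset += 1
            return record
        return None

    def checkpoint(self):
        """State the runner persists; nothing is committed to Kafka."""
        return {"partition": self.partition, "offset": self.offset}

    @classmethod
    def restore(cls, state):
        return cls(state["partition"], state["offset"])

log = ["a", "b", "c", "d"]          # stand-in for one partition's log
reader = PartitionReader(partition=0)
reader.poll(log)                     # consumes "a"
reader.poll(log)                     # consumes "b"
state = reader.checkpoint()          # {'partition': 0, 'offset': 2}
restored = PartitionReader.restore(state)
print(restored.poll(log))            # "c" -- resumes exactly where it left off
```

Because the offset lives in checkpoint state rather than in a consumer group, the runner can split, redistribute, and replay partitions deterministically, which is what makes exactly-once semantics possible.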

> By doing that it prevents the user from using consumer groups.

Again, why do you (as a user) want to use consumer groups? What value does
it provide you?

-Daniel

On Sat, Apr 15, 2023 at 4:50 AM Shahar Frank <srf...@gmail.com> wrote:

> Hi All,
>
> Posting here as suggested here
> <https://github.com/apache/beam/issues/25978#issuecomment-1508530483>.
>
> I'm using KafkaIO to consume events from a Kafka topic.
> I've added "group.id" to the consumer properties.
> When running the pipeline I can see this value sent to Kafka in the
> consumer properties.
> The consumers created by KafkaIO fail to join the consumer group.
> Looking into the code I can see that nowhere does the consumer "subscribe"
> to the topic, which is how a KafkaConsumer joins a consumer group. It
> seems the code attempts to circumvent the partition assignment mechanism
> provided by Kafka to use its own.
> By doing that it prevents the user from using consumer groups.
> Is that by intention? Is there any reason why the decision to avoid using
> consumer groups has been taken?
> I would love to see any documentation about that if possible please.
>
> Cheers,
> Shahar.