Re: [I] Kafka 4.x Queue Semantics support in Kafka Ingestion (druid)

via GitHub Mon, 13 Apr 2026 06:57:59 -0700


gianm commented on issue #18439:
URL: https://github.com/apache/druid/issues/18439#issuecomment-4236926773


   > The plan is to Share group related Supervisor, IndexTask and all utilities 
to support the share group implementation and shift all the kafka records state 
in broker side (not storing in the druid side - which can improve the druid 
performance, computation resources and complexity).
   
   The share group idea is interesting because it appears it would allow 
multiple tasks to process the same partition. This would be useful in 
situations where a single partition, or small set of partitions, has more data 
to process than the other partitions. It requires us to give up strict message 
ordering within a partition. This is generally not very important to the Druid 
use case, so giving that up doesn't seem like a big deal. However, I am not 
clear on how exactly-once processing is going to work.
   
   Currently we get exactly-once processing by storing Kafka consumer state in 
Druid's metadata store. When we publish segments, we update the consumer state 
in the same transaction. How would this work if consumer state is stored on the 
Kafka side?
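
   To make the current mechanism concrete, here is a minimal sketch of the 
idea, assuming an in-memory SQLite database as a stand-in for the metadata 
store. The table names (`segments`, `consumer_state`) and the helpers 
`open_store` and `publish` are hypothetical and chosen for illustration; this 
is not Druid's actual schema or code. The point is only that the segment insert 
and the offset update either both commit or both roll back:

```python
import sqlite3


def open_store():
    # In-memory stand-in for the metadata store; table names are hypothetical.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE segments (id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE consumer_state (part INTEGER PRIMARY KEY, next_offset INTEGER)"
    )
    conn.execute("INSERT INTO consumer_state VALUES (0, 0)")
    conn.commit()
    return conn


def publish(conn, segment_id, part, start_offset, end_offset):
    """Insert a segment and advance the stored offset in one transaction.

    Returns False, leaving the store unchanged, if the stored offset for
    the partition is not start_offset (for example, because another task
    already published this range).
    """
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO segments VALUES (?)", (segment_id,))
            cur = conn.execute(
                "UPDATE consumer_state SET next_offset = ? "
                "WHERE part = ? AND next_offset = ?",
                (end_offset, part, start_offset),
            )
            if cur.rowcount != 1:
                # Compare-and-swap failed: undo the segment insert as well.
                raise RuntimeError("stored offset does not match start_offset")
        return True
    except (RuntimeError, sqlite3.IntegrityError):
        return False


if __name__ == "__main__":
    conn = open_store()
    print(publish(conn, "seg-a", 0, 0, 100))  # first publish of offsets [0, 100)
    print(publish(conn, "seg-b", 0, 0, 100))  # replay of the same range
```

   In this sketch the second call is rejected and its segment insert is rolled 
back, so a replayed range never produces a duplicate segment. If the consumer 
state lived only on the Kafka side, the segment insert and the acknowledgement 
would be two separate commits, which is the gap the question above is asking 
about.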


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

