gianm commented on issue #18439: URL: https://github.com/apache/druid/issues/18439#issuecomment-4236926773
> The plan is to Share group related Supervisor, IndexTask and all utilities to support the share group implementation and shift all the kafka records state in broker side (not storing in the druid side - which can improve the druid performance, computation resources and complexity).

The share group idea is interesting because it appears it would allow multiple tasks to process the same partition. This would be useful in situations where a single partition, or a small set of partitions, has more data to process than the other partitions. It requires us to give up strict message ordering within a partition. This is generally not very important to the Druid use case, so giving that up doesn't seem like a big deal.

However, I am not clear on how exactly-once processing is going to work. Currently we get exactly-once processing by storing Kafka consumer state in Druid's metadata store. When we publish segments, we update the consumer state in the same transaction. How would this work if consumer state is stored on the Kafka side?
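
To make the question concrete, here is a minimal sketch of the pattern described above: segments and consumer offsets are committed in one metadata-store transaction, and the commit is conditional on the stored offsets matching the offsets the task started from. It uses SQLite as a stand-in metadata store. The table names, column names, and the `publish` function are hypothetical illustrations, not Druid's actual schema or code.

```python
import json
import sqlite3


def init_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the metadata store (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE segments (id TEXT PRIMARY KEY, datasource TEXT)")
    conn.execute(
        "CREATE TABLE consumer_state (datasource TEXT PRIMARY KEY, offsets TEXT)"
    )
    conn.commit()
    return conn


def publish(conn, datasource, segment_ids, start_offsets, end_offsets) -> bool:
    """Publish segments and advance consumer offsets atomically.

    Returns False, leaving the store unchanged, if the stored offsets do not
    match start_offsets (for example, because the same range was already
    published by another task or an earlier attempt).
    """
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            row = conn.execute(
                "SELECT offsets FROM consumer_state WHERE datasource = ?",
                (datasource,),
            ).fetchone()
            # Compare-and-swap check: only proceed from the expected position.
            if row is not None and json.loads(row[0]) != start_offsets:
                raise ValueError("stored offsets do not match start offsets")
            conn.executemany(
                "INSERT INTO segments (id, datasource) VALUES (?, ?)",
                [(segment_id, datasource) for segment_id in segment_ids],
            )
            conn.execute(
                "INSERT OR REPLACE INTO consumer_state (datasource, offsets) "
                "VALUES (?, ?)",
                (datasource, json.dumps(end_offsets)),
            )
        return True
    except (ValueError, sqlite3.IntegrityError):
        return False


conn = init_db()
first = publish(conn, "wiki", ["seg1"], {"0": 0}, {"0": 100})
replay = publish(conn, "wiki", ["seg2"], {"0": 0}, {"0": 100})
print(first, replay)
```

In this sketch the second call is rejected because the stored offsets have already moved past its starting point, so the same records cannot be published twice. If the consumer position lives on the Kafka broker instead, the offset update is no longer part of this transaction, which is the gap the question is asking about.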
