Ismael Juma created KAFKA-3432:
----------------------------------

             Summary: Cluster.update() thread-safety
                 Key: KAFKA-3432
                 URL: https://issues.apache.org/jira/browse/KAFKA-3432
             Project: Kafka
          Issue Type: Improvement
            Reporter: Ismael Juma
            Priority: Critical
             Fix For: 0.10.0.0


A `Cluster.update()` method was introduced during the development of 0.10.0 so 
that `StreamPartitionAssignor` can add internal topics on-the-fly and give the 
augmented metadata to its underlying grouper.

`Cluster` was supposed to be immutable after construction, with all 
synchronization happening via the `Metadata` instance. As far as I can see, 
`Cluster.update()` is not thread-safe even though `Cluster` is accessed by 
multiple threads in some cases (I am not sure about the Streams case). Since 
this is a public API, I think it is important to fix this.

A few options I can think of:
* Since `PartitionAssignor` is an internal class, change 
`PartitionAssignor.assign` to return a class containing the assignments and 
optionally an updated cluster. This is straightforward, but I am not sure if 
it's good enough for the Streams use-case. Can you please confirm, [~guozhang]?
* Pass `Metadata` instead of `Cluster` to `PartitionAssignor.assign`, giving 
assignors the ability to update the metadata as needed.
* Make `Cluster` thread-safe in the face of mutations (without relying on 
synchronization at the `Metadata` level). This is not ideal: KAFKA-3428 shows 
that synchronization at the `Metadata` level is already too costly in 
high-concurrency situations.
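
To make the first option more concrete, here is a rough sketch of what 
"return the assignments and optionally an updated cluster" could look like. 
Every name in it (`ClusterSnapshot`, `AssignmentResult`, `SketchAssignor`, 
`withTopics`) is a hypothetical stand-in and not the real Kafka API, and the 
round-robin assignment is only a placeholder. The point is that the input 
cluster is never mutated, so no extra synchronization is needed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical immutable stand-in for Cluster; it only tracks topic names.
final class ClusterSnapshot {
    private final Set<String> topics;

    ClusterSnapshot(Set<String> topics) {
        // Defensive copy: later changes to the caller's set are not visible here.
        this.topics = Collections.unmodifiableSet(new HashSet<>(topics));
    }

    Set<String> topics() {
        return topics;
    }

    // Returns a new snapshot instead of mutating this one.
    ClusterSnapshot withTopics(Set<String> extra) {
        Set<String> merged = new HashSet<>(topics);
        merged.addAll(extra);
        return new ClusterSnapshot(merged);
    }
}

// Hypothetical return type: the assignments plus an optional augmented cluster.
final class AssignmentResult {
    private final Map<String, List<String>> assignments;
    private final Optional<ClusterSnapshot> updatedCluster;

    AssignmentResult(Map<String, List<String>> assignments,
                     Optional<ClusterSnapshot> updatedCluster) {
        this.assignments = Collections.unmodifiableMap(assignments);
        this.updatedCluster = updatedCluster;
    }

    Map<String, List<String>> assignments() {
        return assignments;
    }

    Optional<ClusterSnapshot> updatedCluster() {
        return updatedCluster;
    }
}

// Hypothetical assignor: adds internal topics to a copy of the cluster and
// spreads the topics (sorted by name) over the members round-robin.
final class SketchAssignor {
    static AssignmentResult assign(ClusterSnapshot cluster,
                                   List<String> members,
                                   Set<String> internalTopics) {
        if (members.isEmpty()) {
            throw new IllegalArgumentException("at least one member is required");
        }
        ClusterSnapshot augmented = cluster.withTopics(internalTopics);

        Map<String, List<String>> assignments = new HashMap<>();
        for (String member : members) {
            assignments.put(member, new ArrayList<>());
        }
        List<String> sortedTopics = new ArrayList<>(new TreeSet<>(augmented.topics()));
        for (int i = 0; i < sortedTopics.size(); i++) {
            String member = members.get(i % members.size());
            assignments.get(member).add(sortedTopics.get(i));
        }

        // Only report an updated cluster if the assignor actually added topics.
        Optional<ClusterSnapshot> updated = augmented.topics().equals(cluster.topics())
                ? Optional.empty()
                : Optional.of(augmented);
        return new AssignmentResult(assignments, updated);
    }
}
```

With something along these lines, the caller would decide whether and how to 
publish the updated cluster (for example through `Metadata`), and the 
assignor itself would never touch shared state.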

Thoughts [~guozhang], [~hachikuji]?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
