Wouldn't this work only for producers using random partitioning?

On Tue, Oct 14, 2014 at 1:51 PM, Kyle Banker <kyleban...@gmail.com> wrote:

> Consider a 12-node Kafka cluster with a 200-parition topic having a
> replication factor of 3. Let's assume, in addition, that we're running
> Kafka v0.8.2, we've disabled unclean leader election, acks is -1, and
> min.isr is 2.
>
> Now suppose we lose 2 nodes. In this case, there's a good chance that 2/3
> replicas of one or more partitions will be unavailable. This means that
> messages assigned to those partitions will not be writable. If we're
> writing a large number of messages, I would expect that all producers would
> eventually halt. It is somewhat surprising that, if we rely on a basic
> durability setting, the cluster would likely be unavailable even after
> losing only 2 / 12 nodes.
>
> It might be useful in this scenario for the producer to be able to detect
> which partitions are no longer available and reroute messages that would
> have hashed to the unavailable partitions (as defined by our acks and
> min.isr settings). This way, the cluster as a whole would remain available
> for writes at the cost of a slightly higher load on the remaining machines.
>
> Is this limitation accurately described? Is the proposed producer
> functionality worth pursuing?
>

Reply via email to