That's right - reading from replicas should not help significantly,
assuming an even distribution of leaders and an even distribution of
partition volume (average inbound messages/sec).
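
As a rough illustration of that argument, here is a small sketch that adds up the consumer read load per broker for leader-only reads and for reads spread over all replicas. Everything in it is an assumption made for the example, not something measured on a real cluster: round-robin replica placement, identical volume for every partition, and the function name `per_broker_read_load` itself.

```python
def per_broker_read_load(num_brokers, num_partitions, replication_factor,
                         volume_per_partition, fan_out, read_from_replicas):
    """Return the consumer read load served by each broker.

    Assumptions (hypothetical, for illustration only): replicas are placed
    round-robin, every partition carries the same volume, and every
    partition is read by `fan_out` consumers.
    """
    load = [0.0] * num_brokers
    for p in range(num_partitions):
        # Leader on broker p % B, followers on the brokers that follow it.
        replicas = [(p + i) % num_brokers for i in range(replication_factor)]
        total_reads = volume_per_partition * fan_out
        if read_from_replicas:
            # Spread the reads evenly over every replica of the partition.
            for broker in replicas:
                load[broker] += total_reads / replication_factor
        else:
            # All reads go to the leader only.
            load[replicas[0]] += total_reads
    return load


leader_only = per_broker_read_load(6, 60, 3, 3.0, 4, read_from_replicas=False)
with_replicas = per_broker_read_load(6, 60, 3, 3.0, 4, read_from_replicas=True)
print(leader_only)
print(with_replicas)
```

Under these assumptions both lists are identical: each broker already serves an equal share, so replica reads only move load around. The picture changes when leaders or partition volume are skewed.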

Theo's use case is a bit different, though: there you want to avoid
cross-zone consumer reads, especially if you have a high fan-out in the
number of consumers.
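
To make the cross-zone point concrete, here is a minimal sketch that counts, for a single partition, how many consumers have to fetch across a zone boundary. The setup is assumed for the example: three zones as in Theo's description, the leader in zone 0, the same number of consumers in every zone, and (for the replica case) one in-sync replica in each zone. `cross_zone_consumers` is a made-up helper, not a Kafka API.

```python
def cross_zone_consumers(num_zones, consumers_per_zone, replica_in_every_zone):
    """Count consumers of one partition that must fetch from another zone.

    Assumptions (hypothetical): the leader lives in zone 0, and when
    `replica_in_every_zone` is True each zone holds a replica that local
    consumers are allowed to fetch from.
    """
    leader_zone = 0
    crossing = 0
    for zone in range(num_zones):
        has_local_copy = (zone == leader_zone) or replica_in_every_zone
        if not has_local_copy:
            # Every consumer in this zone has to read from the leader's zone.
            crossing += consumers_per_zone
    return crossing


for fan_out in (1, 10, 100):
    leader_only = cross_zone_consumers(3, fan_out, replica_in_every_zone=False)
    local_reads = cross_zone_consumers(3, fan_out, replica_in_every_zone=True)
    print(fan_out, leader_only, local_reads)
```

With leader-only reads the number of cross-zone fetchers grows linearly with the fan-out, while with a local replica in each zone the only cross-zone traffic left is replication itself, which does not depend on the number of consumers.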

On Wed, May 27, 2015 at 05:56:56PM +0000, Aditya Auradkar wrote:
> Is that necessarily the case? On a cluster hosting partitions, assuming the 
> leaders are evenly distributed, every node should receive a roughly equal 
> share of the traffic. It does help a lot when the consumer throughput of a 
> single partition exceeds the capacity of a single leader but at that point 
> the topic ideally needs more partitions.
> 
> Aditya
> 
> ________________________________________
> From: James Cheng [jch...@tivo.com]
> Sent: Wednesday, May 27, 2015 10:50 AM
> To: users@kafka.apache.org
> Subject: Re: Is fetching from in-sync replicas possible?
> 
> On May 26, 2015, at 1:44 PM, Joel Koshy <jjkosh...@gmail.com> wrote:
> 
> >> Apologies if this question has been asked before. If I understand things
> >> correctly a client can only fetch from the leader of a partition, not from
> >> an (in-sync) replica. I have a use case where it would be very beneficial
> >> if it were possible to fetch from a replica instead of just the leader, and
> >> I wonder why it is not allowed? Are there any consistency problems with
> >> allowing it, for example? Is there any way to configure Kafka to allow it?
> >
> > Yes this should be possible.  I don't think there are any consistency
> > issues (barring any bugs) since we never expose past the
> > high-watermark and the follower HW is strictly <= leader HW. Can you
> > file a jira for this?
> >
> 
> Wouldn't this allow Kafka to scale to handle a lot more consumer traffic? 
> Currently, consumers all have to read from the leader, which means that the 
> network/disk bandwidth of a particular leader is the bottleneck. If consumers 
> could read from in-sync replicas, then a single node no longer is the 
> bottleneck for reads. You could scale out your read capacity as far as you 
> want.
> 
> -James
> 
> 
> >> The use case is a Kafka cluster running in EC2 across three availability
> >> zones.
> >
> > Out of curiosity - what's the typical latency (distribution) you see
> > between zones?
> >
> > Joel
> 
