Thanks for the heads-up, Guozhang! The problem is that our brokers are on 0.10.0.x, so we will have to upgrade them.
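
In the meantime, here is roughly how we are planning to wire Eno's socket buffer suggestion (quoted below) into our Streams setup. This is just a sketch against the 0.10.x Streams API; the application id, bootstrap servers, and topic names are placeholders, not our real config.

import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStreamBuilder;

public class SocketBufferSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder application id and broker list.
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "lag-test-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        // Assuming string keys/values on the placeholder topics.
        props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

        // Eno's suggestion: 1 MB socket buffers for the internal producer and
        // consumer, to cope with high network latency between zones.
        props.put(ProducerConfig.SEND_BUFFER_CONFIG, 1024 * 1024);
        props.put(ConsumerConfig.RECEIVE_BUFFER_CONFIG, 1024 * 1024);

        // Trivial pass-through topology; "input-topic" and "output-topic"
        // are placeholders for our real topics.
        KStreamBuilder builder = new KStreamBuilder();
        builder.stream("input-topic").to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder, props);
        streams.start();
    }
}

We will also need to check that the OS-level limits (net.core.rmem_max / net.core.wmem_max on Linux) allow buffers that large, as Eno notes below.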
On Sat, Mar 18, 2017 at 12:30 AM, Guozhang Wang <wangg...@gmail.com> wrote:

> Hi Mahendra,
>
> Just a kind reminder that upgrading Streams to 0.10.2 does not necessarily
> require you to upgrade brokers to 0.10.2 as well. In 0.10.2 we added a new
> feature that allows newer-versioned clients (producer, consumer, streams)
> to talk to older-versioned brokers; for Streams specifically it only
> requires brokers to be no older than 0.10.1.
>
> Guozhang
>
> On Mon, Mar 13, 2017 at 5:12 AM, Mahendra Kariya <mahendra.kar...@go-jek.com> wrote:
>
> > We are planning to migrate to the newer version of Kafka. But that's a
> > few weeks away.
> >
> > We will try setting the socket config and see how it turns out.
> >
> > Thanks a lot for your response!
> >
> > On Mon, Mar 13, 2017 at 3:21 PM, Eno Thereska <eno.there...@gmail.com> wrote:
> >
> > > Thanks,
> > >
> > > A couple of things:
> > >
> > > - I’d recommend moving to 0.10.2 (latest release) if you can, since
> > > several improvements were made in the last two releases that make
> > > rebalancing and performance better.
> > >
> > > - When running in environments with large latency, on AWS at least
> > > (haven’t tried Google cloud), one setting we have found useful for
> > > increasing performance is the receive and send socket buffer size for
> > > the consumer and producer in streams. We’d recommend setting them to
> > > 1MB like this (where “props” is your own properties object when you
> > > start streams):
> > >
> > > // The socket buffer needs to be large, especially when running in AWS
> > > // with high latency. If running locally the default is fine.
> > > props.put(ProducerConfig.SEND_BUFFER_CONFIG, 1024 * 1024);
> > > props.put(ConsumerConfig.RECEIVE_BUFFER_CONFIG, 1024 * 1024);
> > >
> > > Make sure the OS allows the larger socket size too.
> > >
> > > Thanks
> > > Eno
> > >
> > > > On Mar 13, 2017, at 9:21 AM, Mahendra Kariya <mahendra.kar...@go-jek.com> wrote:
> > > >
> > > > Hi Eno,
> > > >
> > > > Please find my answers inline.
> > > >
> > > > We are in the process of documenting capacity planning for streams,
> > > > stay tuned.
> > > >
> > > > This would be great! Looking forward to it.
> > > >
> > > > Could you send some more info on your problem? What Kafka version
> > > > are you using?
> > > >
> > > > We are using Kafka 0.10.0.0.
> > > >
> > > > Are the VMs on the same or different hosts?
> > > >
> > > > The VMs are on Google Cloud. Two of them are in asia-east1-a and one
> > > > is in asia-east1-c. All three are n1-standard-4 Ubuntu instances.
> > > >
> > > > Also what exactly do you mean by “the lag keeps fluctuating”, what
> > > > metric are you looking at?
> > > >
> > > > We are looking at Kafka Manager for the time being. By fluctuating, I
> > > > mean the lag is a few thousand at one time, we refresh it the next
> > > > second, it is in a few lakhs, and we refresh it again and it is a few
> > > > thousand. I understand this may not be very accurate. We will soon
> > > > have more accurate data once we start pushing the consumer lag metric
> > > > to Datadog.
> > > >
> > > > But on a separate note, the difference between lags on different
> > > > partitions is way too high. I have attached a tab-separated file
> > > > which shows the consumer lag (from Kafka Manager) for the first 50
> > > > partitions. As is clear, the lag on partition 2 is 530 while the lag
> > > > on partition 18 is 23K. Note that the same VM is pulling data from
> > > > both partitions.
> > > >
> > > > <KafkaLags.tsv>
>
> --
> -- Guozhang