Hi,

Of all the parameters, num.replica.fetchers is the one most likely to
help; keeping it higher, e.g. at 4, should make a difference.
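
For example, the change in server.properties would just be this (a sketch,
assuming you keep your other defaults):

    num.replica.fetchers=4

This adds replica fetcher threads per source broker, so followers can pull
from the leaders in parallel and are less likely to fall behind under load.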
Please try it out and let us know if it works.

Thanks,
Prabhjot
On Nov 28, 2015 4:59 PM, "Andreas Flinck" <andreas.fli...@digitalroute.com> wrote:

> Hi!
>
> Here are our settings for the properties requested:
>
> num.network.threads=3
> socket.request.max.bytes=104857600
> socket.receive.buffer.bytes=1048576
> socket.send.buffer.bytes=1048576
>
> The following properties we don't set at all, so I assume they take the
> documented defaults (shown in parentheses):
>
> "num.replica.fetchers": (1)
> "replica.fetch.wait.max.ms": (500)
> "num.recovery.threads.per.data.dir": (1)
>
> The producer properties we explicitly set are the following:
>
> block.on.buffer.full=false
> client.id=MZ
> max.request.size=1048576
> acks=all
> retries=0
> timeout.ms=30000
> buffer.memory=67108864
> metadata.fetch.timeout.ms=3000
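>
> For completeness, here is roughly how we wire these into the new producer
> (a sketch; the bootstrap servers and serializers below are placeholders,
> not our real values):
>
> import java.util.Properties;
> import org.apache.kafka.clients.producer.KafkaProducer;
>
> Properties props = new Properties();
> props.put("bootstrap.servers", "broker1:9092"); // placeholder
> props.put("key.serializer",
>     "org.apache.kafka.common.serialization.ByteArraySerializer");
> props.put("value.serializer",
>     "org.apache.kafka.common.serialization.ByteArraySerializer");
> props.put("client.id", "MZ");
> props.put("acks", "all");
> props.put("retries", "0");
> props.put("timeout.ms", "30000");
> props.put("buffer.memory", "67108864");
> props.put("max.request.size", "1048576");
> props.put("block.on.buffer.full", "false");
> props.put("metadata.fetch.timeout.ms", "3000");
> KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props);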
>
> Do let me know what you think about it! We are currently setting up some
> tests using the broker properties that you suggested.
>
> Regards
> Andreas
>
> ________________________________________
> From: Prabhjot Bharaj <prabhbha...@gmail.com>
> Sent: 28 November 2015 11:37
> To: users@kafka.apache.org
> Subject: Re: What is the benefit of using acks=all and min.insync.replicas over e.g. acks=3
>
> Hi,
>
> Clogging can happen if, as seems to be the case here, the requests are
> network-bound.
> Just to confirm your configuration: does your broker config look like
> this?
>
> "num.replica.fetchers": 4,
> "replica.fetch.wait.max.ms": 500,
> "num.recovery.threads.per.data.dir": 4,
>
>
> "num.network.threads": 8,
> "socket.request.max.bytes": 104857600,
> "socket.receive.buffer.bytes": 10485760,
> "socket.send.buffer.bytes": 10485760,
>
> Similarly, please share your producer config as well. I suspect this may
> come down to tuning your cluster.
>
> Thanks,
> Prabhjot
>
>
> On Sat, Nov 28, 2015 at 3:54 PM, Andreas Flinck <andreas.fli...@digitalroute.com> wrote:
>
> > Great, thanks for the information! So it is definitely acks=all we want
> > to go for. Unfortunately, we have run into a blocking issue in our
> > production-like test environment that we have not been able to solve.
> > So here it is; ANY idea on how we could possibly find a solution is
> > very much appreciated!
> >
> > Environment:
> > Kafka version: kafka_2.11-0.8.2.1
> > 5 Kafka brokers and 5 ZooKeeper nodes spread out over 5 hosts
> > Using new producer (async)
> >
> > Topic:
> > partitions=10
> > replication-factor=4
> > min.insync.replicas=2
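> >
> > (For reference, a sketch of how such a topic would be created; the
> > ZooKeeper host and topic name are placeholders:)
> >
> > kafka-topics.sh --create --zookeeper zk1:2181 --topic topic1 \
> >   --partitions 10 --replication-factor 4 \
> >   --config min.insync.replicas=2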
> >
> > Default property values are used for the broker configs and the producer.
> >
> > Scenario and problem:
> > Incoming Diameter data (10k TPS) is sent to 5 topics via 5 producers,
> > which works great until we start another 5 producers sending to another
> > 5 topics at the same rate (10k). What happens then is that the producers
> > sending to 2 of the topics fill up the buffer, and the throughput becomes
> > very low, with BufferExhaustedExceptions for most of the messages. When
> > checking the latency for the problematic topics, it becomes really high
> > (around 150 ms). When we stop the 5 producers that were started in the
> > second round, the latency goes down to about 1 ms again and the buffer
> > goes back to normal. The load is not that high, about 10 MB/s; it is not
> > even near disk bound.
> > So the questions right now are: why do we get such high latency for two
> > topics specifically when starting more producers, even though CPU and
> > disk load look unproblematic? And why two topics specifically; is there
> > an order in which topics are prioritized when things get clogged for
> > some reason?
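> >
> > For context, the failure shows up in our send path roughly like this (a
> > simplified sketch, not our actual code; the new producer throws
> > BufferExhaustedException when block.on.buffer.full=false and
> > buffer.memory is exhausted):
> >
> > try {
> >     producer.send(new ProducerRecord<byte[], byte[]>(topic, payload));
> > } catch (BufferExhaustedException e) {
> >     // send() fails immediately instead of blocking until buffer
> >     // space frees up; we count and drop the message here
> >     dropped.incrementAndGet(); // hypothetical counter
> > }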
> >
> > Sorry for the quite messy description; we are all kind of new to Kafka
> > here!
> >
> > BR
> > Andreas
> >
> > > On 28 Nov 2015, at 09:26, Prabhjot Bharaj <prabhbha...@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > This should help :)
> > >
> > > During my benchmarks, I noticed that if a 5-node Kafka cluster running
> > > 1 topic is given a continuous injection of 50GB in one shot (using a
> > > modified producer performance script, which writes my custom data to
> > > Kafka), the last replica can sometimes lag, and it would catch up at a
> > > speed of 1GB in 20-25 seconds. This lag increases if the producer
> > > performance script injects 200GB in one shot.
> > >
> > > I'm not sure how it will behave with multiple topics. It could have an
> > > impact on the overall throughput (because more partitions will be alive
> > > on the same broker, thereby dividing the network usage), but I have to
> > > test it in a staging environment.
> > >
> > > Regards,
> > > Prabhjot
> > >
> > > On Sat, Nov 28, 2015 at 12:10 PM, Gwen Shapira <g...@confluent.io> wrote:
> > >
> > >> Hi,
> > >>
> > >> min.insync.replicas is alive and well in 0.9 :)
> > >>
> > >> Normally, you will have 4 out of 4 replicas in sync. However, if one
> > >> of the replicas falls behind, you will have 3 out of 4 in sync.
> > >> If you set min.insync.replicas = 3, produce requests will fail if the
> > >> number of in-sync replicas falls below 3.
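> > >>
> > >> As an illustration, it is a topic-level setting (a command sketch;
> > >> the ZooKeeper host and topic name are placeholders):
> > >>
> > >> kafka-topics.sh --alter --zookeeper zk1:2181 --topic mytopic \
> > >>   --config min.insync.replicas=3
> > >>
> > >> Combined with acks=all on the producer, a produce request will then
> > >> fail with NotEnoughReplicas if fewer than 3 replicas are in sync.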
> > >>
> > >> I hope this helps.
> > >>
> > >> Gwen
> > >>
> > >> On Fri, Nov 27, 2015 at 9:43 PM, Prabhjot Bharaj <prabhbha...@gmail.com> wrote:
> > >>
> > >>> Hi Gwen,
> > >>>
> > >>> How about the min.insync.replicas property?
> > >>> Is it still valid in the new version, 0.9?
> > >>>
> > >>> We could get 3 out of 4 replicas in sync if we set its value to 3.
> > >>> Correct?
> > >>>
> > >>> Thanks,
> > >>> Prabhjot
> > >>> On Nov 28, 2015 10:20 AM, "Gwen Shapira" <g...@confluent.io> wrote:
> > >>>
> > >>>> In your scenario, you are receiving acks from 3 replicas while it is
> > >>>> possible to have 4 in the ISR. This means that one replica can be up
> > >>>> to 4000 messages (the replica.lag.max.messages default) behind the
> > >>>> others. If the leader crashes, there is a 33% chance this replica
> > >>>> will become the new leader, thereby losing up to 4000 messages.
> > >>>>
> > >>>> acks=all requires an ack from every replica as long as it is in the
> > >>>> ISR, protecting you from this scenario (but leading to high latency
> > >>>> if a replica is hanging and is just about to drop out of the ISR).
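> > >>>>
> > >>>> To make the contrast concrete (a sketch of the two producer settings
> > >>>> being compared, with replica.lag.max.messages at its default):
> > >>>>
> > >>>> # acks=3: waits for exactly 3 acks, even when 4 replicas are in the
> > >>>> # ISR, so the 4th may trail by up to replica.lag.max.messages=4000
> > >>>> acks=3
> > >>>>
> > >>>> # acks=all: waits for every replica currently in the ISR; with
> > >>>> # min.insync.replicas=3 it also fails fast if the ISR shrinks below 3
> > >>>> acks=all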
> > >>>>
> > >>>> Also, note that in newer versions acks > 1 has been deprecated, to
> > >>>> protect against such subtle mistakes.
> > >>>>
> > >>>> Gwen
> > >>>>
> > >>>> On Fri, Nov 27, 2015 at 12:28 AM, Andreas Flinck <andreas.fli...@digitalroute.com> wrote:
> > >>>>
> > >>>>> Hi all
> > >>>>>
> > >>>>> The reason why I need to know is that we have seen an issue when
> > >>>>> using acks=all, forcing us to quickly find an alternative. I'll
> > >>>>> leave the issue out of this post, but will probably come back to it!
> > >>>>>
> > >>>>> My question is about acks=all and the min.insync.replicas property.
> > >>>>> Since we have found a workaround for an issue by using acks>1
> > >>>>> instead of all (absolutely no clue why at this moment), I would like
> > >>>>> to know what benefit you get from e.g. acks=all and
> > >>>>> min.insync.replicas=3 instead of using acks=3 in a 5-broker cluster
> > >>>>> with a replication factor of 4. To my understanding, you would get
> > >>>>> the exact same level of durability and safety from either of those
> > >>>>> settings. However, I suspect this is not quite the case, since I
> > >>>>> have found hints, without proper explanation, that acks=all is
> > >>>>> preferred.
> > >>>>>
> > >>>>>
> > >>>>> Regards
> > >>>>> Andreas
> > >>>>
> > >>>
> > >>
> > >
> >
> >
>
>
> --
> ---------------------------------------------------------
> "There are only 10 types of people in the world: Those who understand
> binary, and those who don't"
