> - Decreased num.partitions and log.flush.interval on the brokers from
>   64/10k to 32/100 in order to lower the average flush time (we were
>   previously always hitting the default flush interval since no partitions
Hmm, that is a pretty low value for the flush interval, leading to higher
disk usage. Do you use dedicated disks for the Kafka data logs? Also, what
sort of disks do you use?

Thanks,
Neha

> On Tue, Apr 23, 2013 at 7:53 AM, Jun Rao <jun...@gmail.com> wrote:
> >
> > You can run kafka.tools.ConsumerOffsetChecker to check the consumer
> > lag. If the consumer is lagging, this indicates a problem on the
> > consumer side.
> >
> > Thanks,
> >
> > Jun
> >
> > On Mon, Apr 22, 2013 at 9:13 PM, Andrew Neilson <arsneil...@gmail.com> wrote:
> > >
> > > Hmm, it is highly unlikely that that is the culprit... there is lots
> > > of bandwidth available for me to use. I will definitely keep that in
> > > mind, though. I was working on this today and have some tidbits of
> > > additional information and thoughts that you might be able to shed
> > > some light on:
> > >
> > > - I mentioned I have 2 consumers, but each consumer is running with 8
> > >   threads for this topic (and each consumer has 8 cores available).
> > > - When I initially asked for help, the brokers were configured with
> > >   num.partitions=1. I've since tried higher numbers (3, 64) and
> > >   haven't seen much of an improvement, aside from forcing both
> > >   consumer apps to handle messages (with the overall performance not
> > >   changing much).
> > > - I ran into this article
> > >   http://riccomini.name/posts/kafka/2012-10-05-kafka-consumer-memory-tuning/
> > >   and tried a variety of options for queuedchunks.max and fetch.size
> > >   with no significant results (meaning it did not achieve the goal of
> > >   constantly processing hundreds or thousands of messages per second,
> > >   which is similar to the rate of input).
> > >   I would not be surprised if I'm wrong, but this made me start to
> > >   think that the problem may lie outside of the consumers.
> > > - Would the combination of a high number of partitions (64) and a
> > >   high log.flush.interval (10k) prevent logs from flushing as often
> > >   as they need to for my desired rate of consumption (even with
> > >   log.default.flush.interval.ms=1000)?
> > >
> > > Despite the changes I mentioned, the behaviour is still the consumers
> > > receiving large spikes of messages mixed with periods of complete
> > > inactivity, and overall a long delay between messages being written
> > > and messages being read (about 2 minutes). Anyway... as always, I
> > > greatly appreciate any help.
> > >
> > > On Sun, Apr 21, 2013 at 8:50 PM, Jun Rao <jun...@gmail.com> wrote:
> > > >
> > > > Is your network shared? If so, another possibility is that some
> > > > other apps are consuming the bandwidth.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Sun, Apr 21, 2013 at 12:23 PM, Andrew Neilson <arsneil...@gmail.com> wrote:
> > > > >
> > > > > Thanks very much for the reply, Neha! So I swapped out the
> > > > > consumer that processes the messages with one that just prints
> > > > > them. It does indeed achieve a much better rate at peaks, but can
> > > > > still nearly zero out (if not completely zero out). I plotted the
> > > > > messages printed in Graphite to show the behaviour I'm seeing
> > > > > (this is messages printed per second):
> > > > >
> > > > > https://www.dropbox.com/s/7u7uyrefw6inetu/Screen%20Shot%202013-04-21%20at%2011.44.38%20AM.png
> > > > >
> > > > > The peaks are over ten thousand per second, and the troughs can
> > > > > go below 10 per second just prior to another peak.
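On the flush question quoted above, a little arithmetic suggests the count-based threshold can never fire: with log.flush.interval=10000 messages per log and the input rate spread over 64 partitions, each partition accumulates messages far too slowly to reach 10k before the 1000 ms time-based flush kicks in. A sketch of that reasoning (the ~400 msgs/sec figure is the upper rate quoted in the thread; even spread across partitions is an assumption):

```python
# Sketch: which flush trigger fires first under the settings discussed?
# Assumes ~400 msgs/sec topic-wide (upper figure from the thread) spread
# evenly over 64 partitions.

msgs_per_sec = 400
partitions = 64
count_threshold = 10_000      # log.flush.interval (messages, per log)
time_threshold_s = 1.0        # log.default.flush.interval.ms = 1000

per_partition_rate = msgs_per_sec / partitions             # 6.25 msgs/sec
secs_to_hit_count = count_threshold / per_partition_rate   # 1600 s

print(f"seconds to reach the count threshold: {secs_to_hit_count:.0f}")
print(f"time-based flush fires every {time_threshold_s:.0f} s")
# The 1 s time-based flush fires ~1600x sooner, so the count threshold is
# effectively dead weight -- consistent with "always hitting the default
# flush interval" at the top of the thread.
```

So the 64-partition/10k-message combination should not by itself delay flushes; the time-based interval dominates either way.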
> > > > > I know that there are plenty of messages available, because the
> > > > > ones currently being processed are still from Friday afternoon,
> > > > > so this may or may not have something to do with this pattern.
> > > > >
> > > > > Is there anything I can do to avoid the periods of lower
> > > > > performance? Ideally I would be processing messages as soon as
> > > > > they are written.
> > > > >
> > > > > On Sun, Apr 21, 2013 at 8:49 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> > > > > >
> > > > > > Some of the reasons a consumer is slow are:
> > > > > > 1. Small fetch size
> > > > > > 2. Expensive message processing
> > > > > >
> > > > > > Are you processing the received messages in the consumer? Have
> > > > > > you tried running the console consumer for this topic to see
> > > > > > how it performs?
> > > > > >
> > > > > > Thanks,
> > > > > > Neha
> > > > > >
> > > > > > On Sun, Apr 21, 2013 at 1:59 AM, Andrew Neilson <arsneil...@gmail.com> wrote:
> > > > > > >
> > > > > > > I am currently running a deployment with 3 brokers, 3 ZK, 3
> > > > > > > producers, 2 consumers, and 15 topics. I should first point
> > > > > > > out that this is my first project using Kafka ;). The issue
> > > > > > > I'm seeing is that the consumers are only processing about 15
> > > > > > > messages per second from what should be the largest topic
> > > > > > > they are consuming (we're sending 200-400 ~300-byte messages
> > > > > > > per second to this topic). I should note that I'm using a
> > > > > > > high-level ZK consumer and ZK 3.4.3.
> > > > > > >
> > > > > > > I have a strong feeling I have not configured things properly,
> > > > > > > so I could definitely use some guidance.
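The rates quoted above already explain why the consumers are still working through Friday's messages: consuming ~15 msgs/sec from a topic receiving 200-400 msgs/sec means the backlog grows continuously. A quick sketch of the deficit, assuming the midpoint of the quoted input range:

```python
# Sketch: backlog growth from the rates quoted in this thread.
# Assumes ~300 msgs/sec produced (midpoint of 200-400) vs the ~15 msgs/sec
# observed at the consumers.

in_rate = 300    # msgs/sec produced to the topic (assumed midpoint)
out_rate = 15    # msgs/sec observed at the consumers

deficit = in_rate - out_rate
print(f"backlog grows by {deficit} msgs/sec, "
      f"~{deficit * 3600:,} msgs/hour")
```

At roughly a million messages per hour of growth, any fixed write-to-read delay will keep stretching until consumption at least matches production.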
> > > > > > > Here is my broker configuration:
> > > > > > >
> > > > > > > brokerid=1
> > > > > > > port=9092
> > > > > > > socket.send.buffer=1048576
> > > > > > > socket.receive.buffer=1048576
> > > > > > > max.socket.request.bytes=104857600
> > > > > > > log.dir=/home/kafka/data
> > > > > > > num.partitions=1
> > > > > > > log.flush.interval=10000
> > > > > > > log.default.flush.interval.ms=1000
> > > > > > > log.default.flush.scheduler.interval.ms=1000
> > > > > > > log.retention.hours=168
> > > > > > > log.file.size=536870912
> > > > > > > enable.zookeeper=true
> > > > > > > zk.connect=XXX
> > > > > > > zk.connectiontimeout.ms=1000000
> > > > > > >
> > > > > > > Here is my producer config:
> > > > > > >
> > > > > > > zk.connect=XXX
> > > > > > > producer.type=async
> > > > > > > compression.codec=0
> > > > > > >
> > > > > > > Here is my consumer config:
> > > > > > >
> > > > > > > zk.connect=XXX
> > > > > > > zk.connectiontimeout.ms=100000
> > > > > > > groupid=XXX
> > > > > > > autooffset.reset=smallest
> > > > > > > socket.buffersize=1048576
> > > > > > > fetch.size=10485760
> > > > > > > queuedchunks.max=10000
> > > > > > >
> > > > > > > Thanks for any assistance you can provide,
> > > > > > >
> > > > > > > Andrew
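Two sanity checks fall out of the configs quoted above. First, the high-level consumer assigns at most one thread per partition, so with num.partitions=1 per broker across 3 brokers the topic has only 3 partitions, capping the 2 consumers x 8 threads mentioned later in the thread. Second, per the memory-tuning article linked in the thread, each fetch queue can buffer up to roughly queuedchunks.max x fetch.size bytes. A sketch (both rules of thumb are assumptions about the 0.7-era consumer, not taken from this thread):

```python
# Sketch: two sanity checks on the configs quoted in this thread.

# 1) Consumer parallelism is capped by partition count: the high-level
#    consumer uses at most one thread per partition.
brokers = 3
partitions_per_broker = 1            # num.partitions
consumer_threads = 2 * 8             # 2 consumer processes x 8 threads

total_partitions = brokers * partitions_per_broker
active_threads = min(consumer_threads, total_partitions)
print(f"threads doing work: {active_threads} of {consumer_threads}")

# 2) Worst-case fetch-queue memory: each queued chunk can be up to
#    fetch.size bytes, and up to queuedchunks.max chunks may be buffered.
fetch_size = 10_485_760              # fetch.size (10 MB)
queued_chunks_max = 10_000           # queuedchunks.max

max_queue_bytes = fetch_size * queued_chunks_max
print(f"worst-case buffered bytes per queue: ~{max_queue_bytes / 2**30:.0f} GiB")
```

If those figures hold, 13 of the 16 consumer threads sit idle with the original num.partitions=1 setting, and the ~98 GiB fetch-queue bound is far beyond any reasonable heap, so lowering queuedchunks.max (or fetch.size) is worth revisiting alongside the partition count.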