Chen,

Consumers lag either because of an I/O or network bottleneck, or because the application is processing messages slowly. To confirm that you are not hitting the latter, you can run a console consumer on the same data and observe the throughput it achieves and its lag.
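For example, something along these lines should give you a baseline (the ZooKeeper string below is the one from your client config; the topic name is a placeholder, and piping through pv is just one way to eyeball the rate):

  # consume the same topic with the stock console consumer and watch the rate
  bin/kafka-console-consumer.sh --zookeeper dare-broker00.sv.walmartlabs.com:2181 \
    --topic <your-topic> | pv -l -r > /dev/null

  # see how far behind the spout's consumer group is
  bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker \
    --zookeeper dare-broker00.sv.walmartlabs.com:2181 --group spout_readonly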
Thanks,
Neha

On Wed, Nov 5, 2014 at 3:31 PM, Chen Wang <chen.apache.s...@gmail.com> wrote:
> Guozhang,
> I can see messages keep coming, meaning messages are being consumed, right?
> But the lag is pretty huge (on average 30m messages behind), as you can see
> from the graph:
>
> https://www.dropbox.com/s/xli25zicxv5f2qa/Screenshot%202014-11-05%2015.23.05.png?dl=0
>
> My understanding is that for such a lightweight thread, the consumer should
> be at almost the same pace as the producer. I also checked the machine
> metrics, and nothing is pegged there.
>
> I am also moving the testing application to a separate dev cluster. In your
> experience, what things might cause the slow reading? Is this more of a
> server-side thing, or a consumer-side one?
>
> Chen
>
> On Wed, Nov 5, 2014 at 3:10 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>
> > Chen,
> >
> > Your configs seem fine.
> >
> > Could you use the ConsumerOffsetChecker tool to see if the offset is
> > advancing at all (i.e. messages are consumed), and if yes, get some
> > thread dumps and check whether your consumer is blocked on some locks?
> >
> > Guozhang
> >
> > On Wed, Nov 5, 2014 at 2:01 PM, Chen Wang <chen.apache.s...@gmail.com>
> > wrote:
> >
> > > Hey Guys,
> > > I have a really simple storm topology with a kafka spout, reading from
> > > kafka through the high level consumer. Since the topic has 30 partitions,
> > > we have 30 threads in the spout reading from it. However, the lag keeps
> > > increasing even though the threads only read the messages and do nothing.
> > > The largest messages are around 30KB, and the incoming rate can be as
> > > high as 14k/second. There are 3 brokers on some high-spec bare metal
> > > machines. The client side config is like this:
> > >
> > > kafka.config.fetch.message.max.bytes 3145728
> > > kafka.config.group.id spout_readonly
> > > kafka.config.rebalance.backoff.ms 6000
> > > kafka.config.rebalance.max.retries 6
> > > kafka.config.zookeeper.connect dare-broker00.sv.walmartlabs.com:2181,
> > > dare-broker01.sv.walmartlabs.com:2181,
> > > dare-broker02.sv.walmartlabs.com:2181
> > > kafka.config.zookeeper.session.timeout.ms 60000
> > >
> > > What could possibly cause this huge lag? Could the broker be a bottleneck,
> > > or does some config need to be adjusted? The server side config is like
> > > this:
> > >
> > > replica.fetch.max.bytes=2097152
> > > message.max.bytes=2097152
> > > num.network.threads=4
> > > num.io.threads=4
> > >
> > > # The send buffer (SO_SNDBUF) used by the socket server
> > > socket.send.buffer.bytes=4194304
> > >
> > > # The receive buffer (SO_RCVBUF) used by the socket server
> > > socket.receive.buffer.bytes=2097152
> > >
> > > # The maximum size of a request that the socket server will accept
> > > # (protection against OOM)
> > > socket.request.max.bytes=104857600
> > >
> > > Any help appreciated!
> > > Chen
> > >
> >
> > --
> > -- Guozhang
> >
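For reference, a bare-bones "read and discard" consumer along the lines Chen describes, written against the 0.8 high-level consumer API, might look roughly like the sketch below. The ZooKeeper string, fetch size and session timeout are taken from the client config in the thread; the topic name and group id are placeholders, and a single stream is used here instead of the 30 spout threads.

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class ReadAndDiscard {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect",
                "dare-broker00.sv.walmartlabs.com:2181,"
              + "dare-broker01.sv.walmartlabs.com:2181,"
              + "dare-broker02.sv.walmartlabs.com:2181");
        props.put("group.id", "lag_test");                 // placeholder group, kept separate from spout_readonly
        props.put("fetch.message.max.bytes", "3145728");
        props.put("zookeeper.session.timeout.ms", "60000");

        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        String topic = "your-topic";                       // placeholder topic name
        // one stream is enough for a single-threaded throughput baseline
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                connector.createMessageStreams(Collections.singletonMap(topic, 1));
        ConsumerIterator<byte[], byte[]> it = streams.get(topic).get(0).iterator();

        long count = 0;
        long start = System.currentTimeMillis();
        while (it.hasNext()) {
            it.next();                                      // read the message and throw it away
            if (++count % 100000 == 0) {
                double secs = (System.currentTimeMillis() - start) / 1000.0;
                System.out.printf("%d messages, %.0f msg/s%n", count, count / secs);
            }
        }
    }
}

If a consumer like this keeps up with the producers while the spout's group keeps falling behind, the problem is on the consuming side (spout threading, downstream processing, or lock contention as Guozhang suggests) rather than on the brokers.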