So I think I got to the root of the problem. Thanks for pointing me in the
direction of zookeeper data conflicts.
I turned the log level up to INFO and captured a bunch of conflict messages
from the zookeeper client.
I did an "rmr" on the consumers/ ZooKeeper node to clear out any lingering
data.
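For anyone following along, the "rmr" step above can be done from the ZooKeeper CLI. This is an operational sketch, not an exact transcript of what was run; the server address is a placeholder, and the group name is taken from the error log later in this thread:

```shell
# Connect with the CLI shipped with ZooKeeper
bin/zkCli.sh -server localhost:2181

# Inside the CLI: recursively delete the consumer group's lingering
# ownership/offset nodes. Adjust the path for your own group.
rmr /consumers/bridgeTopology

# Note: on newer ZooKeeper CLI versions, `rmr` was replaced by `deleteall`.
```

Be aware this also wipes committed offsets for the group, so consumers will restart from their configured auto.offset.reset position.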
Could you find some entries in the log with the key word "conflict"? If yes
could you paste them here?
Guozhang
On Mon, Nov 18, 2013 at 2:56 PM, Drew Goya wrote:
Also of note, this is all running from within a storm topology, when I kill
a JVM, another is started very quickly.
Could this be a problem with a consumer leaving and rejoining within a
small window?
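The race described above, where a replacement consumer rejoins before the dead consumer's ephemeral ownership nodes have expired, can be sketched with a toy model. This is purely illustrative Python, not Kafka code; the class, names, and timings are invented:

```python
# Toy model of ZooKeeper-style ephemeral partition ownership.
# A dead consumer's claim lingers until its session times out.
class ToyRegistry:
    def __init__(self, session_timeout):
        self.session_timeout = session_timeout
        self.owners = {}  # partition -> (owner, time of last heartbeat)

    def claim(self, partition, owner, now):
        current = self.owners.get(partition)
        if current is not None:
            old_owner, last_seen = current
            if now - last_seen < self.session_timeout:
                return False  # stale ephemeral node still present: conflict
        self.owners[partition] = (owner, now)
        return True


def rebalance_with_retries(registry, partition, owner, start, max_retries, backoff):
    """Retry a claim with backoff, as the high-level consumer's
    rebalance loop does (conceptually)."""
    now = start
    for attempt in range(max_retries):
        if registry.claim(partition, owner, now):
            return attempt  # succeeded on this attempt
        now += backoff  # back off and try again
    return None  # rebalance failed after max_retries


registry = ToyRegistry(session_timeout=6.0)
# The old consumer claimed partition 0, then its JVM was killed at t=0.
registry.claim(0, "consumer-old", now=0.0)

# A replacement starts almost immediately (t=0.5) and retries every
# 2 seconds. It conflicts until the stale claim ages out at t=6.
attempt = rebalance_with_retries(registry, 0, "consumer-new",
                                 start=0.5, max_retries=4, backoff=2.0)
```

If the retry budget is exhausted before the old session times out, the rebalance fails outright, which matches the exception below. In the real 0.8 consumer the analogous knobs are rebalance.max.retries, rebalance.backoff.ms, and zookeeper.session.timeout.ms.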
On Mon, Nov 18, 2013 at 2:52 PM, Drew Goya wrote:
Hey Guozhang, I just forced the error by killing one of my consumer JVMs
and I am getting a consumer rebalance failure:
2013-11-18 22:46:54 k.c.ZookeeperConsumerConnector [ERROR]
[bridgeTopology_host-1384493092466-7099d843], error during syncedRebalance
kafka.common.ConsumerRebalanceFailedException
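A common mitigation for this failure in the 0.8-era high-level consumer is to give the rebalance loop enough headroom to outlast the dead consumer's ZooKeeper session. The values below are an illustrative consumer.properties fragment, not tuned recommendations:

```properties
# Number of times the consumer retries a rebalance before giving up
# (the 0.8 consumer's default is 4).
rebalance.max.retries=10

# Backoff between rebalance attempts. retries * backoff should
# comfortably exceed zookeeper.session.timeout.ms so the dead
# consumer's ephemeral nodes can expire before retries run out.
rebalance.backoff.ms=2000
zookeeper.session.timeout.ms=6000
```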
Hello Drew,
Do you see any rebalance failure exceptions in the consumer log?
Guozhang
On Mon, Nov 18, 2013 at 2:14 PM, Drew Goya wrote:
So I've run into a problem where occasionally, some partitions within a
topic end up in a "none" owner state for a long time.
I'm using the high-level consumer on several machines, each consumer has 4
threads.
Normally when I run the ConsumerOffsetChecker, all partitions have owners
and similar l
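For reference, the owner/lag view mentioned above comes from the offset checker tool shipped with Kafka 0.8. An invocation sketch; the ZooKeeper address is a placeholder and the group name is taken from the error log earlier in the thread:

```shell
# Prints partition -> owner, current offset, log end offset, and lag
# for every partition consumed by the group.
bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker \
  --zookeeper localhost:2181 \
  --group bridgeTopology
```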