Bouncing the consumers should solve this issue in most cases.

Thanks,

Mayuresh

On Sun, Jul 12, 2015 at 8:21 PM, Jiangjie Qin <j...@linkedin.com.invalid> wrote:

> Hi Tao,
>
> We see this error from time to time but did not think of this as a big
> issue. Any reason it bothers you much?
> I'm not sure if throwing exception to user on this exception is a good
> handling or not. What are user supposed to do in that case other than
> retry?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On 7/12/15, 7:16 PM, "tao xiao" <xiaotao...@gmail.com> wrote:
>
> >We saw the error again in our cluster. Anyone has the same issue before?
> >
> >On Fri, 10 Jul 2015 at 13:26 tao xiao <xiaotao...@gmail.com> wrote:
> >
> >> Bump the thread. Any help would be appreciated.
> >>
> >> On Wed, 8 Jul 2015 at 20:09 tao xiao <xiaotao...@gmail.com> wrote:
> >>
> >>> Additional info
> >>> Kafka version: 0.8.2.1
> >>> zookeeper: 3.4.6
> >>>
> >>> On Wed, 8 Jul 2015 at 20:07 tao xiao <xiaotao...@gmail.com> wrote:
> >>>
> >>>> Hi team,
> >>>>
> >>>> I have 10 high level consumers connecting to Kafka and one of them
> >>>> kept complaining "conflicted ephemeral node" for about 8 hours. The
> >>>> log was filled with below exception
> >>>>
> >>>> [2015-07-07 14:03:51,615] INFO conflict in
> >>>> /consumers/group/ids/test-1435856975563-9a9fdc6c data:
> >>>> {"version":1,"subscription":{"test.*":1},"pattern":"white_list","timestamp":"1436275631510"}
> >>>> stored data:
> >>>> {"version":1,"subscription":{"test.*":1},"pattern":"white_list","timestamp":"1436275558570"}
> >>>> (kafka.utils.ZkUtils$)
> >>>> [2015-07-07 14:03:51,616] INFO I wrote this conflicted ephemeral node
> >>>> [{"version":1,"subscription":{"test.*":1},"pattern":"white_list","timestamp":"1436275631510"}]
> >>>> at /consumers/group/ids/test-1435856975563-9a9fdc6c a while back in a
> >>>> different session, hence I will backoff for this node to be deleted by
> >>>> Zookeeper and retry (kafka.utils.ZkUtils$)
> >>>>
> >>>> In the meantime zookeeper reported below exception for the same time
> >>>> span
> >>>>
> >>>> 2015-07-07 22:45:09,687 [myid:3] - INFO [ProcessThread(sid:3
> >>>> cport:-1)::PrepRequestProcessor@645] - Got user-level KeeperException
> >>>> when processing sessionid:0x44e657ff19c0019 type:create cxid:0x7a26
> >>>> zxid:0x3015f6e77 txntype:-1 reqpath:n/a Error
> >>>> Path:/consumers/group/ids/test-1435856975563-9a9fdc6c
> >>>> Error:KeeperErrorCode = NodeExists for
> >>>> /consumers/group/ids/test-1435856975563-9a9fdc6c
> >>>>
> >>>> At the end zookeeper timed out the session and consumers triggered
> >>>> rebalance.
> >>>>
> >>>> I know that conflicted ephemeral node warning is to handle a zookeeper
> >>>> bug that session expiration and ephemeral node deletion are not done
> >>>> atomically but as indicated from zookeeper log the zookeeper never got
> >>>> a chance to delete the ephemeral node which made me think that the
> >>>> session was not expired at that time. And for some reason zookeeper
> >>>> fired session expire event which subsequently invoked
> >>>> ZKSessionExpireListener. I was just wondering if anyone have ever
> >>>> encountered similar issue before and what I can do at zookeeper side
> >>>> to prevent this?
> >>>>
> >>>> Another problem is that createEphemeralPathExpectConflictHandleZKBug
> >>>> call is wrapped in a while(true) loop which runs forever until the
> >>>> ephemeral node is created. Would it be better that we can employ an
> >>>> exponential retry policy with a max number of retries so that it has a
> >>>> chance to re-throw the exception back to caller and let caller handle
> >>>> it in situation like above?
> >>>>

--
-Regards,
Mayuresh R. Gharat
(862) 250-7125
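
For reference, the bounded retry Tao proposes in the quoted message (exponential backoff plus a maximum number of retries, then re-throw to the caller) could look roughly like the sketch below. This is only an illustration of the idea, not Kafka's code: `BoundedRetry`, `ConflictException`, `retryWithBackoff` and all of its parameters are hypothetical names made up for this example, and the operation passed in merely stands in for the ephemeral-node creation attempt.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of Kafka or ZooKeeper.
final class BoundedRetry {

    // Stand-in for "the ephemeral node from an old session still exists".
    public static final class ConflictException extends RuntimeException {
        public ConflictException(String message) {
            super(message);
        }
    }

    // Runs op; on ConflictException waits and tries again, doubling the wait
    // each time up to maxBackoffMs. After maxRetries retries the last
    // exception is re-thrown so the caller can decide what to do.
    public static <T> T retryWithBackoff(Supplier<T> op, int maxRetries,
                                         long baseBackoffMs, long maxBackoffMs) {
        long backoffMs = baseBackoffMs;
        for (int attempt = 0; ; attempt++) {
            try {
                return op.get();
            } catch (ConflictException e) {
                if (attempt >= maxRetries) {
                    throw e; // give up instead of looping forever
                }
                try {
                    Thread.sleep(backoffMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while backing off", ie);
                }
                backoffMs = Math.min(backoffMs * 2, maxBackoffMs);
            }
        }
    }
}
```

With, say, `maxRetries = 5` and `baseBackoffMs = 1000`, the operation would be attempted at most six times with waits of 1, 2, 4, 8 and 16 seconds (subject to the cap). Whether the caller should then retry, re-register, or fail is the open question Becket raises above.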