Hey all, I've recently been having problems with consumer group rebalancing. I'm using several high-level consumers that all belong to the same group. Occasionally one or two of them get stuck in a rebalance loop: they attempt to rebalance, but the partitions they try to claim are already owned by another consumer. Has anyone run into this? Any ideas?
I see errors in my zookeeper logs like:
2013-12-17 17:12:31,171 [myid:001] - INFO [ProcessThread(sid:1 cport:-1)::PrepRequestProcessor@627] - Got user-level KeeperException when processing sessionid:0x342e4febc180852 type:create cxid:0x1a9a zxid:0x501390d4b txntype:-1 reqpath:n/a Error Path:/kafka/consumers/trackingGroup/owners/Events2/25 Error:KeeperErrorCode = NodeExists for /kafka/consumers/trackingGroup/owners/Events2/25

And errors in my kafka logs like:
2013-12-17 17:20:32 ZookeeperConsumerConnector [INFO] [trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306], begin rebalancing consumer trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306 try #8
2013-12-17 17:20:33 ConsumerFetcherManager [INFO] [ConsumerFetcherManager-1387249530381] Stopping leader finder thread
2013-12-17 17:20:33 ConsumerFetcherManager [INFO] [ConsumerFetcherManager-1387249530381] Stopping all fetchers
2013-12-17 17:20:33 ConsumerFetcherManager [INFO] [ConsumerFetcherManager-1387249530381] All connections stopped
2013-12-17 17:20:33 ZookeeperConsumerConnector [INFO] [trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306], Cleared all relevant queues for this fetcher
2013-12-17 17:20:33 ZookeeperConsumerConnector [INFO] [trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306], Cleared the data chunks in all the consumer message iterators
2013-12-17 17:20:33 ZookeeperConsumerConnector [INFO] [trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306], Committing all offsets after clearing the fetcher queues
2013-12-17 17:20:33 ZookeeperConsumerConnector [INFO] [trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306], Releasing partition ownership
2013-12-17 17:20:33 ZookeeperConsumerConnector [INFO] [trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306], Consumer trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306 rebalancing the following partitions: List(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127) for topic Events2 with consumers: List(trackingGroup_prod-storm-sup-trk001-1387249529775-2a8484f1-0, trackingGroup_prod-storm-sup-trk001-1387249529775-2a8484f1-1, trackingGroup_prod-storm-sup-trk002-1387249530831-97c586ab-0, trackingGroup_prod-storm-sup-trk002-1387249530831-97c586ab-1, trackingGroup_prod-storm-sup-trk003-1387249529739-f2de3dd9-0, trackingGroup_prod-storm-sup-trk003-1387249529739-f2de3dd9-1, trackingGroup_prod-storm-sup-trk004-1387249530445-8f57ec5c-0, trackingGroup_prod-storm-sup-trk004-1387249530445-8f57ec5c-1, trackingGroup_prod-storm-sup-trk005-1387249530451-d59c669a-0, trackingGroup_prod-storm-sup-trk005-1387249530451-d59c669a-1, trackingGroup_prod-storm-sup-trk005-1387249530452-2b244683-0, trackingGroup_prod-storm-sup-trk005-1387249530452-2b244683-1, trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306-0, trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306-1, trackingGroup_prod-storm-sup-trk007-1387249529436-fb79e4c8-0, trackingGroup_prod-storm-sup-trk007-1387249529436-fb79e4c8-1, trackingGroup_prod-storm-sup-trk008-1387249526700-11ba655b-0, trackingGroup_prod-storm-sup-trk008-1387249526700-11ba655b-1, trackingGroup_prod-storm-sup-trk009-1387249530020-cb36831c-0, trackingGroup_prod-storm-sup-trk009-1387249530020-cb36831c-1, trackingGroup_prod-storm-sup-trk010-1387249529975-d43aff06-0, trackingGroup_prod-storm-sup-trk010-1387249529975-d43aff06-1, trackingGroup_prod-storm-sup-trk011-1387249527684-479a04f9-0, trackingGroup_prod-storm-sup-trk011-1387249527684-479a04f9-1, 
trackingGroup_prod-storm-sup-trk012-1387249530208-155ecd68-0, trackingGroup_prod-storm-sup-trk012-1387249530208-155ecd68-1, trackingGroup_prod-storm-sup-trk013-1387249530700-b323ee53-0, trackingGroup_prod-storm-sup-trk013-1387249530700-b323ee53-1, trackingGroup_prod-storm-sup-trk014-1387249529916-e32e6363-0, trackingGroup_prod-storm-sup-trk014-1387249529916-e32e6363-1, trackingGroup_prod-storm-sup-trk015-1387249529709-d655ccd4-0, trackingGroup_prod-storm-sup-trk015-1387249529709-d655ccd4-1, trackingGroup_prod-storm-sup-trk016-1387249531064-bc8f8f3e-0, trackingGroup_prod-storm-sup-trk016-1387249531064-bc8f8f3e-1, trackingGroup_prod-storm-sup-trk017-1387249530635-35f505b7-0, trackingGroup_prod-storm-sup-trk017-1387249530635-35f505b7-1, trackingGroup_prod-storm-sup-trk018-1387249530621-84327f5f-0, trackingGroup_prod-storm-sup-trk018-1387249530621-84327f5f-1, trackingGroup_prod-storm-sup-trk019-1387249530418-80afccf9-0, trackingGroup_prod-storm-sup-trk019-1387249530418-80afccf9-1, trackingGroup_prod-storm-sup-trk020-1387249530930-906e99e1-0, trackingGroup_prod-storm-sup-trk020-1387249530930-906e99e1-1, trackingGroup_prod-storm-sup-trk021-1387249529761-705a5bca-0, trackingGroup_prod-storm-sup-trk021-1387249529761-705a5bca-1, trackingGroup_prod-storm-sup-trk022-1387249530347-3d40b4f9-0, trackingGroup_prod-storm-sup-trk022-1387249530347-3d40b4f9-1, trackingGroup_prod-storm-sup-trk023-1387249529067-957d280b-0, trackingGroup_prod-storm-sup-trk023-1387249529067-957d280b-1, trackingGroup_prod-storm-sup-trk024-1387249530625-f8118f02-0, trackingGroup_prod-storm-sup-trk024-1387249530625-f8118f02-1, trackingGroup_prod-storm-sup-trk025-1387249530213-cccfffc8-0, trackingGroup_prod-storm-sup-trk025-1387249530213-cccfffc8-1, trackingGroup_prod-storm-sup-trk046-1387249527798-2164c569-0, trackingGroup_prod-storm-sup-trk046-1387249527798-2164c569-1, trackingGroup_prod-storm-sup-trk047-1387249530559-6b49ce74-0, trackingGroup_prod-storm-sup-trk047-1387249530559-6b49ce74-1, 
trackingGroup_prod-storm-sup-trk048-1387249529976-aba1e428-0, trackingGroup_prod-storm-sup-trk048-1387249529976-aba1e428-1, trackingGroup_prod-storm-sup-trk050-1387249530465-dc203a62-0, trackingGroup_prod-storm-sup-trk050-1387249530465-dc203a62-1, trackingGroup_prod-storm-sup-trk051-1387249530406-46f7a649-0, trackingGroup_prod-storm-sup-trk051-1387249530406-46f7a649-1, trackingGroup_prod-storm-sup-trk052-1387249530423-e06e4210-0, trackingGroup_prod-storm-sup-trk052-1387249530423-e06e4210-1, trackingGroup_prod-storm-sup-trk054-1387249530369-68e494e6-0, trackingGroup_prod-storm-sup-trk054-1387249530369-68e494e6-1, trackingGroup_prod-storm-sup-trk055-1387249529961-bec0abbc-0, trackingGroup_prod-storm-sup-trk055-1387249529961-bec0abbc-1, trackingGroup_prod-storm-sup-trk056-1387249531590-957c0b49-0, trackingGroup_prod-storm-sup-trk056-1387249531590-957c0b49-1, trackingGroup_prod-storm-sup-trk057-1387249530341-d8476874-0, trackingGroup_prod-storm-sup-trk057-1387249530341-d8476874-1, trackingGroup_prod-storm-sup-trk058-1387249530730-20554b4d-0, trackingGroup_prod-storm-sup-trk058-1387249530730-20554b4d-1)
2013-12-17 17:20:33 ZookeeperConsumerConnector [INFO] [trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306], trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306-0 attempting to claim partition 24
2013-12-17 17:20:33 ZookeeperConsumerConnector [INFO] [trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306], trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306-0 attempting to claim partition 25
2013-12-17 17:20:33 ZookeeperConsumerConnector [INFO] [trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306], trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306-1 attempting to claim partition 26
2013-12-17 17:20:33 ZookeeperConsumerConnector [INFO] [trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306], trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306-1 attempting to claim partition 27
2013-12-17 17:20:33 ZkUtils$ [INFO] conflict in /consumers/trackingGroup/owners/Events2/25 data: trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306-0 stored data: trackingGroup_prod-storm-sup-trk007-1387249529436-fb79e4c8-0
2013-12-17 17:20:33 ZookeeperConsumerConnector [INFO] [trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306], waiting for the partition ownership to be deleted: 25
2013-12-17 17:20:33 ZkUtils$ [INFO] conflict in /consumers/trackingGroup/owners/Events2/26 data: trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306-1 stored data: trackingGroup_prod-storm-sup-trk007-1387249529436-fb79e4c8-1
2013-12-17 17:20:33 ZookeeperConsumerConnector [INFO] [trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306], waiting for the partition ownership to be deleted: 26
2013-12-17 17:20:33 ZkUtils$ [INFO] conflict in /consumers/trackingGroup/owners/Events2/24 data: trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306-0 stored data: trackingGroup_prod-storm-sup-trk007-1387249529436-fb79e4c8-0
2013-12-17 17:20:33 ZookeeperConsumerConnector [INFO] [trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306], waiting for the partition ownership to be deleted: 24
2013-12-17 17:20:33 ZkUtils$ [INFO] conflict in /consumers/trackingGroup/owners/Events2/27 data: trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306-1 stored data: trackingGroup_prod-storm-sup-trk007-1387249529436-fb79e4c8-1
2013-12-17 17:20:33 ZookeeperConsumerConnector [INFO] [trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306], waiting for the partition ownership to be deleted: 27
2013-12-17 17:20:33 ZookeeperConsumerConnector [INFO] [trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306], end rebalancing consumer trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306 try #8
2013-12-17 17:20:33 ZookeeperConsumerConnector [INFO] [trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306], Rebalancing attempt failed. Clearing the cache before the next rebalancing operation is triggered
2013-12-17 17:20:33 ConsumerFetcherManager [INFO] [ConsumerFetcherManager-1387249530381] Stopping leader finder thread
2013-12-17 17:20:33 ConsumerFetcherManager [INFO] [ConsumerFetcherManager-1387249530381] Stopping all fetchers
2013-12-17 17:20:33 ConsumerFetcherManager [INFO] [ConsumerFetcherManager-1387249530381] All connections stopped
2013-12-17 17:20:33 ZookeeperConsumerConnector [INFO] [trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306], Cleared all relevant queues for this fetcher
2013-12-17 17:20:33 ZookeeperConsumerConnector [INFO] [trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306], Cleared the data chunks in all the consumer message iterators
2013-12-17 17:20:33 ZookeeperConsumerConnector [INFO] [trackingGroup_prod-storm-sup-trk006-1387249530327-9d15c306], Committing all offsets after clearing the fetcher queues
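
As a sanity check on the log above, here is a minimal sketch of the range-style arithmetic that would produce the claims for partitions 24 to 27. It is an assumption about how the high-level consumer splits partitions, not a copy of the Kafka code. The function name `range_assign` is hypothetical. The inputs come from the log: 128 partitions, 74 consumer threads in the sorted list, and the two trk006 threads at positions 12 and 13 (counting from 0).

```python
def range_assign(num_partitions, num_threads, position):
    """Partitions claimed by the thread at `position` in the sorted thread list.

    Assumed rule: every thread gets floor(P / T) partitions, and the first
    (P mod T) threads get one extra, in contiguous ranges.
    """
    per_thread, extra = divmod(num_partitions, num_threads)
    start = per_thread * position + min(position, extra)
    count = per_thread + (1 if position < extra else 0)
    return list(range(start, start + count))


NUM_PARTITIONS = 128  # List(0, ..., 127) in the log
NUM_THREADS = 74      # consumer threads listed in the log

# trk006 thread 0 sits at position 12, thread 1 at position 13.
print(range_assign(NUM_PARTITIONS, NUM_THREADS, 12))  # [24, 25]
print(range_assign(NUM_PARTITIONS, NUM_THREADS, 13))  # [26, 27]
```

Under this sketch the claims by trk006 are consistent with the consumer list it sees. The same ranges would go to the two trk007 threads if they were at positions 12 and 13, for example in a view of the group with one consumer instance ahead of trk007 missing. That is only a guess at why trk007 still holds those owner nodes.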