Hi, I am seeing consumer rebalances very frequently and getting a socket reconnect exception. The log is given below for more insight.

[2014-04-18 16:02:52.061][kafka.coms.consumer.kafka_topic_coms_esb_prod_coms.coms-timemachine.coms.coms04.snapdeal.com_coms04.snapdeal.com-1397812122323-509d9663_watcher_executor][INFO][SyncProducer:67] Disconnecting from kafka_leader_coms07.snapdeal.com:9092
[2014-04-18 16:02:52.103][kafka.coms.consumer.kafka_topic_coms_esb_prod_coms.coms-timemachine.coms.coms04.snapdeal.com_coms04.snapdeal.com-1397812122323-509d9663_watcher_executor][INFO][ConsumerFetcherManager:67] [ConsumerFetcherManager-1397812122507] Stopping leader finder thread
[2014-04-18 16:02:52.104][kafka.coms.consumer.kafka_topic_coms_esb_prod_coms.coms-timemachine.coms.coms04.snapdeal.com_coms04.snapdeal.com-1397812122323-509d9663_watcher_executor][INFO][ConsumerFetcherManager$LeaderFinderThread:67] [kafka.coms.consumer.kafka_topic_coms_esb_prod_coms.coms-timemachine.coms.coms04.snapdeal.com_coms04.snapdeal.com-1397812122323-509d9663-leader-finder-thread], Shutting down
[2014-04-18 16:02:52.105][kafka.coms.consumer.kafka_topic_coms_esb_prod_coms.coms-timemachine.coms.coms04.snapdeal.com_coms04.snapdeal.com-1397812122323-509d9663_watcher_executor][INFO][ConsumerFetcherManager$LeaderFinderThread:67] [kafka.coms.consumer.kafka_topic_coms_esb_prod_coms.coms-timemachine.coms.coms04.snapdeal.com_coms04.snapdeal.com-1397812122323-509d9663-leader-finder-thread], Shutdown completed
[2014-04-18 16:02:52.106][kafka.coms.consumer.kafka_topic_coms_esb_prod_coms.coms-timemachine.coms.coms04.snapdeal.com_coms04.snapdeal.com-1397812122323-509d9663_watcher_executor][INFO][ConsumerFetcherManager:67] [ConsumerFetcherManager-1397812122507] Stopping all fetchers
[2014-04-18 16:02:52.106][kafka.coms.consumer.kafka_topic_coms_esb_prod_coms.coms-timemachine.coms.coms04.snapdeal.com_coms04.snapdeal.com-1397812122323-509d9663_watcher_executor][INFO][ConsumerFetcherThread:67] [ConsumerFetcherThread-kafka.coms.consumer.kafka_topic_coms_esb_prod_coms.coms-timemachine.coms.coms04.snapdeal.com_coms04.snapdeal.com-1397812122323-509d9663-0-0], Shutting down
[2014-04-18 16:02:52.107][ConsumerFetcherThread-kafka.coms.consumer.kafka_topic_coms_esb_prod_coms.coms-timemachine.coms.coms04.snapdeal.com_coms04.snapdeal.com-1397812122323-509d9663-0-0][INFO][SimpleConsumer:75] Reconnect due to socket error:
java.nio.channels.ClosedByInterruptException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:386)
    at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:220)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
    at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
    at kafka.utils.Utils$.read(Utils.scala:394)
    at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:67)
    at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
    at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
    at kafka.network.BlockingChannel.receive(BlockingChannel.scala:100)
    at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:73)
    at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:71)
    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:110)
    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:110)
    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:110)
    at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:109)
    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:109)
    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:109)
    at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
    at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:108)
    at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:96)
    at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:88)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
[2014-04-18 16:02:52.108][ConsumerFetcherThread-kafka.coms.consumer.kafka_topic_coms_esb_prod_coms.coms-timemachine.coms.coms04.snapdeal.com_coms04.snapdeal.com-1397812122323-509d9663-0-0][INFO][ConsumerFetcherThread:67] [ConsumerFetcherThread-kafka.coms.consumer.kafka_topic_coms_esb_prod_coms.coms-timemachine.coms.coms04.snapdeal.com_coms04.snapdeal.com-1397812122323-509d9663-0-0], Stopped
[2014-04-18 16:02:52.109][kafka.coms.consumer.kafka_topic_coms_esb_prod_coms.coms-timemachine.coms.coms04.snapdeal.com_coms04.snapdeal.com-1397812122323-509d9663_watcher_executor][INFO][ConsumerFetcherThread:67] [ConsumerFetcherThread-kafka.coms.consumer.kafka_topic_coms_esb_prod_coms.coms-timemachine.coms.coms04.snapdeal.com_coms04.snapdeal.com-1397812122323-509d9663-0-0], Shutdown completed
[2014-04-18 16:02:52.171][kafka.coms.consumer.kafka_topic_coms_esb_prod_coms.coms-timemachine.coms.coms04.snapdeal.com_coms04.snapdeal.com-1397812122323-509d9663_watcher_executor][INFO][ConsumerFetcherThread:67] [ConsumerFetcherThread-kafka.coms.consumer.kafka_topic_coms_esb_prod_coms.coms-timemachine.coms.coms04.snapdeal.com_coms04.snapdeal.com-1397812122323-509d9663-0-2], Shutting down
[2014-04-18 16:02:52.109][kafka.coms.consumer.kafka_topic_coms_esb_prod_coms.coms-timemachine.coms.coms04.snapdeal.com_coms04.snapdeal.com-1397812122323-509d9663-leader-finder-thread][INFO][ConsumerFetcherManager$LeaderFinderThread:67] [kafka.coms.consumer.kafka_topic_coms_esb_prod_coms.coms-timemachine.coms.coms04.snapdeal.com_coms04.snapdeal.com-1397812122323-509d9663-leader-finder-thread], Stopped
[2014-04-18 16:02:52.171][ConsumerFetcherThread-kafka.coms.consumer.kafka_topic_coms_esb_prod_coms.coms-timemachine.coms.coms04.snapdeal.com_coms04.snapdeal.com-1397812122323-509d9663-0-2][INFO][SimpleConsumer:75] Reconnect due to socket error:
java.nio.channels.ClosedByInterruptException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:386)
    at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:220)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
    at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
    at kafka.utils.Utils$.read(Utils.scala:394)
    at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:67)
    at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
    at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
    at kafka.network.BlockingChannel.receive(BlockingChannel.scala:100)
    at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:73)
    at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:71)
    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SimpleConsumer.scala:110)
    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:110)
    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1$$anonfun$apply$mcV$sp$1.apply(SimpleConsumer.scala:110)
    at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply$mcV$sp(SimpleConsumer.scala:109)
    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:109)
    at kafka.consumer.SimpleConsumer$$anonfun$fetch$1.apply(SimpleConsumer.scala:109)
    at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
    at kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:108)
    at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:96)
    at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:88)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)

We are currently using a 3-node Kafka cluster (0.8.0_beta) with 1 topic of 100 partitions. According to the Kafka documentation, GC may be the reason for too many rebalances, so I monitored my app with jstat but couldn't find any issue there.

[coms@coms04 coms-timemachine]$ jstat -gcutil 7419 1000 100
    S0     S1      E      O      P   YGC     YGCT  FGC      FGCT       GCT
 35.09   8.88 100.00  66.89  90.73  1119  820.300  102  1093.934  1914.234
  0.00  59.06  21.02  70.93  90.89  1119  821.362  102  1093.934  1915.296
  0.00  59.06  84.61  70.93  91.85  1119  821.362  102  1093.934  1915.296
 35.08  59.06 100.00  79.29  92.00  1120  821.362  102  1093.934  1915.296
 53.67   0.00  43.22  80.63  92.55  1120  822.511  102  1093.934  1916.445
 53.67   0.00  89.50  80.63  93.55  1120  822.511  102  1093.934  1916.445
 53.67  31.42 100.00  86.79  93.97  1121  822.511  102  1093.934  1916.445
  0.00  40.26   0.00  89.55  93.97  1121  823.580  103  1093.934  1917.514
  0.00  40.26   0.00  89.55  93.97  1121  823.580  103  1093.934  1917.514
  0.00  40.26   0.00  89.55  93.97  1121  823.580  103  1093.934  1917.514
  0.00  40.26   0.00  89.55  93.97  1121  823.580  103  1093.934  1917.514
  0.00  40.26   0.00  89.55  93.97  1121  823.580  103  1093.934  1917.514
  0.00  40.26   0.00  89.55  93.97  1121  823.580  103  1093.934  1917.514
  0.00  40.26   0.00  89.55  93.97  1121  823.580  103  1093.934  1917.514
  0.00  40.26   0.00  89.55  93.97  1121  823.580  103  1093.934  1917.514
  0.00  40.26   0.00  89.55  93.97  1121  823.580  103  1093.934  1917.514
  0.00  40.26   0.00  89.55  93.97  1121  823.580  103  1093.934  1917.514
  0.00  40.26   0.00  89.55  93.97  1121  823.580  103  1093.934  1917.514
  0.00   0.00  65.02  41.18  90.02  1121  823.580  103  1104.702  1928.282
 36.43   0.00   4.34  41.18  90.58  1122  823.979  103  1104.702  1928.681
 36.43   0.00  69.49  41.18  91.02  1122  823.979  103  1104.702  1928.681
 36.43  26.60 100.00  42.34  91.20  1123  823.979  103  1104.702  1928.681
  0.00  35.07  30.98  46.02  91.41  1123  824.749  103  1104.702  1929.451
  0.00  35.07  86.93  46.02  91.60  1123  824.749  103  1104.702  1929.451
  1.60   0.00  14.58  51.85  91.74  1124  825.198  103  1104.702  1929.900
  1.60   0.00  62.30  51.85  91.98  1124  825.198  103  1104.702  1929.900
  0.00   9.65   6.49  52.08  92.19  1125  825.360  103  1104.702  1930.063
  0.00   9.65  58.28  52.08  92.47  1125  825.360  103  1104.702  1930.063
 19.19   9.65 100.00  53.38  92.63  1126  825.360  103  1104.702  1930.063
 25.31   0.00  51.89  53.48  92.87  1126  825.784  103  1104.702  1930.487
 25.31  11.21 100.00  55.24  93.04  1127  825.784  103  1104.702  1930.487
  0.00  99.97  39.63  57.69  93.22  1127  826.513  103  1104.702  1931.216
  0.00  99.97  86.22  57.69  93.40  1127  826.513  103  1104.702  1931.216
 72.36  99.97 100.00  57.69  93.51  1128  826.513  103  1104.702  1931.216
 85.99   0.00  44.82  57.69  93.76  1128  827.339  103  1104.702  1932.041
 85.99   0.83 100.00  66.34  94.09  1129  827.339  103  1104.702  1932.041
  0.00  36.82  12.21  73.88  94.17  1129  828.588  103  1104.702  1933.290
  0.00  36.82 100.00  73.88  94.94  1130  828.588  103  1104.702  1933.290
 47.42   0.00   5.23  79.99  95.02  1130  829.497  103  1104.702  1934.199
 47.42   0.00  69.50  79.99  96.14  1130  829.497  103  1104.702  1934.199
 47.42  19.36 100.00  80.78  97.13  1131  829.497  103  1104.702  1934.199
  0.00  99.99   0.00  87.63  97.13  1131  830.454  104  1104.702  1935.156
  0.00  99.99   0.00  87.63  97.13  1131  830.454  104  1104.702  1935.156
  0.00  99.99   0.00  87.63  97.13  1131  830.454  104  1104.702  1935.156
  0.00  99.99   0.00  87.63  97.13  1131  830.454  104  1104.702  1935.156
  0.00  99.99   0.00  87.63  97.13  1131  830.454  104  1104.702  1935.156
  0.00  99.99   0.00  87.63  97.13  1131  830.454  104  1104.702  1935.156
  0.00  99.99   0.00  87.63  97.13  1131  830.454  104  1104.702  1935.156
  0.00  99.99   0.00  87.63  97.13  1131  830.454  104  1104.702  1935.156
  0.00  99.99   0.00  87.63  97.13  1131  830.454  104  1104.702  1935.156
  0.00  99.99   0.00  87.63  97.13  1131  830.454  104  1104.702  1935.156
  0.00  99.99   0.00  87.63  97.13  1131  830.454  104  1104.702  1935.156
  0.00  99.99   0.00  87.63  97.13  1131  830.454  104  1104.702  1935.156
  0.00  99.99   0.00  87.63  97.13  1131  830.454  104  1104.702  1935.156
  0.00  99.99   0.00  87.63  97.13  1131  830.454  104  1104.702  1935.156
  0.00  99.99   0.00  87.63  97.13  1131  830.454  104  1104.702  1935.156
  0.00   0.00  33.49  59.22  89.53  1131  830.454  104  1119.378  1949.831
  0.00   0.00  87.06  59.22  90.67  1131  830.454  104  1119.378  1949.831
 24.49   0.00  19.01  59.22  91.22  1132  830.770  104  1119.378  1950.147
 24.49   0.00  72.05  59.22  91.46  1132  830.770  104  1119.378  1950.147
 24.49  39.18 100.00  59.22  91.64  1133  830.770  104  1119.378  1950.147
  0.00  99.99  44.95  59.22  91.82  1133  831.338  104  1119.378  1950.715
  0.00  99.99  94.30  59.22  92.17  1133  831.338  104  1119.378  1950.715
 58.40   0.00  11.63  59.22  92.22  1134  831.993  104  1119.378  1951.370
 58.40   0.00  58.77  59.22  92.46  1134  831.993  104  1119.378  1951.370
 58.40  27.41 100.00  59.22  92.55  1135  831.993  104  1119.378  1951.370
  0.00  63.03  30.83  59.22  92.71  1135  832.675  104  1119.378  1952.053
  0.00  63.03  81.31  59.22  92.94  1135  832.675  104  1119.378  1952.053
 55.01  63.03 100.00  59.22  92.98  1136  832.675  104  1119.378  1952.053
 79.90   0.00  37.84  59.22  93.14  1136  833.582  104  1119.378  1952.960
 79.90   0.00  92.75  59.22  93.24  1136  833.582  104  1119.378  1952.960
 79.90  69.59 100.00  59.22  93.30  1137  833.582  104  1119.378  1952.960
  0.00 100.00  54.86  59.77  93.65  1137  834.765  104  1119.378  1954.143
 31.63 100.00  98.64  60.31  94.04  1138  834.765  104  1119.378  1954.143
100.00 100.00  98.64  63.88  94.04  1138  834.765  104  1119.378  1954.143
100.00   0.00  50.43  65.69  94.50  1138  836.225  104  1119.378  1955.603
100.00  14.88 100.00  66.31  95.25  1139  836.225  104  1119.378  1955.603
100.00 100.00 100.00  67.98  95.25  1139  836.225  104  1119.378  1955.603
  0.00 100.00  43.37  72.29  95.82  1139  837.636  104  1119.378  1957.014
 24.55 100.00 100.00  76.41  96.40  1140  837.636  104  1119.378  1957.014
 93.50 100.00 100.00  80.50  96.40  1140  837.636  104  1119.378  1957.014
100.00   0.00  62.48  82.20  96.79  1140  839.089  104  1119.378  1958.467
100.00  22.78 100.00  88.93  97.05  1141  839.089  104  1119.378  1958.467
  0.00  96.13   0.00  90.89  97.05  1141  840.580  105  1119.378  1959.958
  0.00  96.13   0.00  90.89  97.05  1141  840.580  105  1119.378  1959.958
  0.00  96.13   0.00  90.89  97.05  1141  840.580  105  1119.378  1959.958
  0.00  96.13   0.00  90.89  97.05  1141  840.580  105  1119.378  1959.958
  0.00  96.13   0.00  90.89  97.05  1141  840.580  105  1119.378  1959.958
  0.00  96.13   0.00  90.89  97.05  1141  840.580  105  1119.378  1959.958
  0.00  96.13   0.00  90.89  97.05  1141  840.580  105  1119.378  1959.958
  0.00  96.13   0.00  90.89  97.05  1141  840.580  105  1119.378  1959.958
  0.00  96.13   0.00  90.89  97.05  1141  840.580  105  1119.378  1959.958
  0.00  96.13   0.00  90.89  97.05  1141  840.580  105  1119.378  1959.958
  0.00  96.13   0.00  90.89  97.05  1141  840.580  105  1119.378  1959.958
  0.00   0.00  53.78  46.66  89.50  1141  840.580  105  1129.520  1970.101
  0.27   0.00 100.00  46.66  89.75  1142  840.580  105  1129.520  1970.101
 26.45   0.00  33.61  46.66  89.77  1142  840.924  105  1129.520  1970.444
 26.45   0.00  84.87  46.66  89.81  1142  840.924  105  1129.520  1970.444
  0.00   2.68  16.33  50.87  89.83  1143  841.276  105  1129.520  1970.796
  0.00   2.68  63.77  50.87  89.87  1143  841.276  105  1129.520  1970.796

I'm not sure what is happening here. Any leads would be helpful.

Regards,
Ankit TYagi