And this is the first message in server.log after the problem started:
[2017-12-09 03:10:49,947] ERROR [KafkaApi-1] Error when handling request {controller_id=0,controller_epoch=1,partition_states=[{topic=LIVETOPIC,partition=31,controller_epoch=1,leader=4,leader_epoch=1,isr=[4,5],zk_version=1,replicas=[0,4,5]},{topic=LIVETOPIC,partition=9,controller_epoch=1,leader=1,leader_epoch=1,isr=[1],zk_version=1,replicas=[6,0,1]},{topic=__consumer_offsets,partition=27,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[0,5,6]},{topic=__consumer_offsets,partition=19,controller_epoch=1,leader=3,leader_epoch=1,isr=[3,4],zk_version=1,replicas=[6,3,4]},{topic=LIVETOPICOLD,partition=10,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[6,2,3]},{topic=LIVETOPICOLD,partition=32,controller_epoch=1,leader=1,leader_epoch=1,isr=[1],zk_version=1,replicas=[0,6,1]},{topic=__consumer_offsets,partition=13,controller_epoch=1,leader=3,leader_epoch=1,isr=[3,4],zk_version=1,replicas=[0,3,4]},{topic=LIVETOPIC,partition=17,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[0,2,3]},{topic=__consumer_offsets,partition=5,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[6,1,2]},{topic=LIVETOPICOLD,partition=18,controller_epoch=1,leader=4,leader_epoch=1,isr=[4,5],zk_version=1,replicas=[0,4,5]},{topic=LIVETOPICOLD,partition=45,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[6,1,2]},{topic=LIVETOPIC,partition=3,controller_epoch=1,leader=1,leader_epoch=1,isr=[1],zk_version=1,replicas=[0,6,1]},{topic=LIVETOPIC,partition=30,controller_epoch=1,leader=3,leader_epoch=1,isr=[3,4],zk_version=1,replicas=[6,3,4]},{topic=LIVETOPICOLD,partition=4,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[0,2,3]},{topic=__consumer_offsets,partition=48,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[0,2,3]},{topic=LIVETOPIC,partition=44,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[6,5,0]},{topic=__consumer_offsets,partition=40,controller_epoch=1,leader=1,leader_epoch=1,isr=[1],zk_version=1,replicas=[6,0,1]},{topic=LIVETOPICOLD,partition=31,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[6,5,0]},{topic=LIVETOPIC,partition=16,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[6,1,2]},{topic=LIVETOPIC,partition=38,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[0,5,6]},{topic=__consumer_offsets,partition=34,controller_epoch=1,leader=1,leader_epoch=1,isr=[1],zk_version=1,replicas=[0,6,1]},{topic=LIVETOPICOLD,partition=17,controller_epoch=1,leader=3,leader_epoch=1,isr=[3,4],zk_version=1,replicas=[6,3,4]},{topic=LIVETOPICOLD,partition=39,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[0,1,2]},{topic=__consumer_offsets,partition=26,controller_epoch=1,leader=4,leader_epoch=1,isr=[4,5],zk_version=1,replicas=[6,4,5]},{topic=LIVETOPIC,partition=24,controller_epoch=1,leader=3,leader_epoch=1,isr=[3,4],zk_version=1,replicas=[0,3,4]},{topic=LIVETOPIC,partition=2,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[6,5,0]},{topic=__consumer_offsets,partition=20,controller_epoch=1,leader=4,leader_epoch=1,isr=[4,5],zk_version=1,replicas=[0,4,5]},{topic=__consumer_offsets,partition=12,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[6,2,3]},{topic=LIVETOPICOLD,partition=3,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[6,1,2]},{topic=LIVETOPICOLD,partition=2
5,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[0,5,6]},{topic=LIVETOPIC,partition=10,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[0,1,2]},{topic=__consumer_offsets,partition=6,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[0,2,3]},{topic=LIVETOPICOLD,partition=11,controller_epoch=1,leader=3,leader_epoch=1,isr=[3,4],zk_version=1,replicas=[0,3,4]},{topic=__consumer_offsets,partition=47,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[6,1,2]},{topic=LIVETOPICOLD,partition=38,controller_epoch=1,leader=1,leader_epoch=1,isr=[1],zk_version=1,replicas=[6,0,1]},{topic=__consumer_offsets,partition=41,controller_epoch=1,leader=1,leader_epoch=1,isr=[1,2],zk_version=1,replicas=[0,1,2]},{topic=LIVETOPIC,partition=23,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[6,2,3]},{topic=LIVETOPIC,partition=45,controller_epoch=1,leader=1,leader_epoch=1,isr=[1],zk_version=1,replicas=[0,6,1]},{topic=__consumer_offsets,partition=33,controller_epoch=1,leader=5,leader_epoch=1,isr=[5],zk_version=1,replicas=[6,5,0]},{topic=LIVETOPICOLD,partition=24,controller_epoch=1,leader=4,leader_epoch=1,isr=[4,5],zk_version=1,replicas=[6,4,5]},{topic=LIVETOPICOLD,partition=46,controller_epoch=1,leader=2,leader_epoch=1,isr=[2,3],zk_version=1,replicas=[0,2,3]},{topic=LIVETOPIC,partition=37,controller_epoch=1,leader=4,leader_epoch=1,isr=[4,5],zk_version=1,replicas=[6,4,5]}],live_brokers=[{id=2,end_points=[{port=9095,host=1.1.1.2,security_protocol_type=0}],rack=null},{id=5,end_points=[{port=9098,host=1.1.1.2,security_protocol_type=0}],rack=null},{id=3,end_points=[{port=9096,host=1.1.1.2,security_protocol_type=0}],rack=null},{id=4,end_points=[{port=9097,host=1.1.1.2,security_protocol_type=0}],rack=null},{id=1,end_points=[{port=9094,host=1.1.1.2,security_protocol_type=0}],rack=null}]} (kafka.server.KafkaApis)

On Sat, Dec 9, 2017 at 11:45 AM, Abhit [AxesTrack] <abhit.kalso...@axestrack.com> wrote:

> Dear Krishna
>
> What kind of network problems? And are you talking about
> zookeeper.connection.timeout.ms? By default it's 6000.
>
> On Dec 9, 2017 10:49, "R Krishna" <krishna...@gmail.com> wrote:
>
> This is a known issue for us in 0.10 due to network-related problems with
> ZK causing a no-leader exception, and restarting quickly fixed it. You can
> increase the timeout to alleviate the problem a bit.
>
> On Dec 8, 2017 8:20 PM, "Abhit Kalsotra" <abhit...@gmail.com> wrote:
>
> > Guys, can I get any reply or help on this? It has been occurring very
> > frequently in my production environment. Please help.
> >
> > Abhi
> >
> > On Dec 6, 2017 13:24, "Abhit Kalsotra" <abhit...@gmail.com> wrote:
> >
> > > Hello *
> > >
> > > I have been running Kafka (*0.10.2.0*) on Windows for the past year.
> > >
> > > But of late there have been unusual broker issues that I have observed
> > > 4-5 times in the last 4 months.
> > >
> > > Kafka setup config:
> > >
> > > *3 ZK instances running on 3 different Windows servers, 7 Kafka broker
> > > nodes running on a single Windows machine, each with a different disk
> > > for its log directory.*
> > >
> > > *My Kafka has 2 topics with 50 partitions each and a replication
> > > factor of 3.*
> > >
> > > *My partition selection logic*: each message has a unique ID, the
> > > partition is chosen as (unique ID % 50), and the Kafka producer API is
> > > then called to route the message to that specific topic partition.
> > >
> > > But of late there has been a new kind of error cropping up in the Kafka
> > > broker nodes:
> > >
> > > [2017-12-02 02:47:40,024] ERROR [ReplicaFetcherThread-0-4], Error for
> > > partition [__consumer_offsets,15] to broker 4:org.apache.kafka.common.
> > > errors.NotLeaderForPartitionException: This server is not the leader for
> > > that topic-partition. (kafka.server.ReplicaFetcherThread)
> > >
> > > The entire server.log is filled with these messages, and it is very
> > > large too. Please help me understand under what circumstances these can
> > > occur and what measures I need to take.
> > >
> > > Courtesy
> > > Abhi
> > > !wq
> > >
> > > --
> > >
> > > If you can't succeed, call it version 1.0
> >
>

--
If you can't succeed, call it version 1.0
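P.S. For anyone reading along, the partition-routing logic described in the quoted Dec 6 message would look roughly like the sketch below with the Java producer API. This is only an illustration, not the actual production code: the class name, message ID, payload, and serializers are placeholders, while the topic name and broker address are taken from the log above.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class IdBasedRouting {

        private static final int NUM_PARTITIONS = 50;     // both topics have 50 partitions
        private static final String TOPIC = "LIVETOPIC";  // topic name taken from the log above

        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "1.1.1.2:9094"); // one of the brokers from the log
            props.put("key.serializer",
                      "org.apache.kafka.common.serialization.LongSerializer");
            props.put("value.serializer",
                      "org.apache.kafka.common.serialization.StringSerializer");

            try (Producer<Long, String> producer = new KafkaProducer<>(props)) {
                long uniqueId = 12345L;                     // placeholder message ID
                // Partition selection as described in the thread: unique ID % 50
                // (assumes IDs are non-negative, otherwise the result could be negative).
                int partition = (int) (uniqueId % NUM_PARTITIONS);

                // The explicit-partition constructor bypasses the default partitioner
                // and sends the record straight to the chosen partition.
                producer.send(new ProducerRecord<>(TOPIC, partition, uniqueId, "payload"));
            }
        }
    }

An alternative to passing the partition explicitly is to implement org.apache.kafka.clients.producer.Partitioner and register it via the partitioner.class producer property, which keeps the routing rule in one place.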
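P.P.S. Regarding the timeout Krishna mentions: on the broker side the relevant settings in server.properties are, as far as I know, zookeeper.session.timeout.ms (default 6000) and zookeeper.connection.timeout.ms (which falls back to the session timeout if unset). The values below are only an example of raising them, not a tuned recommendation:

    # server.properties (example values only)
    zookeeper.session.timeout.ms=30000
    zookeeper.connection.timeout.ms=30000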