[ https://issues.apache.org/jira/browse/KAFKA-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chandra kasiraju resolved KAFKA-6613.
-------------------------------------
       Resolution: Fixed
    Fix Version/s:     (was: 0.11.0.2)
                   1.0.0

> The controller shouldn't stop partition reassignment after an exception is being thrown
> ---------------------------------------------------------------------------------------
>
>                 Key: KAFKA-6613
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6613
>             Project: Kafka
>          Issue Type: Bug
>          Components: admin, config, controller, core
>    Affects Versions: 0.11.0.2
>            Reporter: chandra kasiraju
>            Priority: Major
>             Fix For: 1.0.0
>
>
> I issued a partition reassignment command. It created the following entry in
> ZooKeeper, but the entry never gets deleted because the partition reassignment
> hangs and some exceptions appear in the Kafka logs. After that, no matter how
> many hours pass, the partitions are never moved to the other brokers.
>
> *Path in ZooKeeper*
> get /admin/reassign_partitions
> {"version":1,"partitions":[
> {"topic":"__consumer_offsets","partition":44,"replicas":[1003,1001,1004,1002]},
> {"topic":"683ad5e0-3775-4adc-ab55-82fda0761ba9_newTopic9","partition":0,"replicas":[1003,1004,1001,1002]},
> {"topic":"683ad5e0-3775-4adc-ab55-82fda0761ba9_newTopic1","partition":0,"replicas":[1003,1004,1001,1002]},
> {"topic":"__CruiseControlMetrics","partition":0,"replicas":[1002,1001,1004,1003]},
> {"topic":"b1c39c85-aee5-4ea0-90a1-9fc7eedc635b_topic","partition":0,"replicas":[1003,1004,1001,1002]},
> {"topic":"88ec4bd5-e149-4c98-8e8e-952e86ba5fae_topic","partition":4,"replicas":[1002,1004,1003,1001]},
> {"topic":"c8c56723-73a5-4a37-93bf-b8ecaf766429_topic","partition":4,"replicas":[1002,1003,1004,1001]},
> {"topic":"683ad5e0-3775-4adc-ab55-82fda0761ba9_newTopic9","partition":4,"replicas":[1002,1004,1003,1001]},
> {"topic":"b1c39c85-aee5-4ea0-90a1-9fc7eedc635b_topic","partition":4,"replicas":[1003,1001,1004,1002]},
> {"topic":"9db0cad2-69f8-4e85-b663-cd3987bd90fe_topic","partition":0,"replicas":[1003,1001,1004]},
> {"topic":"683ad5e0-3775-4adc-ab55-82fda0761ba9_topic","partition":1,"replicas":[1003,1004,1001,1002]}
> ]}
> cZxid = 0x5000052f8
> ctime = Tue Mar 06 01:27:54 UTC 2018
> mZxid = 0x500005359
> mtime = Tue Mar 06 01:28:06 UTC 2018
> pZxid = 0x5000052f8
> cversion = 0
> dataVersion = 13
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 1114
> numChildren = 0
>
>
> *Exception*
>
> ERROR [KafkaApi-1002] Error when handling request {replica_id=1005,max_wait_time=500,min_bytes=1,max_bytes=10485760,isolation_level=0,topics=[{topic=__consumer_offsets,partitions=[{partition=41,fetch_offset=0,log_start_offset=0,max_bytes=1048576}]}]} (kafka.server.KafkaApis)
> kafka.common.NotAssignedReplicaException: Leader 1002 failed to record follower 1005's position 0 since the replica is not recognized to be one of the assigned replicas 1001,1002,1004 for partition __consumer_offsets-41.
>     at kafka.cluster.Partition.updateReplicaLogReadResult(Partition.scala:274)
>     at kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:1092)
>     at kafka.server.ReplicaManager$$anonfun$updateFollowerLogReadResults$2.apply(ReplicaManager.scala:1089)
>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>     at kafka.server.ReplicaManager.updateFollowerLogReadResults(ReplicaManager.scala:1089)
>     at kafka.server.ReplicaManager.fetchMessages(ReplicaManager.scala:623)
>     at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:606)
>     at kafka.server.KafkaApis.handle(KafkaApis.scala:98)
>     at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:66)
>     at java.lang.Thread.run(Thread.java:745)
>
>
> I expected it to recover from that exception, move the partitions to the
> other nodes, and finally remove the entries in /admin/reassign_partitions
> once the move has completed.
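
For anyone checking whether a reassignment is still stuck, the following is a minimal sketch that parses a `/admin/reassign_partitions` payload of the shape quoted in the report and lists the topic-partitions still waiting to be moved, with their target replicas. The function name `pending_reassignments` and the shortened `sample` payload are assumptions made for illustration only; reading the znode itself (for example with `get /admin/reassign_partitions`) is not covered here.

```python
import json
from typing import Dict, List


def pending_reassignments(payload: str) -> Dict[str, List[int]]:
    """Map each 'topic-partition' in the payload to its target replica list.

    `payload` is assumed to be the JSON stored in /admin/reassign_partitions,
    i.e. {"version": 1, "partitions": [{"topic", "partition", "replicas"}, ...]}.
    """
    data = json.loads(payload)
    return {
        "{}-{}".format(entry["topic"], entry["partition"]): entry["replicas"]
        for entry in data.get("partitions", [])
    }


# Shortened, cleaned-up excerpt of the payload from the report (illustrative).
sample = (
    '{"version":1,"partitions":['
    '{"topic":"__consumer_offsets","partition":44,"replicas":[1003,1001,1004,1002]},'
    '{"topic":"__CruiseControlMetrics","partition":0,"replicas":[1002,1001,1004,1003]}'
    "]}"
)

if __name__ == "__main__":
    for topic_partition, replicas in pending_reassignments(sample).items():
        print(topic_partition, "->", replicas)
```

If the same topic-partitions keep showing up in this listing long after the command was issued, that matches the hang described in the report.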