Albert Strasheim created KAFKA-1509:
---------------------------------------

             Summary: Restart of destination broker after partition move leaves partitions without leader
                 Key: KAFKA-1509
                 URL: https://issues.apache.org/jira/browse/KAFKA-1509
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 0.8.1.1
            Reporter: Albert Strasheim


This should be reasonably easy to reproduce.

Make a Kafka cluster with a few machines.

Create a topic with partitions on these machines. No replication.

Bring up one more Kafka node.

Move some or all of the partitions onto this new broker:

kafka-reassign-partitions.sh --generate --zookeeper zk:2181 \
    --topics-to-move-json-file move.json --broker-list <new broker>

kafka-reassign-partitions.sh --zookeeper 36cfqd1.in.cfops.it:2181 \
    --reassignment-json-file reassign.json --execute
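
The contents of move.json and reassign.json are not included in this report. As a rough sketch, assuming the topic is test with partitions 0 and 1 and the new broker has id 7 (both inferred from the describe output further down), the two files could be produced like this. The helper names are hypothetical; the JSON layout follows the format kafka-reassign-partitions.sh accepts, and reassign.json is presumably the proposed assignment printed by the --generate step.

```python
import json


def topics_to_move(topics):
    """Document for --topics-to-move-json-file: the topics to reassign."""
    return {"version": 1, "topics": [{"topic": t} for t in topics]}


def reassignment(topic, partitions, broker_id):
    """Document for --reassignment-json-file.

    Each listed partition gets a single replica on broker_id, which matches
    the "no replication" setup described in this report.
    """
    return {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": p, "replicas": [broker_id]}
            for p in partitions
        ],
    }


if __name__ == "__main__":
    # Assumed values: topic "test", partitions 0 and 1, new broker id 7.
    print(json.dumps(topics_to_move(["test"])))
    print(json.dumps(reassignment("test", [0, 1], 7)))
```

Redirecting the two printed lines into move.json and reassign.json would give files suitable for the commands above, under those assumptions.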

Wait until the new broker is the leader for all the partitions you moved.

Send some data to the partitions. It all works.

Shut down the broker that just received the data, then start it back up. The topic description now shows both partitions without a leader:

{code}
Topic:test      PartitionCount:2        ReplicationFactor:1     Configs:
        Topic: test     Partition: 0    Leader: -1      Replicas: 7     Isr: 
        Topic: test     Partition: 1    Leader: -1      Replicas: 7     Isr: 
{code}
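
To check for this state without reading the output by eye, something like the following could be used. It is only a sketch: the function name is made up for illustration, and the regular expression assumes the describe output keeps the column layout shown above.

```python
import re

# Sample copied from the describe output in this report.
DESCRIBE_OUTPUT = """\
Topic:test      PartitionCount:2        ReplicationFactor:1     Configs:
        Topic: test     Partition: 0    Leader: -1      Replicas: 7     Isr:
        Topic: test     Partition: 1    Leader: -1      Replicas: 7     Isr:
"""

# Matches per-partition lines only; the "PartitionCount" header line is skipped
# because it has no "Partition:" field.
PARTITION_LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]*)"
)


def leaderless_partitions(describe_text):
    """Return (topic, partition, replicas) for every partition with Leader: -1."""
    result = []
    for line in describe_text.splitlines():
        m = PARTITION_LINE.search(line)
        if m and int(m.group("leader")) == -1:
            replicas = [int(r) for r in m.group("replicas").split(",") if r]
            result.append((m.group("topic"), int(m.group("partition")), replicas))
    return result


if __name__ == "__main__":
    for topic, partition, replicas in leaderless_partitions(DESCRIBE_OUTPUT):
        print(f"{topic}-{partition}: no leader, assigned replicas {replicas}")
```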

A leader for topic test never gets elected, even though this node is the only 
node that knows about the topic.

Some logs:
{code}
Jun 26 23:18:07 localhost kafka: INFO [Socket Server on Broker 7], Started 
(kafka.network.SocketServer)
Jun 26 23:18:07 localhost kafka: INFO [Socket Server on Broker 7], Started 
(kafka.network.SocketServer)
Jun 26 23:18:07 localhost kafka: INFO [ControllerEpochListener on 7]: 
Initialized controller epoch to 53 and zk version 52 
(kafka.controller.ControllerEpochListener)
Jun 26 23:18:07 localhost kafka: INFO Will not load MX4J, mx4j-tools.jar is not 
in the classpath (kafka.utils.Mx4jLoader$)
Jun 26 23:18:07 localhost kafka: INFO Will not load MX4J, mx4j-tools.jar is not 
in the classpath (kafka.utils.Mx4jLoader$)
Jun 26 23:18:07 localhost kafka: INFO [Controller 7]: Controller starting up 
(kafka.controller.KafkaController)
Jun 26 23:18:07 localhost kafka: INFO conflict in /controller data: 
{"version":1,"brokerid":7,"timestamp":"1403824687354"} stored data: 
{"version":1,"brokerid":4,"timestamp":"1403297911725"} (kafka.utils.ZkUtils$)
Jun 26 23:18:07 localhost kafka: INFO conflict in /controller data: 
{"version":1,"brokerid":7,"timestamp":"1403824687354"} stored data: 
{"version":1,"brokerid":4,"timestamp":"1403297911725"} (kafka.utils.ZkUtils$)
Jun 26 23:18:07 localhost kafka: INFO [Controller 7]: Controller startup 
complete (kafka.controller.KafkaController)
Jun 26 23:18:07 localhost kafka: INFO Registered broker 7 at path 
/brokers/ids/7 with address xxx:9092. (kafka.utils.ZkUtils$)
Jun 26 23:18:07 localhost kafka: INFO Registered broker 7 at path 
/brokers/ids/7 with address xxx:9092. (kafka.utils.ZkUtils$)
Jun 26 23:18:07 localhost kafka: INFO [Kafka Server 7], started 
(kafka.server.KafkaServer)
Jun 26 23:18:07 localhost kafka: INFO [Kafka Server 7], started 
(kafka.server.KafkaServer)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:3,ISR:3,LeaderEpoch:14,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:3)
 for partition [requests,0] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:11,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:1)
 for partition [requests,13] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:4,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:1,5)
 for partition [requests_ipv6,5] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:4,ISR:4,LeaderEpoch:13,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:4,5)
 for partition [requests_stored,7] in response to UpdateMetadata request sent 
by controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:5,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
 for partition [test,1] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:5,ISR:5,LeaderEpoch:17,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:5)
 for partition [requests,6] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:4,ISR:4,LeaderEpoch:7,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:1,4)
 for partition [requests_ipv6,0] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:6,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:2,1)
 for partition [requests_ipv6,6] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:5,ISR:5,LeaderEpoch:17,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:5)
 for partition [requests,10] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:8,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:1,5)
 for partition [requests_stored,4] in response to UpdateMetadata request sent 
by controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:3,ISR:3,LeaderEpoch:11,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:3,2)
 for partition [requests_stored,1] in response to UpdateMetadata request sent 
by controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:4,ISR:4,LeaderEpoch:13,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:3,4)
 for partition [requests_stored,6] in response to UpdateMetadata request sent 
by controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:11,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:1)
 for partition [requests,14] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:5,ISR:5,LeaderEpoch:17,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:5)
 for partition [requests,2] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:14,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
 for partition [test3,0] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:2,ISR:2,LeaderEpoch:8,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:2)
 for partition [requests,3] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:3,ISR:3,LeaderEpoch:9,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:3,2)
 for partition [requests_ipv6,7] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:9,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:3,1)
 for partition [requests_ipv6,2] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:3,ISR:3,LeaderEpoch:8,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:5,3)
 for partition [requests_ipv6,4] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:4,ISR:4,LeaderEpoch:15,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:4)
 for partition [requests,5] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:11,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:1)
 for partition [requests,4] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:2,ISR:2,LeaderEpoch:4,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:2,5)
 for partition [requests_ipv6,1] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:4,ISR:4,LeaderEpoch:15,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:4)
 for partition [requests,9] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:4,ISR:4,LeaderEpoch:11,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:4,2)
 for partition [requests_ipv6,3] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:2,ISR:2,LeaderEpoch:15,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:2)
 for partition [requests,11] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:3,ISR:3,LeaderEpoch:10,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:2,3)
 for partition [requests_stored,5] in response to UpdateMetadata request sent 
by controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:4,ISR:4,LeaderEpoch:14,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:4,2)
 for partition [requests_error,1] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:8,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
 for partition [test3,1] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:4,ISR:4,LeaderEpoch:14,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:5,4)
 for partition [requests_stored,3] in response to UpdateMetadata request sent 
by controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:4,ISR:4,LeaderEpoch:17,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:4,3)
 for partition [requests_stored,2] in response to UpdateMetadata request sent 
by controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:2,ISR:2,LeaderEpoch:8,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:2)
 for partition [requests,7] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:14,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:3,1)
 for partition [requests_error,0] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:1,ISR:1,LeaderEpoch:11,ControllerEpoch:53),ReplicationFactor:2),AllReplicas:2,1)
 for partition [requests_stored,0] in response to UpdateMetadata request sent 
by controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:6,ISR:6,LeaderEpoch:21,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:6)
 for partition [requests,1] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:6,ISR:6,LeaderEpoch:24,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:6)
 for partition [requests,12] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:5,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
 for partition [test,0] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:3,ISR:3,LeaderEpoch:14,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:3)
 for partition [requests,8] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 70 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 received LeaderAndIsr request 
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:5,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
 correlation id 71 from controller 4 epoch 53 for partition [test,0] 
(state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 received LeaderAndIsr request 
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:8,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
 correlation id 71 from controller 4 epoch 53 for partition [test3,1] 
(state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 received LeaderAndIsr request 
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:5,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
 correlation id 71 from controller 4 epoch 53 for partition [test,1] 
(state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 received LeaderAndIsr request 
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:14,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
 correlation id 71 from controller 4 epoch 53 for partition [test3,0] 
(state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 handling LeaderAndIsr request 
correlationId 71 from controller 4 epoch 53 starting the become-follower 
transition for partition [test,0] (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 handling LeaderAndIsr request 
correlationId 71 from controller 4 epoch 53 starting the become-follower 
transition for partition [test3,1] (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 handling LeaderAndIsr request 
correlationId 71 from controller 4 epoch 53 starting the become-follower 
transition for partition [test,1] (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 handling LeaderAndIsr request 
correlationId 71 from controller 4 epoch 53 starting the become-follower 
transition for partition [test3,0] (state.change.logger)
Jun 26 23:18:07 localhost kafka: ERROR Broker 7 aborted the become-follower 
state change with correlation id 71 from controller 4 epoch 53 for partition 
[test,0] since new leader -1 is not currently available (state.change.logger)
Jun 26 23:18:07 localhost kafka: ERROR Broker 7 aborted the become-follower 
state change with correlation id 71 from controller 4 epoch 53 for partition 
[test3,1] since new leader -1 is not currently available (state.change.logger)
Jun 26 23:18:07 localhost kafka: ERROR Broker 7 aborted the become-follower 
state change with correlation id 71 from controller 4 epoch 53 for partition 
[test,1] since new leader -1 is not currently available (state.change.logger)
Jun 26 23:18:07 localhost kafka: ERROR Broker 7 aborted the become-follower 
state change with correlation id 71 from controller 4 epoch 53 for partition 
[test3,0] since new leader -1 is not currently available (state.change.logger)
Jun 26 23:18:07 localhost kafka: INFO [ReplicaFetcherManager on broker 7] 
Removed fetcher for partitions  (kafka.server.ReplicaFetcherManager)
Jun 26 23:18:07 localhost kafka: INFO [ReplicaFetcherManager on broker 7] 
Removed fetcher for partitions  (kafka.server.ReplicaFetcherManager)
Jun 26 23:18:07 localhost kafka: INFO [ReplicaFetcherManager on broker 7] Added 
fetcher for partitions List() (kafka.server.ReplicaFetcherManager)
Jun 26 23:18:07 localhost kafka: INFO [ReplicaFetcherManager on broker 7] Added 
fetcher for partitions List() (kafka.server.ReplicaFetcherManager)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 completed LeaderAndIsr request 
correlationId 71 from controller 4 epoch 53 for the become-follower transition 
for partition [test,0] (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 completed LeaderAndIsr request 
correlationId 71 from controller 4 epoch 53 for the become-follower transition 
for partition [test3,1] (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 completed LeaderAndIsr request 
correlationId 71 from controller 4 epoch 53 for the become-follower transition 
for partition [test,1] (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 completed LeaderAndIsr request 
correlationId 71 from controller 4 epoch 53 for the become-follower transition 
for partition [test3,0] (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:5,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
 for partition [test,0] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 71 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:8,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
 for partition [test3,1] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 71 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:5,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
 for partition [test,1] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 71 (state.change.logger)
Jun 26 23:18:07 localhost kafka: TRACE Broker 7 cached leader info 
(LeaderAndIsrInfo:(Leader:-1,ISR:,LeaderEpoch:14,ControllerEpoch:53),ReplicationFactor:1),AllReplicas:7)
 for partition [test3,0] in response to UpdateMetadata request sent by 
controller 4 epoch 53 with correlation id 71 (state.change.logger)
{code}
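
The telling entries in the log above are the ERROR lines: broker 7 is the only assigned replica, yet the controller asks it to become a follower of leader -1, and the broker aborts the transition. A small filter along these lines could pull those entries out of a longer state-change log. This is a sketch only: the function name is hypothetical, and the pattern assumes the message wording shown in this report. Whitespace is collapsed first because the entries here are wrapped across lines.

```python
import re

# One wrapped entry copied from the log above.
SAMPLE = """\
Jun 26 23:18:07 localhost kafka: ERROR Broker 7 aborted the become-follower
state change with correlation id 71 from controller 4 epoch 53 for partition
[test,0] since new leader -1 is not currently available (state.change.logger)
"""

ABORTED = re.compile(
    r"ERROR Broker (?P<broker>\d+) aborted the become-follower state change "
    r"with correlation id \d+ from controller \d+ epoch \d+ "
    r"for partition \[(?P<topic>[^,\]]+),(?P<partition>\d+)\] "
    r"since new leader (?P<leader>-?\d+) is not currently available"
)


def aborted_follower_transitions(log_text):
    """Return one dict per aborted become-follower transition found in log_text."""
    flat = " ".join(log_text.split())  # undo line wrapping
    return [
        {
            "broker": int(m.group("broker")),
            "topic": m.group("topic"),
            "partition": int(m.group("partition")),
            "leader": int(m.group("leader")),
        }
        for m in ABORTED.finditer(flat)
    ]


if __name__ == "__main__":
    for entry in aborted_follower_transitions(SAMPLE):
        print(entry)
```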




--
This message was sent by Atlassian JIRA
(v6.2#6252)