Dan Swanson created KAFKA-975:
---------------------------------

             Summary: Leader not local for partition when partition is leader
                 Key: KAFKA-975
                 URL: https://issues.apache.org/jira/browse/KAFKA-975
             Project: Kafka
          Issue Type: Bug
          Components: replication
    Affects Versions: 0.8
         Environment: centos 6.4
            Reporter: Dan Swanson
            Assignee: Neha Narkhede

I have a two-server Kafka cluster (dev003 and dev004). I am following the example at the URL below, but using two servers with a single Kafka instance each instead of one server with two instances.

http://www.michael-noll.com/blog/2013/03/13/running-a-multi-broker-apache-kafka-cluster-on-a-single-node/

I am using the following trunk version:

commit c27c768463a5dc6be113f2e5b3e00bf8d9d9d602
Author: David Arthur <mum...@gmail.com>
Date:   Thu Jul 11 15:34:57 2013 -0700

    KAFKA-852, remove clientId from Offset{Fetch,Commit}Response. Reviewed by Jay.

------
[2013-07-16 10:56:50,279] INFO [Kafka Server 3], started (kafka.server.KafkaServer)
------
dan@linux-rr29:~/git-data/kafka-current-src> bin/kafka-topics.sh --zookeeper dev003:2181 --create --topic dadj1 --partitions 1 --replication-factor 2 2>/dev/null
Created topic "dadj1".
dan@linux-rr29:~/git-data/kafka-current-src>
-------
[2013-07-16 10:56:57,946] INFO [Replica Manager on Broker 3]: Handling LeaderAndIsr request Name:LeaderAndIsrRequest;Version:0;Controller:4;ControllerEpoch:19;CorrelationId:12;ClientId:id_4-host_dev004-port_9092;PartitionState:(dadj1,0) -> (LeaderAndIsrInfo:(Leader:3,ISR:3,4,LeaderEpoch:0,ControllerEpoch:19),ReplicationFactor:2),AllReplicas:3,4);Leaders:id:3,host:dev003,port:9092 (kafka.server.ReplicaManager)
[2013-07-16 10:56:57,959] INFO [ReplicaFetcherManager on broker 3] Removing fetcher for partition [dadj1,0] (kafka.server.ReplicaFetcherManager)
[2013-07-16 10:57:21,196] WARN [KafkaApi-3] Produce request with correlation id 2 from client on partition [dadj1,0] failed due to Leader not local for partition [dadj1,0] on broker 3 (kafka.server.KafkaApis)
-----
dan@linux-rr29:~/git-data/kafka-current-src> bin/kafka-topics.sh --zookeeper dev003:2181 --describe --topic dadj1 2>/dev/null
dadj1
    configs:
    partitions: 1
        topic: dadj1    partition: 0    leader: 3    replicas: 3,4    isr: 3,4
dan@linux-rr29:~/git-data/kafka-current-src>

The dev003 logs show that the server was elected leader and has the correct id of 3, and ZooKeeper shows dev003 as leader, but when I try to produce to the topic I get a failure because the server thinks it is not the leader. This occurs regardless of which server (dev003 or dev004) ends up as the leader.

Here is my config, which is the same on both servers except for the broker id and host names:

[root@dev003 kafka-current-src]# grep -v -e '^#' -e '^$' config/server.properties
broker.id=3
port=9092
host.name=dev003
num.network.threads=2
num.io.threads=2
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
log.dir=/opt/kafka/data/8.0/
num.partitions=1
log.flush.interval.messages=10000
log.flush.interval.ms=1000
log.retention.hours=168
log.segment.bytes=536870912
log.cleanup.interval.mins=1
zookeeper.connect=10.200.8.61:2181,10.200.8.62:2181,10.200.8.63:2181
zookeeper.connection.timeout.ms=1000000
kafka.metrics.polling.interval.secs=5
kafka.metrics.reporters=kafka.metrics.KafkaCSVMetricsReporter
kafka.csv.metrics.dir=/tmp/kafka_metrics
kafka.csv.metrics.reporter.enabled=false
[root@dev003 kafka-current-src]#

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
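
The report states that the two brokers' server.properties files are the same except for the broker id and host name. One way to check that claim when reproducing the issue might be a small diff of the two files, sketched below. This is only an illustration, not part of Kafka: the helper names (parse_properties, diff_properties) are hypothetical, the dev003 text is an excerpt of the config quoted above, and the dev004 text is an assumption derived from it (broker.id=4 and host.name=dev004, matching the ids seen in the LeaderAndIsr log line).

```python
def parse_properties(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no '=' at all
            props[key.strip()] = value.strip()
    return props


def diff_properties(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ."""
    keys = set(a) | set(b)
    return {k: (a.get(k), b.get(k)) for k in sorted(keys) if a.get(k) != b.get(k)}


# Excerpt of the dev003 settings quoted in the report.
DEV003 = """\
broker.id=3
port=9092
host.name=dev003
log.dir=/opt/kafka/data/8.0/
zookeeper.connect=10.200.8.61:2181,10.200.8.62:2181,10.200.8.63:2181
"""

# Assumed dev004 counterpart: the report only says it differs in
# broker id and host name, so those two values are swapped in here.
DEV004 = (
    DEV003.replace("broker.id=3", "broker.id=4")
          .replace("host.name=dev003", "host.name=dev004")
)

if __name__ == "__main__":
    differences = diff_properties(parse_properties(DEV003), parse_properties(DEV004))
    for key, (left, right) in differences.items():
        print(f"{key}: dev003={left} dev004={right}")
```

In practice the two strings would be read from each broker's config/server.properties; any key reported besides broker.id and host.name would point to an unintended difference between the brokers.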