This is because we populate the key in
ReplicaManager.highWatermarkCheckpoints using the "dirs" config, but look
up the key using log.dir.getParent. So, if you have a trailing slash in the
config, they won't match. This seems to be a bug that we should fix. Could
you file a jira?
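
To illustrate the mismatch, here is a minimal Scala sketch. It is not the
actual ReplicaManager code: the object name TrailingSlashDemo and the helpers
checkpointKeys, lookupKey and normalizedKeys are hypothetical, and it assumes
a Unix-like system, where java.io.File drops a trailing slash when it
normalizes a path.

```scala
import java.io.File

object TrailingSlashDemo {
  // Hypothetical stand-in for the checkpoint map keys: the raw config strings.
  def checkpointKeys(configuredDirs: Seq[String]): Set[String] =
    configuredDirs.toSet

  // Key derived on the lookup side: the parent of the partition directory.
  // java.io.File normalizes the path, so a trailing slash is not preserved.
  def lookupKey(configuredDir: String, partition: String): String =
    new File(new File(configuredDir), partition).getParent

  // One possible remedy (an assumption, not the committed fix): pass the
  // config values through File so both sides use the normalized path.
  def normalizedKeys(configuredDirs: Seq[String]): Set[String] =
    configuredDirs.map(d => new File(d).getPath).toSet

  def main(args: Array[String]): Unit = {
    val dir = "/home/maxime/opt/kafka/data/kafka/" // trailing slash, as in server.properties
    val key = lookupKey(dir, "test-0")

    println(s"lookup key:            $key")
    println(s"raw keys contain it:   ${checkpointKeys(Seq(dir)).contains(key)}")
    println(s"normalized contain it: ${normalizedKeys(Seq(dir)).contains(key)}")
  }
}
```

With the raw config string as the key, the lookup misses, which would explain
the "key not found" exception in state-change.log. Keying both sides on the
normalized path is only one way to make them agree.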

Thanks,

Jun


On Tue, Jul 30, 2013 at 9:18 AM, Maxime Petazzoni <maxime.petazz...@turn.com> wrote:

> Nope, that's on a pretty standard GNU/Linux Debian system (jessie/sid)
> running a 3.9.8-1 kernel. But you were onto something. Removing the
> trailing slash in my log.dir config value made it work.
>
> I'm not sure why this would have an impact since the log directories seem
> to be correctly parsed as File objects in LogManager.scala:
>
>   val logDirs: Array[File] = config.logDirs.map(new File(_)).toArray
>
> And in both cases the server logs report that the log for 'test-0' was
> correctly loaded, which means that log directory is also correctly inserted
> into the logs Pool[TopicAndPartition, Log].
>
> That's about as far as my Scala knowledge goes though ;-) Let me know if
> you're able to reproduce the problem when you have a trailing slash as well.
>
> Thanks,
> /Maxime
> --
> Maxime Petazzoni
> Sr. Platform Engineer
> m 408.310.0595
> www.turn.com
>
> ________________________________________
> From: Jun Rao [jun...@gmail.com]
> Sent: Monday, July 29, 2013 9:32 PM
> To: users@kafka.apache.org
> Subject: Re: Leader not local for partition error?
>
> Are you on Windows? We have seen issues like that before on Windows. You
> may have to use "/" when configuring  "log.dirs".
>
> Thanks,
>
> Jun
>
>
> On Mon, Jul 29, 2013 at 4:50 PM, Maxime Petazzoni <maxime.petazz...@turn.com> wrote:
>
> > Same issue with the 0.8 beta1 tarball. There is something interesting in
> > state-change.log though:
> >
> > [2013-07-29 16:47:26,708] TRACE Broker 0 received LeaderAndIsr request
> > correlationId 6 from controller 0 epoch 1 starting the become-leader
> > transition for partition [test,0] (state.change.logger)
> > [2013-07-29 16:47:26,736] ERROR Error on broker 0 while processing
> > LeaderAndIsr request correlationId 6 received from controller 0 epoch 1 for
> > partition (test,0) (state.change.logger)
> > java.util.NoSuchElementException: key not found:
> > /home/maxime/opt/kafka/data/kafka
> >         at scala.collection.MapLike$class.default(MapLike.scala:223)
> >         at scala.collection.immutable.Map$Map1.default(Map.scala:93)
> >         at scala.collection.MapLike$class.apply(MapLike.scala:134)
> >         at scala.collection.immutable.Map$Map1.apply(Map.scala:93)
> >         at kafka.cluster.Partition.getOrCreateReplica(Partition.scala:83)
> >         ...
> >
> > I have log.dir=/home/maxime/opt/kafka/data/kafka/ in server.properties.
> > That directory obviously exists after Kafka starts, and contains:
> >
> > find /home/maxime/opt/kafka/data/kafka/
> > /home/maxime/opt/kafka/data/kafka/
> > /home/maxime/opt/kafka/data/kafka/test-0
> > /home/maxime/opt/kafka/data/kafka/test-0/00000000000000000000.log
> > /home/maxime/opt/kafka/data/kafka/test-0/00000000000000000000.index
> > /home/maxime/opt/kafka/data/kafka/.lock
> >
> > Which is expected, given I have a single 'test' topic with a single
> > partition.
> >
> > Any ideas? Can you reproduce the problem on your end with a freshly
> > extracted tarball?
> >
> > Thanks,
> > /Max
> > --
> > Maxime Petazzoni
> > Sr. Platform Engineer
> > m 408.310.0595
> > www.turn.com
> >
> > ________________________________________
> > From: Jun Rao [jun...@gmail.com]
> > Sent: Sunday, July 21, 2013 9:38 PM
> > To: users@kafka.apache.org
> > Subject: Re: Leader not local for partition error?
> >
> > Any error/exception in state-change or controller log? Also, could you try
> > the 0.8 beta1 release?
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Mon, Jul 15, 2013 at 1:36 PM, Maxime Petazzoni <maxime.petazz...@turn.com> wrote:
> >
> > > Hi all,
> > >
> > > I'm not sure if I'm doing something wrong or if I missed a step
> > > somewhere. A little while ago I successfully got the 0.8 quickstart
> > > example to work fine with the console producer/consumer. Then I went to
> > > work on some code to learn how to implement a producer, which failed
> > > with the producer not being able to send anything with the following
> > > error in the logs:
> > >
> > >   Produce request with correlation id 11 failed due to [test,0]:
> > > kafka.common.NotLeaderForPartitionException
> > >
> > > So I went back to trying the console producer, and I'm getting the same
> > > error. To be sure, I removed all generated data by ZooKeeper and Kafka
> > > and re-followed the steps of the quickstart guide, but I'm getting the
> > > same error with the console producer/consumer.
> > >
> > > kafka-list-topic.sh correctly lists my 1-partition, 1-replica test
> > > topic:
> > >
> > >   topic: test partition: 0  leader: 0 replicas: 0 isr: 0
> > >
> > > ZK and the broker are of course both up and running. Starting the
> > > producer nothing out of the ordinary happens, but when starting the
> > > consumer (before attempting to send anything), I get the following
> > > exception:
> > >
> > >   [2013-07-15 13:25:46,487] INFO
> > > [ConsumerFetcherThread-console-consumer-943_polygon-1373919946074-f478ba53-0-0],
> > > Starting  (kafka.consumer.ConsumerFetcherThread)
> > >   [2013-07-15 13:25:46,517] ERROR
> > > [console-consumer-943_polygon-1373919946074-f478ba53-leader-finder-thread],
> > > Error due to  (kafka.consumer.ConsumerFetcherManager$LeaderFinderThread)
> > >   kafka.common.NotLeaderForPartitionException
> > >     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> > >
> > > I'm at a loss as to what's going on here. When the broker starts it
> > > clearly goes through the election process and becomes the leader (since
> > > it's the only broker anyway) for the 'test' topic:
> > >
> > >   [2013-07-15 13:33:05,345] INFO 0 successfully elected as leader
> > > (kafka.server.ZookeeperLeaderElector)
> > >   [2013-07-15 13:33:05,550] INFO [Replica Manager on Broker 0]: Handling
> > > LeaderAndIsr request
> > > Name:LeaderAndIsrRequest;Version:0;Controller:0;ControllerEpoch:3;CorrelationId:0;ClientId:id_0-host_null-port_9092;PartitionState:(test,0)
> > > ->
> > > (LeaderAndIsrInfo:(Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:0);Leaders:id:0,host:
> > > polygon.turn.com,port:9092 (kafka.server.ReplicaManager)
> > >   [2013-07-15 13:33:05,551] INFO New leader is 0
> > > (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
> > >   [2013-07-15 13:33:05,563] INFO [Kafka Server 0], Started
> > > (kafka.server.KafkaServer)
> > >   [2013-07-15 13:33:05,563] INFO [ReplicaFetcherManager on broker 0]
> > > Removing fetcher for partition [test,0] (kafka.server.ReplicaFetcherManager)
> > >   [2013-07-15 13:33:05,566] INFO [Replica Manager on Broker 0]: Handled
> > > leader and isr request
> > > Name:LeaderAndIsrRequest;Version:0;Controller:0;ControllerEpoch:3;CorrelationId:0;ClientId:id_0-host_null-port_9092;PartitionState:(test,0)
> > > ->
> > > (LeaderAndIsrInfo:(Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:1),ReplicationFactor:1),AllReplicas:0);Leaders:id:0,host:
> > > polygon.turn.com,port:9092 (kafka.server.ReplicaManager)
> > >
> > > I'm running Kafka from branch 0.8 (b1891e7). Any idea what's going on
> > > there?
> > >
> > > Thanks,
> > > /Max
> > >
> > > --
> > > Maxime Petazzoni
> > > Sr. Platform Engineer
> > > m 408.310.0595
> > > www.turn.com
> > >
> >
>
