Was there any error in the controller and the state-change logs?
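For reference, with the stock log4j.properties that ships with the broker, the
controller and state-change loggers go to their own files (controller.log and
state-change.log) under the Kafka logs directory, so a quick grep along these
lines should surface anything relevant; the path below is only a placeholder
for wherever your logs actually live:

    # assumes the default log4j.properties layout; adjust the path to your install
    grep -iE "error|exception" /path/to/kafka/logs/controller.log
    grep -iE "error|exception" /path/to/kafka/logs/state-change.log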
Thanks,

Jun

On Wed, Apr 9, 2014 at 11:18 AM, Marcin Michalski <mmichal...@tagged.com> wrote:

> Hi, has anyone upgraded their kafka from 0.8.0 to 0.8.1 successfully one
> broker at a time on a live cluster?
>
> I am seeing strange behaviors where many of my kafka topics become
> unusable (by both consumers and producers). When that happens, I see lots
> of errors in the server logs that look like this:
>
> [2014-04-09 10:38:14,669] WARN [KafkaApi-1007] Fetch request with correlation id 2455 from client ReplicaFetcherThread-15-1007 on partition [risk,0] failed due to Topic risk either doesn't exist or is in the process of being deleted (kafka.server.KafkaApis)
> [2014-04-09 10:38:14,669] WARN [KafkaApi-1007] Fetch request with correlation id 2455 from client ReplicaFetcherThread-7-1007 on partition [message,0] failed due to Topic message either doesn't exist or is in the process of being deleted (kafka.server.KafkaApis)
>
> When I try to consume a message from a topic that complained about the
> Topic not existing (above warning), I get the below exception:
>
> ....topic message --from-beginning
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
> [2014-04-09 10:40:30,571] WARN [console-consumer-90716_dkafkadatahub07.tag-dev.com-1397065229615-7211ba72-leader-finder-thread], Failed to add leader for partitions [message,0]; will retry (kafka.consumer.ConsumerFetcherManager$LeaderFinderThread)
> kafka.common.UnknownTopicOrPartitionException
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>     at java.lang.Class.newInstance0(Class.java:355)
>     at java.lang.Class.newInstance(Class.java:308)
>     at kafka.common.ErrorMapping$.exceptionFor(ErrorMapping.scala:79)
>     at kafka.consumer.SimpleConsumer.earliestOrLatestOffset(SimpleConsumer.scala:167)
>     at kafka.consumer.ConsumerFetcherThread.handleOffsetOutOfRange(ConsumerFetcherThread.scala:60)
>     at kafka.server.AbstractFetcherThread$$anonfun$addPartitions$2.apply(AbstractFetcherThread.scala:179)
>     at kafka.server.AbstractFetcherThread$$anonfun$addPartitions$2.apply(AbstractFetcherThread.scala:174)
>     at scala.collection.immutable.Map$Map1.foreach(Map.scala:119)
>     at kafka.server.AbstractFetcherThread.addPartitions(AbstractFetcherThread.scala:174)
>     at kafka.server.AbstractFetcherManager$$anonfun$addFetcherForPartitions$2.apply(AbstractFetcherManager.scala:86)
>     at kafka.server.AbstractFetcherManager$$anonfun$addFetcherForPartitions$2.apply(AbstractFetcherManager.scala:76)
>     at scala.collection.immutable.Map$Map1.foreach(Map.scala:119)
>     at kafka.server.AbstractFetcherManager.addFetcherForPartitions(AbstractFetcherManager.scala:76)
>     at kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:95)
>     at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
> ----------
>
> *More details about my issues:*
> My current configuration in the environment where I am testing the
> upgrade is 4 physical servers running 2 brokers each with controlled
> shutdown feature enabled.
> When I shut down the 2 brokers on one of the existing Kafka 0.8.0
> machines, upgrade that machine to 0.8.1 and restart it, all is fine for a
> bit. Once the new brokers came up, I ran kafka-preferred-replica-election.sh
> to make sure the newly started brokers became leaders of the existing
> topics. The replication factor on the topics is set to 4. I tested both
> producing and consuming messages against brokers that were leaders with
> kafka 0.8.0 and 0.8.1, and no issues were encountered.
>
> Later, I tried to perform the controlled shutdown of the 2 additional
> brokers on the Kafka server that still has 0.8.0 installed, and after
> those brokers shut down and new leaders were assigned, all of my server
> logs are getting filled up with the above exceptions and most of my
> topics are not usable. I pulled and built the 0.8.1 kafka code from git
> last Thursday, so I should be pretty much up to date. So I am not sure
> whether I am doing something wrong or whether migrating from 0.8.0 to
> 0.8.1 on a live cluster one server at a time is simply not supported. Is
> there a recommended migration approach one should take when migrating a
> live 0.8.0 cluster to 0.8.1?
>
> The leader of one of the topics that became unusable is the broker that
> was successfully upgraded to 0.8.1:
>
> Topic:message  PartitionCount:1  ReplicationFactor:4  Configs:
>     Topic: message  Partition: 0  *Leader: 1007*  Replicas: 1007,8,9,1001  Isr: 1001,1007,8
>
> Brokers 9 and 1009 were shut down on one physical server that had kafka
> 0.8.0 installed when these problems started occurring (I was planning to
> upgrade them to 0.8.1). The only way I can recover from this state is to
> shut down all brokers, delete all of the kafka topic logs plus the
> zookeeper kafka directory, and start with a new cluster.
>
> Your help in this matter is greatly appreciated.
>
> Thanks,
> Martin
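For anyone following the thread, the per-broker cycle described above maps
roughly to the commands below. This is only a sketch of the procedure as
described, not a confirmed upgrade recipe: the ZooKeeper connect string
(zkhost:2181) and the relative bin/ and config/ paths are placeholders for
the local setup, and controlled.shutdown.enable is the 0.8.1 broker setting
assumed to be behind the "controlled shutdown feature enabled" mentioned
above.

    # config/server.properties on the broker being bounced (0.8.1)
    controlled.shutdown.enable=true

    # stop the broker cleanly, swap in the 0.8.1 build, then start it again
    bin/kafka-server-stop.sh
    bin/kafka-server-start.sh config/server.properties

    # once the restarted brokers are back in the ISR, move leadership back to them
    bin/kafka-preferred-replica-election.sh --zookeeper zkhost:2181

    # check leader / replica / ISR state for an affected topic
    bin/kafka-topics.sh --describe --zookeeper zkhost:2181 --topic message

Whether this rolling sequence is actually safe on a mixed 0.8.0/0.8.1
cluster is the open question in this thread.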