Did you run the --verify option (http://kafka.apache.org/documentation.html#basic_ops_restarting) to check whether the reassignment process completed? Also, what version of Kafka are you using?
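
If not, the check would look something like this (assuming a ZooKeeper at localhost:2181 and the same JSON file you passed to --execute; adjust both for your setup):

  bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 \
    --reassignment-json-file expand-cluster-reassignment.json --verify

It should print a status (completed successfully / still in progress / failed) for each partition listed in the file.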
Thanks,

Jun

On Mon, Dec 1, 2014 at 7:16 PM, Andrew Jorgensen <ajorgen...@twitter.com.invalid> wrote:

> I unfortunately do not have any specific logs from these events, but I will
> try to describe them as accurately as possible to give an idea of the
> problem I saw.
>
> The odd behavior manifested itself when I bounced all of the Kafka
> processes on each of the servers in a 12-node cluster. A few weeks prior I
> had done a partition reassignment to add four new Kafka brokers to the
> cluster. This cluster has 4 topics, each with 350 partitions, a retention
> policy of 6 hours, and a replication factor of 1. Originally I attempted to
> run a migration on all of the topics and partitions, adding the 4 new nodes
> using the partition reassignment tool. This seemed to cause a lot of
> network congestion, and according to the logs some of the nodes were having
> trouble talking to each other. The network congestion lasted for the
> duration of the migration and began to ease toward the end. After the
> migration I confirmed that data was being stored on and served from the new
> brokers.
>
> Today I bounced each of the Kafka processes on each of the brokers to pick
> up a change made to the log4j properties. After bouncing one process I
> started seeing some strange errors on the four newer broker nodes that
> looked like:
>
> kafka.common.NotAssignedReplicaException: Leader 10 failed to record
> follower 7's position 0 for partition [topic-1,185] since the replica 7 is
> not recognized to be one of the assigned replicas 10 for partition
> [topic-2,185]
>
> and on the older Kafka brokers the errors looked like:
>
> [2014-12-01 17:06:04,268] ERROR [ReplicaFetcherThread-0-12], Error for
> partition [topic-1,175] to broker 12:class kafka.common.UnknownException
> (kafka.server.ReplicaFetcherThread)
>
> I proceeded to bounce the rest of the Kafka processes, and after that the
> errors seemed to stop. It wasn't until a few hours later that I noticed the
> amount of data stored on the 4 new Kafka brokers had dropped off
> significantly. When I ran a describe for the topics in the errors, it was
> clear that the partition assignments had been reverted to their state prior
> to the original migration that added the 4 new brokers. I am unsure why
> bouncing the Kafka processes would cause the state in ZooKeeper to get
> overwritten, given that it had seemed to be working for the last few weeks
> until the processes were restarted.
>
> My hunch is that the controller keeps some state about the world
> pre-reassignment and removes that state after it detects that the
> reassignment completed successfully. In this case the network congestion on
> each of the brokers caused the controller not to get notified when all the
> reassignments were completed, and it thus kept the pre-reassignment state
> around. When the processes were bounced, the controller read this state
> from ZooKeeper and reverted the current assignment to the pre-reassignment
> state. Has this behavior been observed before? Does this sound like a
> plausible understanding of what happened in this case?
>
> --
> Andrew Jorgensen
> @ajorgensen
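
One thing worth checking along those lines: an in-flight reassignment is tracked in ZooKeeper under the /admin/reassign_partitions path, and the controller deletes that znode only once it considers the reassignment complete. If the congestion kept the reassignment from ever completing in the controller's view, the znode should still be there. Something like the following should show it (assuming the zookeeper-shell.sh that ships with Kafka and a ZooKeeper at localhost:2181):

  bin/zookeeper-shell.sh localhost:2181 get /admin/reassign_partitions

If that path still exists weeks after the move, the reassignment never finished from the controller's perspective, which would be consistent with the reverted assignments you saw after the restart.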