[
https://issues.apache.org/jira/browse/KAFKA-3228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Olson updated KAFKA-3228:
--------------------------------
Fix Version/s: 0.10.1.0
> Partition reassignment failure for brokers freshly added to cluster
> -------------------------------------------------------------------
>
> Key: KAFKA-3228
> URL: https://issues.apache.org/jira/browse/KAFKA-3228
> Project: Kafka
> Issue Type: Bug
> Components: controller
> Affects Versions: 0.8.2.1
> Reporter: Andrew Olson
> Assignee: Neha Narkhede
> Fix For: 0.10.1.0
>
>
> After adding about 20 new brokers to double the size of an existing
> production Kafka deployment, we were initially unable to reassign any
> partitions to 5 of the 20 when attempting to rebalance; there was no
> problem with the other 15. The controller broker logged error messages like:
> {noformat}
> ERROR kafka.controller.KafkaController: [Controller 19]: Error completing reassignment of partition [TOPIC-NAME,2]
> kafka.common.KafkaException: Only 4,33 replicas out of the new set of replicas 4,34,33 for partition [TOPIC-NAME,2] to be reassigned are alive. Failing partition reassignment
>     at kafka.controller.KafkaController.initiateReassignReplicasForTopicPartition(KafkaController.scala:611)
>     at kafka.controller.PartitionsReassignedListener$$anonfun$handleDataChange$4$$anonfun$apply$6.apply$mcV$sp(KafkaController.scala:1203)
>     at kafka.controller.PartitionsReassignedListener$$anonfun$handleDataChange$4$$anonfun$apply$6.apply(KafkaController.scala:1197)
>     at kafka.controller.PartitionsReassignedListener$$anonfun$handleDataChange$4$$anonfun$apply$6.apply(KafkaController.scala:1197)
>     at kafka.utils.Utils$.inLock(Utils.scala:535)
>     at kafka.controller.PartitionsReassignedListener$$anonfun$handleDataChange$4.apply(KafkaController.scala:1196)
>     at kafka.controller.PartitionsReassignedListener$$anonfun$handleDataChange$4.apply(KafkaController.scala:1195)
>     at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:224)
>     at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:403)
>     at kafka.controller.PartitionsReassignedListener.handleDataChange(KafkaController.scala:1195)
>     at org.I0Itec.zkclient.ZkClient$7.run(ZkClient.java:751)
>     at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
> {noformat}
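> For context, the error above comes from a sanity check in the controller:
> every replica in the new assignment must map to a broker the controller
> currently considers live, or the reassignment is failed outright. A minimal
> sketch of that check, reconstructed from the stack trace and error text
> (names like liveBrokerIds are approximations, not the exact 0.8.2.1 source):
> {noformat}
> object ReassignmentCheckSketch {
>   // Sketch of the liveness check behind the error above. liveBrokerIds
>   // stands in for the controller's cached set of live broker ids.
>   def checkReassignmentIsViable(partition: String,
>                                 newReplicas: Seq[Int],
>                                 liveBrokerIds: Set[Int]): Unit = {
>     val aliveNewReplicas = newReplicas.filter(liveBrokerIds.contains)
>     if (aliveNewReplicas != newReplicas)
>       throw new RuntimeException(
>         ("Only %s replicas out of the new set of replicas %s for partition " +
>           "%s to be reassigned are alive. Failing partition reassignment")
>           .format(aliveNewReplicas.mkString(","),
>                   newReplicas.mkString(","), partition))
>   }
>
>   // checkReassignmentIsViable("[TOPIC-NAME,2]", Seq(4, 34, 33), Set(4, 33))
>   // throws with exactly the message shape above: broker 34 is missing from
>   // the live set even though the broker process itself is up.
> }
> {noformat}
> Since the new brokers were up and registered, this suggests the controller's
> cached view of live brokers simply did not include the 5 problem brokers.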
> We reattempted the reassignment to one of these new brokers, with the same
> result.
> We also saw these messages in the controller's log. There was a "Broken pipe"
> error for each of the new brokers.
> {noformat}
> 2016-02-09 12:13:22,082 WARN kafka.controller.RequestSendThread: [Controller-19-to-broker-34-send-thread], Controller 19 epoch 28 fails to send request Name:UpdateMetadataRequest...
> java.io.IOException: Broken pipe
>     at sun.nio.ch.FileDispatcherImpl.writev0(Native Method)
>     at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51)
>     at sun.nio.ch.IOUtil.write(IOUtil.java:148)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:504)
>     at java.nio.channels.SocketChannel.write(SocketChannel.java:502)
>     at kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:56)
>     at kafka.network.Send$class.writeCompletely(Transmission.scala:75)
>     at kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:26)
>     at kafka.network.BlockingChannel.send(BlockingChannel.scala:103)
>     at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
>     at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
>     at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
> {noformat}
> {noformat}
> WARN kafka.controller.RequestSendThread: [Controller-19-to-broker-34-send-thread], Controller 19 epoch 28 fails to send request Name:UpdateMetadataRequest... to broker id:34... Reconnecting to broker.
> java.io.EOFException: Received -1 when reading from channel, socket has likely been closed.
>     at kafka.utils.Utils$.read(Utils.scala:381)
>     at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
>     at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
>     at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
>     at kafka.network.BlockingChannel.receive(BlockingChannel.scala:111)
>     at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:133)
>     at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
>     at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
> {noformat}
> {noformat}
> INFO kafka.controller.RequestSendThread: [Controller-19-to-broker-34-send-thread], Controller 19 connected to id:34... for sending state change requests
> {noformat}
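> For what it's worth, the three excerpts above match the controller's
> per-broker send thread behavior: a blocking send fails, the thread
> reconnects, and a later attempt succeeds. A rough sketch of that loop as we
> understand it from the stack traces (the Channel trait below is a simplified
> stand-in for kafka.network.BlockingChannel, and the retry details are
> assumptions, not the actual 0.8.2.1 ControllerChannelManager code):
> {noformat}
> object SendLoopSketch {
>   // Simplified stand-in for kafka.network.BlockingChannel.
>   trait Channel {
>     def connect(): Unit
>     def disconnect(): Unit
>     def send(request: Array[Byte]): Unit   // "Broken pipe" surfaced here
>     def receive(): Array[Byte]             // "Received -1 ..." surfaced here
>   }
>
>   // Sketch of the send thread's doWork: retry the same request until it
>   // goes through, reconnecting after any IOException.
>   def sendUntilDone(channel: Channel, request: Array[Byte]): Unit = {
>     var sent = false
>     while (!sent) {
>       try {
>         channel.send(request)
>         channel.receive()
>         sent = true
>       } catch {
>         case _: java.io.IOException =>
>           // corresponds to the WARN "... Reconnecting to broker." lines
>           channel.disconnect()
>           Thread.sleep(300)
>           channel.connect()  // corresponds to the INFO "connected to id:34"
>       }
>     }
>   }
> }
> {noformat}
> If it works as sketched, these WARNs are transient and recoverable, which
> fits the INFO line showing the connection being reestablished.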
> There were no error messages in the new brokers' log files, just the normal
> startup logs. A jstack did not reveal anything unusual in the threads, and
> the network connections looked normal in netstat.
> We're running version 0.8.2.1. The new brokers were started simultaneously
> using a broadcast-style command; however, we also hit the same issue on a
> different Kafka cluster after starting the new brokers individually, about
> 30 seconds apart.
> After stopping and restarting the 5 problematic new brokers, the
> reassignment succeeded, and they are now functioning normally.
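> For reference, partition reassignment is normally driven with the standard
> kafka-reassign-partitions.sh tool, which writes the plan that the
> PartitionsReassignedListener in the first stack trace picks up. A
> representative invocation (the topic name, broker ids, and ZooKeeper host
> below are illustrative placeholders matching the error above, not our real
> values):
> {noformat}
> $ cat reassign.json
> {"version":1,
>  "partitions":[{"topic":"TOPIC-NAME","partition":2,"replicas":[4,34,33]}]}
>
> $ bin/kafka-reassign-partitions.sh --zookeeper zk-host:2181 \
>     --reassignment-json-file reassign.json --execute
>
> # check progress / completion
> $ bin/kafka-reassign-partitions.sh --zookeeper zk-host:2181 \
>     --reassignment-json-file reassign.json --verify
> {noformat}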
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)