Noa Resare created KAFKA-10314:
----------------------------------

             Summary: KafkaStorageException on reassignment when offline log directories exist
                 Key: KAFKA-10314
                 URL: https://issues.apache.org/jira/browse/KAFKA-10314
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 2.5.0
            Reporter: Noa Resare


If a partition is reassigned to a broker that has an offline log directory, 
the new broker fails to become a follower and instead raises a 
KafkaStorageException, which causes the reassignment to stall indefinitely. The 
error message we see is the following:

{{[2020-07-23 13:11:08,727] ERROR [Broker id=1] Skipped the become-follower 
state change with correlation id 14 from controller 1 epoch 1 for partition 
t2-0 (last update controller epoch 1) with leader 2 since the replica for the 
partition is offline due to disk error 
org.apache.kafka.common.errors.KafkaStorageException: Can not create log for 
t2-0 because log directories /tmp/kafka/d1 are offline (state.change.logger)}}

It seems to me that unless the partition in question already existed in the 
offline log directory, a better behaviour would be simply to assign the 
partition to one of the available log directories.
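
The suggested behaviour could be sketched roughly as follows. This is a minimal Python model, not Kafka code: the function name choose_log_dir, its parameters, and the "fewest partitions" selection rule are all assumptions made for illustration. It also assumes the broker can tell which partitions previously lived in an offline directory, which may not be possible in practice.

```python
from typing import Dict, Optional, Set


def choose_log_dir(partition: str,
                   online_dirs: Dict[str, int],
                   partitions_on_offline_dirs: Set[str]) -> Optional[str]:
    """Pick an online log directory for a partition, or None to refuse.

    online_dirs maps each online directory to its current partition count.
    partitions_on_offline_dirs is a hypothetical input: the partitions known
    to have lived in a directory that is now offline.
    """
    # Never re-create a replica that may still exist on an offline disk
    # (the situation KAFKA-4763 was about).
    if partition in partitions_on_offline_dirs:
        return None
    # Nothing online to place the replica on.
    if not online_dirs:
        return None
    # Assumed placement rule: the online directory with the fewest partitions.
    return min(online_dirs, key=online_dirs.get)
```

With online directories {"/tmp/kafka/d2": 3, "/tmp/kafka/d3": 1} and no record of t2-0 in an offline directory, the sketch places t2-0 in /tmp/kafka/d3 instead of failing.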

The conditional in 
[LogManager.scala:769|https://github.com/apache/kafka/blob/11f75691b87fcecc8b29bfd25c7067e054e408ea/core/src/main/scala/kafka/log/LogManager.scala#L769]
 was introduced to prevent the issue in 
[KAFKA-4763|https://issues.apache.org/jira/browse/KAFKA-4763] where partitions 
in offline logdirs would be re-created in an online directory as soon as a 
LeaderAndISR message gets processed. However, the semantics of isNew seem 
different in LogManager (the replica is new on this broker) compared to where 
isNew is set in 
[KafkaController.scala|https://github.com/apache/kafka/blob/11f75691b87fcecc8b29bfd25c7067e054e408ea/core/src/main/scala/kafka/controller/KafkaController.scala#L879]
 (where it seems to refer to whether the topic partition itself is new; all 
followers get {{isNew=false}}).
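
To make the failure mode concrete, here is a simplified Python model of the check as I understand it. It is a sketch, not the LogManager source: the function name get_or_create_log, its parameters, and the "first online directory" placement are assumptions. The only parts taken from this report are the dependence on isNew together with the presence of offline directories, and the wording of the error message.

```python
from typing import Dict, List


class KafkaStorageException(Exception):
    """Stand-in for org.apache.kafka.common.errors.KafkaStorageException."""


def get_or_create_log(partition: str,
                      is_new: bool,
                      existing_logs: Dict[str, str],
                      online_dirs: List[str],
                      offline_dirs: List[str]) -> str:
    """Return the directory holding the partition's log, creating it if allowed."""
    # A log that already exists on this broker is returned as-is.
    if partition in existing_logs:
        return existing_logs[partition]
    # Modelled conditional: if the replica is not flagged as new and any
    # directory is offline, refuse to create it, since it might already
    # exist on the offline disk.
    if not is_new and offline_dirs:
        raise KafkaStorageException(
            f"Can not create log for {partition} because log directories "
            f"{','.join(offline_dirs)} are offline")
    # Assumed placement: simply use the first online directory.
    log_dir = online_dirs[0]
    existing_logs[partition] = log_dir
    return log_dir
```

In a reassignment the target broker has never hosted the partition, yet the controller sends isNew=false, so the call takes the exception branch even though /tmp/kafka/d2 is available. With is_new=True the same call succeeds.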



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
