[ https://issues.apache.org/jira/browse/KAFKA-16234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17815403#comment-17815403 ]

Gaurav Narula commented on KAFKA-16234:
---------------------------------------

Perhaps a way to solve this would be to determine whether a log is a stray replica at the time we load it, rather than after all logs have been loaded.

> Log directory failure re-creates partitions in another logdir automatically
> ---------------------------------------------------------------------------
>
>                 Key: KAFKA-16234
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16234
>             Project: Kafka
>          Issue Type: Bug
>          Components: jbod
>    Affects Versions: 3.7.0
>            Reporter: Gaurav Narula
>            Assignee: Omnia Ibrahim
>            Priority: Major
>
> With [KAFKA-16157|https://github.com/apache/kafka/pull/15263] we changed the
> {{HostedPartition.Offline}} enum variant to embed a {{Partition}} object.
> Further, {{ReplicaManager::getOrCreatePartition}} compares the old and new
> topicIds to decide whether it needs to create a new log.
> The getter for {{Partition::topicId}} relies on retrieving the topicId from
> the {{log}} field or from {{logManager.currentLogs}}. The former is set to
> {{None}} when a partition is marked offline, and the key for the partition is
> removed from the latter by {{LogManager::handleLogDirFailure}}.
> Therefore, the topicId for a partition marked offline always returns {{None}},
> and new logs for all partitions in a failed log directory are always created
> on another disk.
> The broker will fail to restart after the failed disk is repaired because the
> same partitions will exist in two different directories. The error does,
> however, tell the operator to remove the partitions from the disk that
> failed, which should help with broker startup.
> We can avoid this with KAFKA-16212, but in the short term an immediate
> solution can be to have the {{Partition}} object accept an {{Option[TopicId]}}
> in its constructor and fall back to {{log}} or {{logManager}} if it is
> unset.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
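
The short-term fix proposed in the issue description could look roughly like the sketch below. This is only an illustration, not the actual Kafka code: {{StubLog}}, {{StubLogManager}} and {{SketchPartition}} are hypothetical stand-ins for {{UnifiedLog}}, {{LogManager}} and {{Partition}}, and {{java.util.UUID}} stands in for Kafka's topic ID type. The point is only the lookup order: an explicitly supplied topicId first, then {{log}}, then the log manager.

```scala
import java.util.UUID
import scala.collection.mutable

// Hypothetical stand-ins for illustration; these are NOT Kafka's real classes.
final case class TopicPartition(topic: String, partition: Int)
final case class StubLog(topicId: Option[UUID])

// Minimal stand-in for the log manager: only the lookups relevant here.
final class StubLogManager {
  private val currentLogs = mutable.Map.empty[TopicPartition, StubLog]

  def put(tp: TopicPartition, log: StubLog): Unit = currentLogs(tp) = log

  // Mirrors the behaviour described in the issue: the key is dropped on failure.
  def handleLogDirFailure(tp: TopicPartition): Unit = { currentLogs.remove(tp); () }

  def getLog(tp: TopicPartition): Option[StubLog] = currentLogs.get(tp)
}

// Sketch of a partition that accepts an optional topicId in its constructor.
final class SketchPartition(
  val topicPartition: TopicPartition,
  logManager: StubLogManager,
  explicitTopicId: Option[UUID] = None
) {
  // Set to None when the partition is marked offline.
  @volatile var log: Option[StubLog] = None

  // Prefer the constructor value, then fall back to log, then to the log manager.
  def topicId: Option[UUID] =
    explicitTopicId
      .orElse(log.flatMap(_.topicId))
      .orElse(logManager.getLog(topicPartition).flatMap(_.topicId))
}
```

With this shape, a partition that was constructed with its topicId still reports it after the log directory failure has cleared both {{log}} and the log manager entry, so the old and new topicIds can still be compared.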