showuon opened a new pull request, #12136:
URL: https://github.com/apache/kafka/pull/12136

   [jira](https://issues.apache.org/jira/browse/KAFKA-13773) 
   
   When `LogManager` starts up and runs `loadLogs`, we expect to catch any 
`IOException` (e.g., an out-of-space error) and mark the log dir as offline. 
Later, we handle the offline logDir in `ReplicaManager`, so that the 
`cleanShutdown` file won't be created. The reason the broker shuts down with a 
`cleanShutdown` file after the disk fills up is that during `loadLogs`, while 
doing log recovery, we write the `leader-epoch-checkpoint` file 
[here](https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/log/LogLoader.scala#L184).
 If an `IOException` is thrown there, we wrap it in a `KafkaStorageException` 
and rethrow it 
[here](https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/checkpoints/CheckpointFileWithFailureHandler.scala#L37-L42),
 so the existing `IOException` handling in `loadLogs` does not catch it.
   
   This PR fixes the issue by catching any `KafkaStorageException` whose cause 
is an `IOException` during `loadLogs`, and marking the logDir as offline so 
that `ReplicaManager` can handle the offline logDirs.
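
   To illustrate the idea, here is a minimal, self-contained sketch in Java. It 
is not the actual change in this PR: `LoadLogsSketch`, `StorageException` (a 
stand-in for `KafkaStorageException`), `DirLoader`, and `writeCheckpoint` are 
all hypothetical names made up for the example. It only shows the pattern of 
treating a wrapped `IOException` the same way as a direct one when deciding 
which log dirs go offline.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class LoadLogsSketch {

    /** Hypothetical stand-in for KafkaStorageException (an unchecked wrapper). */
    static final class StorageException extends RuntimeException {
        StorageException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical per-directory loading step. */
    interface DirLoader {
        void load(String dir) throws IOException;
    }

    /** Mimics a checkpoint write that wraps any IOException and rethrows it. */
    static void writeCheckpoint(String dir, boolean diskFull) {
        try {
            if (diskFull) {
                throw new IOException("No space left on device: " + dir);
            }
        } catch (IOException e) {
            throw new StorageException("Error while writing checkpoint in " + dir, e);
        }
    }

    /** Loads every dir and returns the ones that should be marked offline. */
    static List<String> loadLogs(List<String> dirs, DirLoader loader) {
        List<String> offline = new ArrayList<>();
        for (String dir : dirs) {
            try {
                loader.load(dir);
            } catch (IOException e) {
                // Direct I/O failure: mark the dir offline.
                offline.add(dir);
            } catch (StorageException e) {
                // Wrapped I/O failure: handle it like the direct case.
                if (e.getCause() instanceof IOException) {
                    offline.add(dir);
                } else {
                    throw e; // Anything else is not a disk problem; propagate.
                }
            }
        }
        return offline;
    }

    public static void main(String[] args) {
        List<String> offline = loadLogs(
            List.of("/data/a", "/data/b"),
            dir -> writeCheckpoint(dir, dir.equals("/data/b")));
        System.out.println(offline); // prints [/data/b]
    }
}
```

   In this sketch, a storage exception with any other cause is rethrown rather 
than swallowed, so only disk-level failures take a dir offline.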
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   

