Hi Ron,

Thank you for having a look at this KIP.

Indeed, the log directory UUID should always be generated and loaded. I have corrected the wording in the KIP to clarify this.

It is a bit of a pain to replace the field, but I agree that it is the best approach, for the same reason you pointed out.

I have updated the log.dir.failure.timeout.ms config documentation to make it clear that it only applies when there are partitions being led from the failed directory.

Your understanding is correct regarding the snapshot result after the logical update when a broker transitions to multiple log directories. I have updated the KIP to clarify that.

> I wonder about the corner case where a broker that previously
> had multiple log dirs is restarted with a new config that specifies
> just a single log directory. What would happen here? If the broker
> were not the leader then perhaps it would replicate the data into the
> single log directory. What would happen if it were the leader of a
> partition that had been marked as offline? Would we have data loss
> even if other replicas still had data?

There would be no data loss. After the configuration change, the broker would register indicating a single log directory and OfflineLogDirs==false. This indicates to the controller that any replicas on this broker that referenced a different (and non-null/default) log directory require a leadership update, which would prevent this broker from becoming a leader for those partitions. Those partitions are then created by the broker in the single configured log directory, and streamed from the new leaders.

Does this make sense?

Thanks,

--
Igor