Gaurav Narula created KAFKA-19458:
-------------------------------------
Summary: Successive AlterReplicaLogDirsRequest on a topic
partition may leak log segments
Key: KAFKA-19458
URL: https://issues.apache.org/jira/browse/KAFKA-19458
Project: Kafka
Issue Type: Bug
Affects Versions: 4.0.0, 3.9.1, 4.1.0
Reporter: Gaurav Narula
Successive {{AlterReplicaLogDirsRequest}}s that change the log directory of a
given topic partition may leak log segments. Consider the following scenario:
1. A request tries to change the logdir for topic partition {{tp}} from {{d1}}
to {{d2}}.
2. The handler invokes {{replicaManager#alterReplicaLogDirs}}.
3. The above method invokes {{partition#maybeCreateFutureReplica}}, which
creates a future replica, and cleaning for {{tp}} is paused because
{{logManager#abortAndPauseCleaning}} is invoked.
4. Now, *before* the previous request completes, another request arrives to
change the logdir from {{d2}} to {{d3}}.
5. This time, {{replicaManager#alterReplicaLogDirs}}'s call to
{{partition#futureReplicaDirChanged}} will return {{true}} and we remove the
fetcher and future.
6. We then re-create a future replica by invoking
{{partition#maybeCreateFutureReplica}} with {{d3}} and pause log cleaning for
{{tp}} *again*.
7. {{partition#maybeReplaceCurrentWithFutureReplica}} is invoked when the
future has caught up and the callback in it swaps the future log for the local
log and resumes cleaning by invoking {{LogManager#resumeCleaning}}.
8. The above decrements the count in {{LogCleaningState.logCleaningPaused}}
from {{2}} to {{1}} rather than to {{0}}, because cleaning was paused twice but
resumed only once. The log segment for the discarded future is therefore
leaked until a broker restart.
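
The counting in steps 3, 6 and 8 can be sketched with a small stand-alone model. This is only an illustration and not Kafka's actual code: the class {{PauseCounterSketch}}, its map-based counter and the helper {{replayScenario}} are hypothetical, and it assumes only what the scenario above states, namely that each pause increments a per-partition count and each resume decrements it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the per-partition pause bookkeeping described
// in the scenario; it is NOT Kafka's LogCleanerManager implementation.
final class PauseCounterSketch {
    // Assumption: one pause counter per topic partition.
    private final Map<String, Integer> pausedCount = new HashMap<>();

    // Models a pause request: each call adds one to the counter.
    void abortAndPauseCleaning(String tp) {
        pausedCount.merge(tp, 1, Integer::sum);
    }

    // Models a resume request: each call removes one from the counter.
    void resumeCleaning(String tp) {
        Integer count = pausedCount.get(tp);
        if (count == null) {
            throw new IllegalStateException("cleaning is not paused for " + tp);
        }
        if (count == 1) {
            pausedCount.remove(tp);
        } else {
            pausedCount.put(tp, count - 1);
        }
    }

    int pauseCount(String tp) {
        return pausedCount.getOrDefault(tp, 0);
    }

    boolean isPaused(String tp) {
        return pauseCount(tp) > 0;
    }

    // Replays the pause/resume calls from the scenario.
    static PauseCounterSketch replayScenario() {
        PauseCounterSketch state = new PauseCounterSketch();
        state.abortAndPauseCleaning("tp"); // step 3: future replica in d2
        state.abortAndPauseCleaning("tp"); // step 6: future replica re-created in d3
        state.resumeCleaning("tp");        // step 7: future log swapped in
        return state;
    }

    public static void main(String[] args) {
        PauseCounterSketch state = replayScenario();
        // prints "pause count = 1, paused = true"
        System.out.println("pause count = " + state.pauseCount("tp")
                + ", paused = " + state.isPaused("tp"));
    }
}
```

In this model a second {{resumeCleaning}} call would bring the count back to 0, which suggests the pauses and resumes need to be balanced when the first future replica is discarded in step 5. Whether that is the right fix for Kafka itself is not established here.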