xichen01 opened a new pull request, #6326: URL: https://github.com/apache/ozone/pull/6326
## What changes were proposed in this pull request?

The old PR https://github.com/apache/ozone/pull/6270 was reverted due to a bug. This PR fixes that bug and resubmits the change.

### Root cause

`executePutBlock` puts a future into `CommitWatcher#futureMap`, keyed by flush position. That future may be overwritten by the future from the last `executePutBlock` call, the one made on close, when both use the same key. The overwritten future is then no longer tracked, so the buffer cannot be released correctly.

### Fix

Change `CommitWatcher#futureMap` from `ConcurrentMap<Long, CompletableFuture<xxx>>` to `ConcurrentMap<Long, List<CompletableFuture<xxx>>>`. The map now records every future registered under the same key, so all of them can be released.

### A detailed log of this bug

```bash
2024-03-03 18:49:16,741 [Thread-955] ERROR storage.BlockOutputStream (BlockOutputStream.java:executePutBlock(512)) - executePutBlock putFlushFuture flushPos 4194304, flushFuture java.util.concurrent.CompletableFuture@8bc6bca[Not completed], close false, force false
2024-03-03 18:49:16,741 [Thread-955] ERROR storage.BlockOutputStream (RatisBlockOutputStream.java:putFlushFuture(120)) - putFlushFuture flushPos 4194304 flushFuture java.util.concurrent.CompletableFuture@8bc6bca[Not completed]
```

In the next log, we can see that the entry **{4194304, CompletableFuture@8bc6bca}** has been overwritten by **{4194304, CompletableFuture@4230f10c}**:

```bash
2024-03-03 18:49:16,741 [Thread-955] ERROR storage.BlockOutputStream (BlockOutputStream.java:executePutBlock(512)) - executePutBlock putFlushFuture flushPos 4194304, flushFuture java.util.concurrent.CompletableFuture@4230f10c[Not completed], close true, force true
2024-03-03 18:49:16,741 [Thread-955] ERROR storage.BlockOutputStream (RatisBlockOutputStream.java:putFlushFuture(120)) - putFlushFuture flushPos 4194304 flushFuture java.util.concurrent.CompletableFuture@4230f10c[Not completed]
2024-03-03 18:49:16,741 [Thread-955] ERROR storage.BlockOutputStream (RatisBlockOutputStream.java:waitOnFlushFutures(127)) - waitOnFlushFutures
getFutureMap keySet [4194304] keySet Value [java.util.concurrent.CompletableFuture@4230f10c[Not completed]]
```

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-10384

## How was this patch tested?

Two 10x10 runs of TestSecureOzoneRpcClient, TestFreonWithPipelineDestroy, and TestOzoneRpcClientWithRatis#ALL all passed. (To make all tests pass, this test code includes a fix for the unstable test `testParallelDeleteBucketAndCreateKey`, [HDDS-10143](https://issues.apache.org/jira/browse/HDDS-10143).)

https://github.com/xichen01/ozone/actions/runs/8138828724/attempts/1
https://github.com/xichen01/ozone/actions/runs/8138828724
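
For readers unfamiliar with the failure mode, the overwrite described under the root cause and the list-valued map used for the fix can be illustrated with a small standalone sketch. This is not the Ozone code: the class `FutureMapSketch`, its method names, the `Void` result type, and the choice of `CopyOnWriteArrayList` are all assumptions made for illustration. Only the map shapes and the key `4194304` come from the description and log above.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical illustration only; names and types are not taken from Ozone.
class FutureMapSketch {

  // Flush position taken from the log excerpt in the PR description.
  static final long FLUSH_POS = 4194304L;

  // Old shape: one future per key. A second put with the same key
  // replaces the first future, which is then no longer tracked.
  static int trackedWithSingleValue() {
    ConcurrentMap<Long, CompletableFuture<Void>> futureMap = new ConcurrentHashMap<>();
    futureMap.put(FLUSH_POS, new CompletableFuture<>()); // regular putBlock
    futureMap.put(FLUSH_POS, new CompletableFuture<>()); // putBlock on close
    return futureMap.size(); // only one future is still reachable
  }

  // New shape: a list of futures per key. Both futures stay tracked.
  static int trackedWithList() {
    ConcurrentMap<Long, List<CompletableFuture<Void>>> futureMap = new ConcurrentHashMap<>();
    add(futureMap, FLUSH_POS, new CompletableFuture<>());
    add(futureMap, FLUSH_POS, new CompletableFuture<>());
    return futureMap.values().stream().mapToInt(List::size).sum();
  }

  // Appends a future under the key, creating the list on first use.
  // CopyOnWriteArrayList is an assumption chosen for thread safety here.
  static void add(ConcurrentMap<Long, List<CompletableFuture<Void>>> map,
                  long key, CompletableFuture<Void> future) {
    map.computeIfAbsent(key, k -> new CopyOnWriteArrayList<>()).add(future);
  }

  public static void main(String[] args) {
    System.out.println("single-valued map tracks " + trackedWithSingleValue() + " future(s)");
    System.out.println("list-valued map tracks " + trackedWithList() + " future(s)");
  }
}
```

With the single-valued map, the first future becomes unreachable from the map once the second `put` runs, which matches the log where only `CompletableFuture@4230f10c` remains under key `4194304`. With the list-valued map, both futures can still be found and waited on.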
