[jira] [Comment Edited] (HDDS-13599) Take write Lock of all block files before a container replica directory is deleted

Sammi Chen (Jira) Wed, 03 Sep 2025 02:05:06 -0700


    [ 
https://issues.apache.org/jira/browse/HDDS-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18017854#comment-18017854
 ]


Sammi Chen edited comment on HDDS-13599 at 9/3/25 9:04 AM:
-----------------------------------------------------------

[~szetszwo], HDDS-13602 adds code internal to 
[DiskBalancerService.java|https://github.com/apache/ozone/pull/8965/files#diff-6a42145a948e71a42c99cbf5148422b190c40d0293baabc88f9ae3a3f2ae83d3]
 to solve the problem. If a follow-up Jira does the refactoring and solves the 
problem completely, we can either revert HDDS-13602 after that or keep it. 
HDDS-13602 doesn't change any read/write IO path code, so it should not make 
the refactoring any harder or easier. 

Besides chunk read, there is also chunk write, and besides the 
KeyValueContainer lock, there is also the chunk file lock. Shall we remove the 
chunk file lock and use the KeyValueContainer lock only? Or shall we keep both 
the KeyValueContainer lock and the chunk file lock? If chunk read takes the 
KeyValueContainer lock, shall chunk write take it too, and as a read lock or a 
write lock? What is the impact on concurrency? The main performance concern is 
the degradation in concurrency, not the extra lock acquisition step itself. 
There are many details to investigate and think through, so I think it is not 
a small refactor. Thoughts? 
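
To make the question concrete, here is a minimal sketch of one of the options 
above: chunk read and chunk write both take the container lock in shared 
(read) mode, and directory deletion takes it in exclusive (write) mode. The 
class and method names (ContainerLockSketch, withChunkIo, deleteReplicaDir) 
are hypothetical and are not taken from the Ozone code base; only 
java.util.concurrent is used.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

/** Hypothetical stand-in for a container replica; NOT Ozone's KeyValueContainer. */
class ContainerLockSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  // Written only under the write lock, read only under the read lock.
  private boolean deleted = false;

  /** Chunk read or chunk write: shared lock, so chunk IO does not block other chunk IO. */
  <T> T withChunkIo(Supplier<T> io) {
    lock.readLock().lock();
    try {
      if (deleted) {
        throw new IllegalStateException("container replica already deleted");
      }
      return io.get();
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Directory deletion: exclusive lock, so it waits for in-flight chunk IO to finish. */
  void deleteReplicaDir(Runnable deleteDir) {
    lock.writeLock().lock();
    try {
      deleteDir.run();
      deleted = true;
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

Under this assumption, chunk IO only blocks while a deletion holds the write 
lock, and IO that arrives after the deletion fails cleanly instead of hitting 
a missing block file. The sketch does not answer whether the chunk file lock 
is still needed to serialize writers to the same file, and it does not measure 
the concurrency impact.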


was (Author: sammi):
[~szetszwo], HDDS-13602 adds code internal to 
[DiskBalancerService.java|https://github.com/apache/ozone/pull/8965/files#diff-6a42145a948e71a42c99cbf5148422b190c40d0293baabc88f9ae3a3f2ae83d3]
 to solve the problem. If a follow-up Jira does the refactoring and solves the 
problem completely, we can revert HDDS-13602 after that. HDDS-13602 doesn't 
change any read/write IO path code, so it should not make the refactoring any 
harder or easier. 

Besides chunk read, there is also chunk write, and besides the 
KeyValueContainer lock, there is also the chunk file lock. Shall we remove the 
chunk file lock and use the KeyValueContainer lock only? Or shall we keep both 
the KeyValueContainer lock and the chunk file lock? If chunk read takes the 
KeyValueContainer lock, shall chunk write take it too, and as a read lock or a 
write lock? What is the impact on concurrency? The main performance concern is 
the degradation in concurrency, not the extra lock acquisition step itself. 
There are many details to investigate and think through, so I think it is not 
a small refactor. Thoughts? 

> Take write Lock of all block files before a container replica directory is 
> deleted
> ----------------------------------------------------------------------------------
>
>                 Key: HDDS-13599
>                 URL: https://issues.apache.org/jira/browse/HDDS-13599
>             Project: Apache Ozone
>          Issue Type: Improvement
>            Reporter: Sammi Chen
>            Priority: Major
>         Attachments: screenshot-1.png
>
>
> To avoid interim read failures caused by block files being deleted during 
> container replica directory deletion. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

