[
https://issues.apache.org/jira/browse/HDDS-13599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18016921#comment-18016921
]
Sammi Chen edited comment on HDDS-13599 at 8/29/25 4:12 AM:
------------------------------------------------------------
[~szetszwo], Thread C will get the old path, because the "Container container"
it holds becomes stale after the new replica is added to the DN in-memory
ContainerSet; the "Container container" it holds always points to the old path.
After the container replica is copied to the destination volume, it calls
KeyValueHandler.importContainer() to import the container, and
KeyValueHandler.importContainer() returns a new "Container container" which
points to the new path. The new "Container container" is then added to the DN
in-memory ContainerSet, replacing the old "Container container" which is already
held by Thread C. Given the concurrency of DN reads, there could be hundreds of
such Thread C instances. That's why locking the file resolver doesn't work.
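To illustrate the stale reference described above, here is a minimal sketch.
It does not use the real Ozone classes: Replica, the map standing in for the
ContainerSet, and the /data/vol1 and /data/vol2 paths are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class StaleContainerReference {
    // Hypothetical stand-in for "Container container"; only the path matters here.
    static final class Replica {
        final long id;
        final String path;

        Replica(long id, String path) {
            this.id = id;
            this.path = path;
        }
    }

    // Returns { path seen by the reader, path now registered in the set }.
    static String[] demo() {
        // Hypothetical stand-in for the DN in-memory ContainerSet.
        Map<Long, Replica> containerSet = new ConcurrentHashMap<>();
        containerSet.put(1L, new Replica(1L, "/data/vol1/1"));

        // Thread C looks up the container and keeps the reference.
        Replica heldByReader = containerSet.get(1L);

        // The import creates a NEW object for the new path and replaces the map entry.
        containerSet.put(1L, new Replica(1L, "/data/vol2/1"));

        // The reader's reference is untouched by the replacement.
        return new String[] {heldByReader.path, containerSet.get(1L).path};
    }

    public static void main(String[] args) {
        String[] paths = demo();
        System.out.println("reader still sees: " + paths[0]);
        System.out.println("set now holds:     " + paths[1]);
    }
}
```

Replacing the map entry never changes the object a reader already holds, so
any lock taken at lookup time cannot redirect that reader to the new path.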
Having every chunk reader thread hold the read lock of KeyValueContainer, and
the replica deletion thread hold the write lock of KeyValueContainer, can help.
But this case, where a replica ONE container is being moved by the disk
balancer, is not a common one: first, the disk balancer is disabled by default,
and second, real users rarely use replica ONE data in production clusters. It's
not really worth making the chunk read threads acquire the lock of
KeyValueContainer for such a minority case, not to mention the performance
impact.
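For reference, the read/write lock alternative discussed above could look
roughly like the sketch below. ContainerLockSketch and its methods are
hypothetical and are not the KeyValueContainer API; the point is only that
every read would have to take the lock.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ContainerLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private boolean deleted = false;

    // Chunk reader: many readers may hold the read lock at the same time.
    // Returns false if the replica was already deleted.
    boolean readChunk(Runnable read) {
        lock.readLock().lock();
        try {
            if (deleted) {
                return false;
            }
            read.run();
            return true;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Replica deletion: the write lock is granted only once no reader holds the read lock.
    void deleteReplica(Runnable delete) {
        lock.writeLock().lock();
        try {
            delete.run();
            deleted = true;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

This makes deletion wait for in-flight reads, at the cost of a lock
acquisition on every chunk read, which is the overhead questioned above.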
Since this is a disk balancer case, we can solve it the disk balancer's way,
which is HDDS-13602: we delay the deletion of the old replica, to make sure all
chunk reader threads that hold the old "Container container" can finish
reading from the old replica's chunk files. Thoughts?
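A rough sketch of the delayed-deletion idea, assuming a simple fixed delay.
This is not the HDDS-13602 implementation; DelayedReplicaDeletion, its methods,
and the 200 ms delay are hypothetical and only show the shape of the approach.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class DelayedReplicaDeletion {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();

    // Schedule deletion of the old replica after a grace period, so readers that
    // still hold the old container object have time to finish.
    ScheduledFuture<?> scheduleDeletion(Runnable deleteOldReplica, long delay, TimeUnit unit) {
        return scheduler.schedule(deleteOldReplica, delay, unit);
    }

    void shutdown() {
        scheduler.shutdown();
    }

    // Returns { deleted right after scheduling, deleted after the delay elapsed }.
    static boolean[] demo() {
        DelayedReplicaDeletion deleter = new DelayedReplicaDeletion();
        AtomicBoolean deleted = new AtomicBoolean(false);
        ScheduledFuture<?> pending =
            deleter.scheduleDeletion(() -> deleted.set(true), 200, TimeUnit.MILLISECONDS);
        boolean before = deleted.get();
        try {
            pending.get(); // block until the delayed task has run
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            deleter.shutdown();
        }
        return new boolean[] {before, deleted.get()};
    }
}
```

The read path stays lock-free here; the trade-off is that the old replica
occupies disk space until the delay expires, and the delay has to be long
enough to cover the slowest in-flight read.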
> Take write Lock of all block files before a container replica directory is
> deleted
> ----------------------------------------------------------------------------------
>
> Key: HDDS-13599
> URL: https://issues.apache.org/jira/browse/HDDS-13599
> Project: Apache Ozone
> Issue Type: Improvement
> Reporter: Sammi Chen
> Priority: Major
> Attachments: screenshot-1.png
>
>
> To avoid interim read failure caused by block file deleted during container
> replica directory deletion.