[
https://issues.apache.org/jira/browse/HDDS-12924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ashish Kumar resolved HDDS-12924.
---------------------------------
Fix Version/s: 2.1.0
Resolution: Fixed
> Optimize DU for used space calculation
> --------------------------------------
>
> Key: HDDS-12924
> URL: https://issues.apache.org/jira/browse/HDDS-12924
> Project: Apache Ozone
> Issue Type: Sub-task
> Components: Ozone Datanode
> Reporter: Sumit Agrawal
> Assignee: Sumit Agrawal
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.1.0
>
>
> h3. Running DU over a disk holding tens of TB of container data is slow.
>
> *Solution 1:* Run DU over the non-Ozone path
> Challenges:
> # The root path of the mounted device is not known, so this needs extra
> configuration.
> # Permission problem: du does not count space under a path it has no
> permission to read.
>
> Based on the above concerns, it is *not feasible* to run du over the
> non-Ozone path.
>
> *Solution 2:* Run DU over the meta path only (excluding the container dir path)
> Ozone space usage is the sum of:
> # The size of all container data, as tracked in memory
> # DU over the volume path as done currently, but excluding the container path
>
> *Space not counted:*
> * Space used by duplicate containers is not counted (these are ignored
> during startup in the EC case)
> * Space used by corrupted containers that have not been deleted is not
> counted
> * Container-path metadata files, such as the container YAML and exported
> RocksDB SST files, are not counted; these might add up to a few GB of data
> in a large cluster.
> This space is added to the +non-ozone+ used space, and the unaccounted
> containers in particular need to be cleaned up.
>
> *Impact of inaccuracy of Used Space:*
> * Used space in reporting (may be off by a few GB, as per above)
> * Adjustment between Ozone available space and reserved available space
> This inaccuracy has little impact on the solution: it is already present in
> the existing approach, because “du” runs asynchronously while parallel write
> operations are in progress.
>
> As part of this, we can provide another strategy, “{*}OptimizedDU{*}”, while
> keeping the old DU.
>
> This solution seems to {*}provide a better approach{*}, as du runs over the
> much smaller metadata, and the space used by container data is computed from
> memory.
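
The calculation described in Solution 2 can be sketched roughly as follows. This is only an illustration and not the actual Ozone implementation: the class `OptimizedDuSketch`, its method names, and the idea of passing container sizes as a plain collection are all assumptions made for the example. It also sums apparent file sizes rather than allocated disk blocks, so it would not match the output of `du` exactly.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Collection;

// Hypothetical sketch; names are illustrative, not Ozone classes.
final class OptimizedDuSketch {

  private OptimizedDuSketch() { }

  // Sums the sizes of regular files under root, skipping the whole
  // subtree rooted at excluded (the container directory).
  static long duExcluding(Path root, Path excluded) throws IOException {
    final long[] total = {0L};
    final Path skip = excluded.toAbsolutePath().normalize();
    Files.walkFileTree(root.toAbsolutePath().normalize(),
        new SimpleFileVisitor<Path>() {
          @Override
          public FileVisitResult preVisitDirectory(Path dir,
              BasicFileAttributes attrs) {
            return dir.equals(skip)
                ? FileVisitResult.SKIP_SUBTREE
                : FileVisitResult.CONTINUE;
          }

          @Override
          public FileVisitResult visitFile(Path file,
              BasicFileAttributes attrs) {
            if (attrs.isRegularFile()) {
              total[0] += attrs.size(); // apparent size, not disk blocks
            }
            return FileVisitResult.CONTINUE;
          }

          @Override
          public FileVisitResult visitFileFailed(Path file, IOException e) {
            // Skip unreadable entries instead of aborting the walk.
            return FileVisitResult.CONTINUE;
          }
        });
    return total[0];
  }

  // Used space = in-memory container sizes + walk over everything
  // on the volume except the container directory.
  static long usedSpace(Collection<Long> containerSizes, Path volumeRoot,
      Path containerDir) throws IOException {
    long inMemory = 0L;
    for (long size : containerSizes) {
      inMemory += size;
    }
    return inMemory + duExcluding(volumeRoot, containerDir);
  }
}
```

Under these assumptions the expensive walk only touches the metadata portion of the volume, while the bulk of the data is accounted for from sizes that are already held in memory.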