[ https://issues.apache.org/jira/browse/HDDS-12924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HDDS-12924:
----------------------------------
    Labels: pull-request-available  (was: )

> Optimize DU for used space calculation
> --------------------------------------
>
>                 Key: HDDS-12924
>                 URL: https://issues.apache.org/jira/browse/HDDS-12924
>             Project: Apache Ozone
>          Issue Type: Sub-task
>          Components: Ozone Datanode
>            Reporter: Sumit Agrawal
>            Assignee: Sumit Agrawal
>            Priority: Major
>              Labels: pull-request-available
>
> h3. DU takes time when running over a disk holding tens of TB of container data.
> This will be a slow operation.
>
> *Solution 1:* Run DU over the non-ozone path
> Challenges:
> # The root path of the mounted device is not known, so this needs extra
> configuration.
> # Permission problem: du will not count the space under a path it has no
> permission to read.
>
> Based on the above concerns, it is *not feasible* to run du over the non-ozone path.
>
> *Solution 2:* Run DU over the meta path only (excluding the container dir path)
> Ozone space usage is the sum of:
> # The sum of all container data sizes as held in memory
> # DU over the volume path as it is done currently (excluding the container path)
>
> *Space not counted:*
> * Ozone used size for duplicate containers is not counted (these are ignored
> during startup for the EC case)
> * Ozone used size for corrupted containers that have not been deleted is not counted
> * Container path meta files, such as the container yaml and exported RocksDB
> sst files, are not counted, which might amount to a few GB of data in a large
> cluster.
> This space will be added to the +non-ozone+ used space, and un-accounted
> containers in particular need to be cleaned up.
>
> *Impact of inaccuracy of Used Space:*
> * Used space in reporting (may be off by a few GB, as per above)
> * Adjustment between Ozone available space and Reserved available space
> This inaccuracy does not have much impact on the solution: it is already
> present in the existing approach, because “du” runs asynchronously while
> parallel write operations are in progress.
>
> As part of this, we can provide another strategy, “{*}OptimizedDU{*}”, while
> keeping the old DU.
>
> This solution seems to {*}provide a better approach{*}, as du runs over the
> smaller metadata only, and the space used by container data is computed from
> memory.
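
The Solution 2 calculation described in the issue (in-memory container sizes plus a du-style walk of the volume path that skips the container directory) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Ozone implementation: the class name `OptimizedDuSketch`, the methods `sizeExcluding` and `usedSpace`, and their parameters are all hypothetical, and the walk sums logical file lengths rather than the allocated disk blocks that `du` reports.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Collection;

/** Hypothetical sketch of the "OptimizedDU" idea; all names here are assumptions. */
final class OptimizedDuSketch {

  /**
   * Sums the lengths of regular files under root, skipping the whole subtree
   * rooted at excludedDir (the container directory in the issue's terms).
   * Note: this uses logical file length, not allocated blocks as du does.
   */
  static long sizeExcluding(Path root, Path excludedDir) throws IOException {
    final long[] total = {0L};
    Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
      @Override
      public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
        // Do not descend into the container directory at all.
        return dir.equals(excludedDir)
            ? FileVisitResult.SKIP_SUBTREE
            : FileVisitResult.CONTINUE;
      }

      @Override
      public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
        if (attrs.isRegularFile()) {
          total[0] += attrs.size();
        }
        return FileVisitResult.CONTINUE;
      }

      @Override
      public FileVisitResult visitFileFailed(Path file, IOException exc) {
        // Unreadable entries are skipped, so their space goes uncounted.
        return FileVisitResult.CONTINUE;
      }
    });
    return total[0];
  }

  /**
   * Used space = sum of container data sizes already tracked in memory
   *            + size of everything on the volume outside the container dir.
   */
  static long usedSpace(Collection<Long> containerSizesInMemory,
                        Path volumeRoot,
                        Path containerDir) throws IOException {
    long containerBytes = 0L;
    for (long size : containerSizesInMemory) {
      containerBytes += size;
    }
    return containerBytes + sizeExcluding(volumeRoot, containerDir);
  }
}
```

Because the walk never enters the container directory, its cost depends on the amount of metadata rather than on the tens of TB of container data. Anything under the container directory that is not reflected in the in-memory sizes (duplicate or corrupted containers, container yaml files, exported sst files) is left out of the total, which matches the "space not counted" list in the issue.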