[ https://issues.apache.org/jira/browse/HDDS-13003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang reassigned HDDS-13003:
--------------------------------------

    Assignee: Siyao Meng  (was: Swaminathan Balachandran)

> Snapshot Defragmentation to reduce storage footprint
> ----------------------------------------------------
>
>                 Key: HDDS-13003
>                 URL: https://issues.apache.org/jira/browse/HDDS-13003
>             Project: Apache Ozone
>          Issue Type: New Feature
>          Components: Ozone Manager
>            Reporter: Swaminathan Balachandran
>            Assignee: Siyao Meng
>            Priority: Major
>              Labels: pull-request-available
>
> In Apache Ozone, creating a snapshot takes a checkpoint of the Active Object 
> Store (AOS) RocksDB, and Ozone then tracks the compaction of SST files over 
> time. This model is efficient when snapshots are short-lived, because a fresh 
> checkpoint consists of hard links to the AOS RocksDB SST files. However, 
> if an older snapshot persists while significant churn occurs in 
> the AOS RocksDB (due to compactions and writes), the snapshot RocksDB may 
> diverge significantly from both the AOS RocksDB and other snapshot RocksDB 
> instances. As a result, storage requirements grow roughly linearly with the 
> number of snapshots.
> The primary inefficiency in the current snapshotting mechanism stems from 
> constant RocksDB compactions in AOS, which can cause a key, file, or 
> directory entry to appear in multiple SST files. Ideally, each unique key, 
> file, or directory entry should reside in only one SST file, eliminating 
> redundant storage and mitigating the multiplier effect caused by snapshots. 
> If this is achieved, the total RocksDB size would be proportional to the 
> total number of unique keys in the system rather than the number of snapshots.
> ----
> Note: *Snapshot Defragmentation* was previously called *Snapshot Compaction* 
> during development and in the design doc. It was renamed because the *Snapshot 
> Compaction* name is easily confused with *[RocksDB 
> Compaction|https://github.com/facebook/rocksdb/wiki/Compaction]*, which is a 
> different concept.
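
The storage argument in the description above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Ozone code: an SST file is modelled as a hypothetical `(file_id, keys)` pair, a hard link is modelled as two databases referencing the same `file_id`, and a compaction is modelled as rewriting the same keys into a file with a new `file_id`. The helper names `disk_footprint` and `unique_keys` are invented for this illustration.

```python
# Toy model (assumption, not Ozone's implementation): an SST file is a
# (file_id, frozenset_of_keys) pair. Hard-linked files share a file_id,
# so they occupy disk space only once.

def disk_footprint(databases):
    """Count key entries on disk, counting each distinct file once."""
    distinct_files = {}
    for db in databases:
        for file_id, keys in db:
            distinct_files[file_id] = keys
    return sum(len(keys) for keys in distinct_files.values())


def unique_keys(databases):
    """Count distinct keys across all databases."""
    return len({k for db in databases for _, keys in db for k in keys})


keys = frozenset(f"key{i}" for i in range(100))

# A fresh snapshot is a checkpoint: it hard-links the AOS files.
aos = [("sst-1", keys)]
snap1 = list(aos)
fresh = disk_footprint([aos, snap1])

# A compaction in the AOS rewrites the same keys into a new file.
# snap1 still pins the old file, so both copies stay on disk.
aos = [("sst-2", keys)]
snap2 = list(aos)
after_one = disk_footprint([aos, snap1, snap2])

# Another compaction: snap1 and snap2 each pin a different old file.
aos = [("sst-3", keys)]
after_two = disk_footprint([aos, snap1, snap2])

# The ideal described in the issue: footprint tracks unique keys only.
ideal = unique_keys([aos, snap1, snap2])
```

In this model the footprint grows with each retained snapshot even though the set of keys never changes, whereas `ideal` stays constant. That gap is the redundancy the issue proposes to eliminate.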



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
