Copilot commented on code in PR #8496:
URL: https://github.com/apache/ozone/pull/8496#discussion_r2101167396
##########
hadoop-hdds/docs/content/feature/Snapshot.md:
##########
@@ -23,55 +23,260 @@ summary: Ozone Snapshot
 limitations under the License.
 -->
-## Introduction
+## **Introduction**
-Snapshot feature for Apache Ozone object store allows users to take point-in-time consistent image of a given bucket. Snapshot feature enables various use cases, including:
- * Backup and Restore: Create hourly, daily, weekly, monthly snapshots for backup and recovery when needed.
- * Archival and Compliance: Take snapshots for compliance purpose and archive them as required.
- * Replication and Disaster Recovery (DR): Snapshots provide frozen immutable images of the bucket on the source Ozone cluster. Snapshots can be used for replicating these immutable bucket images to remote DR sites.
- * Incremental Replication: DistCp with SnapshotDiff offers an efficient way to incrementally sync up source and destination buckets.
+Snapshot feature for Apache Ozone object store allows users to take a point-in-time consistent image of a given bucket. The snapshot is a read-only, frozen image of the bucket’s state at the time of creation. Snapshot feature enables various use cases, including:
-## Snapshot APIs
+* **Backup and Restore** – Create hourly, daily, weekly, monthly snapshots for backup and recovery when needed.
+* **Archival and Compliance** – Take snapshots for compliance purposes and archive them as required.
+* **Replication and Disaster Recovery (DR)** – Snapshots provide frozen, immutable images of the bucket on the source Ozone cluster. These can be used for replicating bucket images to remote DR sites.
+* **Incremental Replication** – DistCp with SnapshotDiff offers an efficient way to incrementally sync up source and destination buckets.
-Snapshot feature is available through 'ozone fs' and 'ozone sh' CLI. This feature can also be programmatically accessed from Ozone `ObjectStore` Java client. The feature provides following functionalities:
-* Create Snapshot: Create an instantaneous snapshot for a given bucket
+## **Architecture**
+
+Ozone Snapshot architecture leverages the immutability of data blocks in Ozone. Data blocks, once written, remain immutable for their lifetime and are only reclaimed when the corresponding key metadata is removed from the namespace. All Ozone metadata (volume, bucket, keys, directories) is stored in the Ozone Manager (OM) metadata store (RocksDB). When a user takes a snapshot of a bucket, the system internally creates a point-in-time copy of the bucket’s namespace metadata on the OM. Since Ozone doesn’t allow in-place updates to DataNode blocks, the integrity of data referenced by the snapshot is preserved. The OM’s key deletion service is aware of snapshots: it will not permanently delete any key as long as that key is still referenced by the active bucket or any existing snapshot. When snapshots are deleted, a background SnapshotDeletingService and garbage collector identify keys that are no longer referenced by any snapshot or the live bucket, and reclaim those blocks.
+
+Ozone also provides a SnapshotDiff feature. When a user issues a SnapshotDiff between two snapshots, the OM efficiently computes all the differences (added, deleted, modified, or renamed keys) between the two snapshots and returns a paginated list of changes. Snapshot diff results are cached to speed up subsequent requests for the same snapshot pair.
+
+## **System Architecture Deep Dive**
+
+Internally, Ozone implements snapshots by **versioning the OM metadata for each bucket** snapshot. The OM maintains a snapshot metadata table that records the state of the bucket’s key directory tree at the moment of snapshot creation. No data is physically copied at snapshot creation – the operation simply marks a consistent snapshot of the OM’s RocksDB state (hence snapshots are created instantaneously). Under the hood, Ozone relies on RocksDB’s abilities (like snapshot and column family cloning) to preserve point-in-time views of the metadata. Each snapshot is identified by a unique ID and name, and each key entry in the OM DB carries information about which snapshots (if any) it belongs to. This approach ensures that **common data is not duplicated** across snapshots: if a key has not changed between two snapshots, both snapshots reference the same underlying data blocks.
+
+When keys are modified or deleted in the active bucket, Ozone checks for snapshots: if a snapshot exists that references the old version, the key’s data blocks are retained until the snapshot is deleted. The reference counting in metadata ensures that deleting a snapshot will mark any keys that were exclusively held by that snapshot as reclaimable, triggering block cleanup in the background.
+
+**SnapshotDiff Implementation:** Ozone computes snapshot diffs efficiently by leveraging RocksDB key range comparisons and a directed acyclic graph (DAG) of compaction history. For a configurable time window after snapshot creation (by default 30 days), OM maintains a *compaction DAG* that allows computing diffs in time proportional to the number of changed keys. If snapshots are relatively recent, OM can determine differences by examining only the metadata changes captured in the DAG (e.g., RocksDB SST files differences) rather than scanning the entire key space. The parameter `ozone.om.snapshot.compaction.dag.max.time.allowed` controls this window (default 30 days). For older snapshots beyond this window (or if the efficient diff data has been compacted away), OM falls back to a full metadata scan to compute the diff. In the worst case, the cost of diff is proportional to iterating over all keys in the bucket (if nearly the entire namespace changed or if using full scan). The SnapshotDiff results (a list of keys with indicators for created (`+`), deleted (`-`), modified (`M`), or renamed (`R`)) are stored in a temporary on-disk cache so that subsequent requests for the same diff can be served quickly without re-computation.
+
+**Snapshot Data Storage:** Snapshot metadata resides on the OM in the same RocksDB as the live metadata, but separated by snapshot-specific prefixes or tables. The OM persists snapshots such that each snapshot’s metadata can be treated as a read-only view. Additionally, OM stores snapshot-related info such as the mapping of snapshot names to snapshot IDs and the list of snapshot diff jobs. By default, temporary data for snapshot diff computations is stored under the OM metadata directory, but this location can be configured via `ozone.om.snapshot.diff.db.dir` (a dedicated directory for snapshot diff scratch space).
+
+For a more in-depth discussion of the snapshot design and its evolution, refer to Prashant Pogde’s introduction of Apache Ozone snapshots (the first in a series of blog posts). This Medium post covers the motivation and high-level design of Ozone snapshots, and subsequent posts delve further into the technical implementation.
+
+## **User Tutorial**
+
+In this section, we demonstrate how to create and use Ozone snapshots via command-line and programmatically.
+
+### **Using Snapshots via CLI**
+
+The Ozone shell provides convenient commands to manage snapshots. Snapshots can be created and manipulated either through the **`ozone sh`** subcommands or the **`ozone fs`** Hadoop-compatible filesystem commands:
+
+* **Creating a Snapshot:** As shown earlier, use `ozone sh snapshot create <bucket> [snapshotName]` to create a snapshot. For example, to snapshot a bucket named `bucket1` in volume `vol1` with an optional name:
 ```shell
-ozone sh snapshot create [-hV] <bucket> [<snapshotName>]
+ozone sh snapshot create /vol1/bucket1 finance_backup_2025
 ```
-* List Snapshots: List all snapshots of a given bucket
+ This captures an instantaneous image of all keys under `/vol1/bucket1`. The operation requires you to be the bucket owner or volume owner (admin privilege). If no snapshot name is provided, Ozone will auto-generate a name (often using a timestamp).

Review Comment:
   [nitpick] Consider clarifying the format or naming convention used when Ozone auto-generates a snapshot name to help users know what to expect.

##########
hadoop-hdds/docs/content/feature/Snapshot.md:
##########
@@ -23,55 +23,260 @@ summary: Ozone Snapshot
 limitations under the License.
 -->
-## Introduction
+## **Introduction**

Review Comment:
   [nitpick] The heading style is inconsistent; some headings use bold formatting (e.g., '## **Introduction**') while others do not. Consider using a uniform style throughout the document for improved readability.
   ```suggestion
   ## Introduction
   ```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
