jojochuang commented on code in PR #8496:
URL: https://github.com/apache/ozone/pull/8496#discussion_r2116540843


##########
hadoop-hdds/docs/content/feature/Snapshot.md:
##########
@@ -25,53 +25,270 @@ summary: Ozone Snapshot
 
 ## Introduction
 
-Snapshot feature for Apache Ozone object store allows users to take 
point-in-time consistent image of a given bucket. Snapshot feature enables 
various use cases, including:
- * Backup and Restore: Create hourly, daily, weekly, monthly snapshots for 
backup and recovery when needed.
- * Archival and Compliance: Take snapshots for compliance purpose and archive 
them as required.
- * Replication and Disaster Recovery (DR): Snapshots provide frozen immutable 
images of the bucket on the source Ozone cluster. Snapshots can be used for 
replicating these immutable bucket images to remote DR sites.
- * Incremental Replication: DistCp with SnapshotDiff offers an efficient way 
to incrementally sync up source and destination buckets.
+Snapshot feature for Apache Ozone object store allows users to take a 
point-in-time consistent image of a given bucket. The snapshot is a read-only, 
frozen image of the bucket’s state at the time of snapshot creation. Snapshot 
feature enables various use cases, including:
 
-## Snapshot APIs
+* **Backup and Restore** – Create hourly, daily, weekly, monthly snapshots for 
backup and recovery when needed.
+* **Archival and Compliance** – Take snapshots for compliance purposes and 
archive them as required.
+* **Replication and Disaster Recovery (DR)** – Snapshots provide frozen, 
immutable images of the bucket on the source Ozone cluster. These can be used 
for replicating bucket images to remote DR sites.
+* **Incremental Replication** – DistCp with SnapshotDiff offers an efficient 
way to incrementally sync up source and destination buckets.
 
-Snapshot feature is available through 'ozone fs' and 'ozone sh' CLI. This 
feature can also be programmatically accessed from Ozone `ObjectStore` Java 
client. The feature provides following functionalities:
-* Create Snapshot: Create an instantaneous snapshot for a given bucket
+## Architecture
+
+Ozone Snapshot architecture leverages the immutability of data blocks in 
Ozone. Data blocks, once written, remain immutable for their lifetime and are 
only reclaimed when the corresponding key metadata is removed from the 
namespace. All Ozone metadata (volume, bucket, keys, directories) is stored in 
the Ozone Manager (OM) metadata store (RocksDB). When a user takes a snapshot 
of a bucket, the system internally creates a point-in-time copy of the bucket’s 
namespace metadata on the OM. Since Ozone doesn’t allow in-place updates to 
DataNode blocks, the integrity of data referenced by the snapshot is preserved. 
The OM’s key deletion service is aware of snapshots: it will not permanently 
delete any key as long as that key is still referenced by the active bucket or 
any existing snapshot. A background KeyDeletingService and DirectoryDeleting 
Service (garbage collectors) identify keys that are no longer referenced by any 
snapshot or the live bucket, and reclaim those blocks.
+
+Ozone also provides a SnapshotDiff feature. When a user issues a SnapshotDiff 
between two snapshots, the OM efficiently computes all the differences (added, 
deleted, modified, or renamed keys) between the two snapshots and returns a 
paginated list of changes. Snapshot diff results are cached to speed up 
subsequent requests for the same snapshot pair.
+
+## System Architecture Deep Dive
+
+Internally, Ozone implements snapshots by **versioning the OM metadata for 
each bucket** snapshot. The OM maintains a snapshot metadata table that records 
the state of the bucket’s key directory tree at the moment of snapshot 
creation. No data is physically copied at snapshot creation – the operation 
simply marks a consistent snapshot of the OM’s RocksDB state (hence snapshots 
are created instantaneously). Under the hood, Ozone relies on RocksDB’s 
abilities (like checkpoint) to preserve point-in-time views of the metadata. 
Each snapshot is identified by a unique ID and name, and each key entry in the 
OM DB carries information about which snapshots (if any) it belongs to. This 
approach ensures that **common data is not duplicated** across snapshots: if a 
key has not changed between two snapshots, both snapshots reference the same 
underlying data blocks.

Review Comment:
   Currently we do not allow compaction on snapshots.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

