Saketa Chalamchala created HDDS-14719:
-----------------------------------------
Summary: Open HA checkpoint RocksDB as read-only to prevent
unnecessary compaction during txn-info verification
Key: HDDS-14719
URL: https://issues.apache.org/jira/browse/HDDS-14719
Project: Apache Ozone
Issue Type: Bug
Components: OM HA, SCM HA, Snapshot
Reporter: Saketa Chalamchala
Assignee: Saketa Chalamchala
Discovered while testing checkpoint installation from the leader OM.
When the checkpoint downloaded from the leader is opened on the target OM to
verify that the checkpoint's txn info is greater than the OM DB's txn info,
RocksDB can run auto-compaction inside the checkpoint directory
(om.db.candidate/om.db) and then delete SSTs. As a result, the follower's
installed DB file set differs from the leader checkpoint. In OM's case, the
untracked compaction can also make snapdiff perform poorly, because an
efficient diff is not possible when the compaction history is not preserved.
In general, it would be best to open RocksDB in read-only mode in temporary
checkpoint locations when only read operations (such as txn info validation)
are performed.
Directory listing and RocksDB LOG of om.db after it was replaced with the
installed checkpoint:
{code:java}
[/var/lib/hadoop-ozone/om/data520340/om.db]# ls -al
...
-rw-r--r-- 1 hdfs hdfs 177162 Feb 24 15:24 LOG
-rw-r--r-- 1 hdfs hdfs 122650 Feb 24 15:17 LOG.old.1771975134401496
-rw-r--r-- 1 hdfs hdfs 195175 Feb 24 15:23 LOG.old.1771975455672651
[/var/lib/hadoop-ozone/om/data520340/om.db]# vim LOG.old.1771975134401496
...
2026/02/24-15:17:45.977142 140620264883968 [/compaction/compaction_job.cc:1950]
[deletedTable] [JOB 2] Compacting 4@0 + 1@6 files to L6, score 1.00
2026/02/24-15:17:45.977152 140620264883968 [/compaction/compaction_job.cc:1954]
[deletedTable] Compaction start summary: Base version 26 Base level 0, inputs:
[166(1110B) 158(3197B) 151(7595B) 144(3777B)], [130(11KB)]
...
2026/02/24-15:17:45.989272 140620264883968 [le/delete_scheduler.cc:77] Deleted
file
/var/lib/hadoop-ozone/om/ozone-metadata520340/snapshot/om.db.candidate/om.db/000166.sst
immediately, rate_bytes_per_sec 0, total_trash_size 0 max_trash_db_ratio
0.250000
2026/02/24-15:17:45.989295 140620264883968 [le/delete_scheduler.cc:77] Deleted
file
/var/lib/hadoop-ozone/om/ozone-metadata520340/snapshot/om.db.candidate/om.db/000158.sst
immediately, rate_bytes_per_sec 0, total_trash_size 0 max_trash_db_ratio
0.250000
...
{code}
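One way to check for the effect seen in the log above is to compare the set of SST files in the candidate directory before and after the txn-info verification step. The following is a minimal sketch using only the Java standard library. The class name SstSetDiff, its helper methods, and the temporary directory with empty placeholder files are hypothetical; the deletions are simulated and nothing here is Ozone or RocksDB code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Stream;

public class SstSetDiff {

  /** Returns the sorted names of all *.sst files directly under dbDir. */
  public static Set<String> sstNames(Path dbDir) throws IOException {
    try (Stream<Path> files = Files.list(dbDir)) {
      Set<String> names = new TreeSet<>();
      files.map(p -> p.getFileName().toString())
          .filter(n -> n.endsWith(".sst"))
          .forEach(names::add);
      return names;
    }
  }

  /** Returns the names present in 'before' but absent from 'after'. */
  public static Set<String> missing(Set<String> before, Set<String> after) {
    Set<String> gone = new TreeSet<>(before);
    gone.removeAll(after);
    return gone;
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical stand-in for om.db.candidate/om.db, filled with empty files.
    Path dir = Files.createTempDirectory("om.db.candidate");
    Files.createFile(dir.resolve("000130.sst"));
    Files.createFile(dir.resolve("000158.sst"));
    Files.createFile(dir.resolve("000166.sst"));
    Files.createFile(dir.resolve("LOG")); // non-SST files are ignored

    Set<String> before = sstNames(dir);

    // Simulate what a read-write open allowed: compaction deleting SSTs.
    Files.delete(dir.resolve("000158.sst"));
    Files.delete(dir.resolve("000166.sst"));

    Set<String> after = sstNames(dir);
    System.out.println("SSTs removed during verification: " + missing(before, after));
  }
}
```

With a read-only open, the expectation is that the reported set stays empty, since the verification step would no longer be able to compact or delete files in the candidate directory.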