Saketa Chalamchala created HDDS-14719:
-----------------------------------------

             Summary: Open HA checkpoint RocksDB as read-only to prevent 
unnecessary compaction during txn-info verification
                 Key: HDDS-14719
                 URL: https://issues.apache.org/jira/browse/HDDS-14719
             Project: Apache Ozone
          Issue Type: Bug
          Components: OM HA, SCM HA, Snapshot
            Reporter: Saketa Chalamchala
            Assignee: Saketa Chalamchala


Discovered while testing checkpoint installation from the leader OM.
When the checkpoint downloaded from the leader is opened to verify that its
txn info is greater than the txn info of the OM DB on the target OM,
RocksDB can run auto-compaction inside the checkpoint directory
(om.db.candidate/om.db) and then delete SSTs. As a result, the follower's
installed DB file set differs from the leader checkpoint. In OM's case, the
untracked compaction can also cause snapdiff to perform poorly, because an
efficient diff is not possible when the compaction history is not preserved.
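
The txn-info check itself is a pure comparison and needs no write access to the
checkpoint. A minimal sketch of that comparison follows, assuming txn info is a
(term, index) pair ordered by term first and then index. The TxnInfo class and
the checkpointIsNewer helper are hypothetical names for illustration, not the
Ozone classes.

```java
import java.util.Comparator;

public class TxnInfoCheck {
    /** Hypothetical stand-in for a DB's txn info: a (term, index) pair. */
    static final class TxnInfo implements Comparable<TxnInfo> {
        // Assumption: order by term first, then by index within the same term.
        private static final Comparator<TxnInfo> ORDER =
            Comparator.comparingLong((TxnInfo t) -> t.term)
                      .thenComparingLong(t -> t.index);

        final long term;
        final long index;

        TxnInfo(long term, long index) {
            this.term = term;
            this.index = index;
        }

        @Override
        public int compareTo(TxnInfo other) {
            return ORDER.compare(this, other);
        }
    }

    /** Returns true only when the checkpoint is strictly ahead of the local DB. */
    static boolean checkpointIsNewer(TxnInfo checkpoint, TxnInfo local) {
        return checkpoint.compareTo(local) > 0;
    }

    public static void main(String[] args) {
        TxnInfo local = new TxnInfo(3, 100);
        // Same term, higher index: the checkpoint is ahead.
        System.out.println(checkpointIsNewer(new TxnInfo(3, 150), local));
        // Identical txn info: nothing new to install.
        System.out.println(checkpointIsNewer(new TxnInfo(3, 100), local));
    }
}
```

Because the check only reads two values and compares them, nothing in it
requires the checkpoint DB to be opened in a mode that allows background
compaction.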

In general, it would be best to open RocksDB in temporary checkpoint locations
in read-only mode when performing read-only operations (such as txn info
validation).

om.db after replacing it with the installed checkpoint:
{code:java}
[/var/lib/hadoop-ozone/om/data520340/om.db]# ls -al
...
-rw-r--r-- 1 hdfs hdfs 177162 Feb 24 15:24 LOG
-rw-r--r-- 1 hdfs hdfs 122650 Feb 24 15:17 LOG.old.1771975134401496
-rw-r--r-- 1 hdfs hdfs 195175 Feb 24 15:23 LOG.old.1771975455672651

[/var/lib/hadoop-ozone/om/data520340/om.db]# vim LOG.old.1771975134401496
...
2026/02/24-15:17:45.977142 140620264883968 [/compaction/compaction_job.cc:1950] [deletedTable] [JOB 2] Compacting 4@0 + 1@6 files to L6, score 1.00
2026/02/24-15:17:45.977152 140620264883968 [/compaction/compaction_job.cc:1954] [deletedTable] Compaction start summary: Base version 26 Base level 0, inputs: [166(1110B) 158(3197B) 151(7595B) 144(3777B)], [130(11KB)]
...
2026/02/24-15:17:45.989272 140620264883968 [le/delete_scheduler.cc:77] Deleted file /var/lib/hadoop-ozone/om/ozone-metadata520340/snapshot/om.db.candidate/om.db/000166.sst immediately, rate_bytes_per_sec 0, total_trash_size 0 max_trash_db_ratio 0.250000
2026/02/24-15:17:45.989295 140620264883968 [le/delete_scheduler.cc:77] Deleted file /var/lib/hadoop-ozone/om/ozone-metadata520340/snapshot/om.db.candidate/om.db/000158.sst immediately, rate_bytes_per_sec 0, total_trash_size 0 max_trash_db_ratio 0.250000
...
{code}
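
SST deletions like the ones logged above could be caught by comparing the
checkpoint directory's SST file names before and after the validation step. The
following is only a sketch of such a check using the JDK alone. The class and
method names are hypothetical, and the deletion in main is simulated rather
than produced by RocksDB.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Stream;

public class SstFileSetCheck {
    /** Collects the names of all *.sst files directly under dbDir. */
    static Set<String> sstFiles(Path dbDir) throws IOException {
        try (Stream<Path> entries = Files.list(dbDir)) {
            Set<String> names = new TreeSet<>();
            entries.map(p -> p.getFileName().toString())
                   .filter(n -> n.endsWith(".sst"))
                   .forEach(names::add);
            return names;
        }
    }

    /** Returns the files present in 'before' that are absent from 'after'. */
    static Set<String> missingAfter(Set<String> before, Set<String> after) {
        Set<String> missing = new TreeSet<>(before);
        missing.removeAll(after);
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // Temporary directory standing in for a checkpoint DB directory.
        Path dir = Files.createTempDirectory("om.db.candidate");
        Files.createFile(dir.resolve("000158.sst"));
        Files.createFile(dir.resolve("000166.sst"));
        Files.createFile(dir.resolve("LOG"));

        Set<String> before = sstFiles(dir);
        // Simulated side effect of a read-write open (not a real compaction).
        Files.delete(dir.resolve("000166.sst"));
        Set<String> after = sstFiles(dir);

        // A non-empty result means the validation step mutated the checkpoint.
        System.out.println(missingAfter(before, after));
    }
}
```

A check of this kind only detects the problem after the fact. Opening the
checkpoint read-only, as proposed in this issue, avoids the mutation in the
first place.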



