Ethan Rose created HDDS-11743:
---------------------------------
Summary: Provide debug/repair commands for OM DB import/export
Key: HDDS-11743
URL: https://issues.apache.org/jira/browse/HDDS-11743
Project: Apache Ozone
Issue Type: Sub-task
Components: Ozone Manager
Reporter: Ethan Rose
Sometimes, if an OM follower gets into a bad state, it can be faster to copy the
leader's DB over and restart the follower than to depend on Ratis to catch it up.
With the introduction of filesystem snapshots, however, manually copying the OM
DB has become much more complicated. The goal of this Jira is to provide CLI
commands that automate all parts of the DB import/export process except the
network copy. The flow would look something like this:
# {{ozone debug om export-db --db=<om-db-location>}} would create a tarball of
the current OM DB and its snapshots.
** Tar should preserve hardlinks by default, so linked files take no extra space.
** An optional {{--output}} option can be used to write the tarball elsewhere if
the current disk does not have enough space.
# DB tarball is manually copied over the network to the follower OM node. This
removes all auth work from the CLI.
# {{ozone repair om import-db --db=<source-db-tarball>
--destination=<om-db-dir>}} would take the tarball and unpack it to the
configured directory.
** The CLI could fail if an OM DB already exists at the destination. In this case
it should instruct the user to manually move the existing DB to a backup location.
** The {{ozone repair}} command already has a warning about running as a user
with the correct permissions, which should prevent the errors with unreadable
SST files that we have seen in the past.
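
The export and import steps above can be sketched in a few lines. This is only an illustration of the intended behavior, written in Python rather than as part of the Ozone CLI. The function names {{export_db}} and {{import_db}} are hypothetical, and the sketch assumes the DB and its snapshots live under a single directory whose shared files are hardlinked.

```python
import os
import tarfile
from pathlib import Path


def export_db(db_dir: str, output_tar: str) -> None:
    """Pack db_dir (hypothetical layout: DB plus snapshot dirs) into a tarball."""
    # With the default dereference=False, tarfile stores the second and later
    # paths to the same inode as hard-link entries, so data is written once.
    with tarfile.open(output_tar, "w") as tar:
        tar.add(db_dir, arcname=".")


def import_db(tarball: str, destination: str) -> None:
    """Unpack the tarball into destination, refusing to overwrite an existing DB."""
    dest = Path(destination)
    if dest.exists() and any(dest.iterdir()):
        # Mirrors the proposed behavior: fail and let the operator back it up.
        raise FileExistsError(
            f"{dest} is not empty; move the existing DB to a backup location first"
        )
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tarball, "r") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: use the safe extraction filter explicitly.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
```

Hard-link entries are recreated with {{os.link}} on extraction, so files shared between the DB and its snapshots stay shared on the follower. The network copy between the two calls is left to the operator, as in step 2.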
--
This message was sent by Atlassian Jira
(v8.20.10#820010)