[ https://issues.apache.org/jira/browse/HDDS-12090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17952573#comment-17952573 ]
Swaminathan Balachandran edited comment on HDDS-12090 at 5/19/25 12:00 PM:
---------------------------------------------------------------------------

Such problems do exist in the system: we pause all background services, but that alone is not enough. While a RocksDB instance is open, a RocksDB WAL flush or any other background operation can still occur, which could produce an inconsistent snapshot RocksDB after a checkpoint is created. We eventually want to get rid of the bootstrap lock and instead rely on a central snapshot cache lock, which would prevent any RocksDB from being opened while a bootstrap copy batch is running.

was (Author: swamirishi):
Such problems do exist in the system: we pause all background services, but that alone is not enough. While a RocksDB instance is open, a RocksDB WAL flush or any other background operation can still occur, which could produce an inconsistent snapshot. We eventually want to get rid of the bootstrap lock and instead rely on a central snapshot cache lock, which would prevent any RocksDB from being opened while a bootstrap copy batch is running.

> Fix Snapshot Bootstrapping race condition to prevent snapshot corruption
> ------------------------------------------------------------------------
>
>                 Key: HDDS-12090
>                 URL: https://issues.apache.org/jira/browse/HDDS-12090
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Swaminathan Balachandran
>            Assignee: Swaminathan Balachandran
>            Priority: Major
>
> The existing bootstrapping logic has an issue when dealing with
> snapshotted OM RocksDB instances. No locks are taken while bootstrapping,
> so the bootstrap runs alongside active transactions on the snapshot
> RocksDB, which could leave a corrupted RocksDB instance on the follower OM
> after the bootstrap.
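
The central snapshot cache lock proposed in the comment could be sketched roughly as below. This is only an illustration of the idea, not Ozone code: the class `SnapshotCacheLockSketch` and its methods are hypothetical names, and a plain `java.util.concurrent` read/write lock is assumed in place of whatever lock the actual fix uses. Opening or using a snapshot RocksDB takes the shared side of the lock, and a bootstrap copy batch takes the exclusive side, so no snapshot DB can be opened while a batch is being copied.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

/** Hypothetical sketch of a central snapshot cache lock; not the Ozone implementation. */
final class SnapshotCacheLockSketch {
  // Fair mode so a waiting bootstrap batch is not starved by a stream of readers.
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);

  /** Opening or using a snapshot RocksDB holds the shared (read) side of the lock. */
  <T> T withSnapshotDbOpen(Supplier<T> useDb) {
    lock.readLock().lock();
    try {
      return useDb.get();
    } finally {
      lock.readLock().unlock();
    }
  }

  /** A bootstrap copy batch holds the exclusive (write) side of the lock. */
  void runBootstrapCopyBatch(Runnable copyBatch) {
    lock.writeLock().lock();
    try {
      copyBatch.run();
    } finally {
      lock.writeLock().unlock();
    }
  }

  /** Non-blocking probe: true if a snapshot DB could be opened right now. */
  boolean canOpenSnapshotDbNow() {
    if (lock.readLock().tryLock()) {
      lock.readLock().unlock();
      return true;
    }
    return false;
  }
}
```

With this shape, several snapshot DBs can be open concurrently, while a copy batch waits for them to be released and then blocks new opens from other threads until it finishes. Note that `ReentrantReadWriteLock` lets the thread holding the write lock also acquire the read lock, so the exclusion applies only across threads.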
--
This message was sent by Atlassian Jira (v8.20.10#820010)