[ https://issues.apache.org/jira/browse/HDDS-12982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chia-Chuan Yu reassigned HDDS-12982:
------------------------------------

    Assignee: Chia-Chuan Yu

> [Snapshot] Tone down error "Snapshot validation failed"
> -------------------------------------------------------
>
>                 Key: HDDS-12982
>                 URL: https://issues.apache.org/jira/browse/HDDS-12982
>             Project: Apache Ozone
>          Issue Type: Sub-task
>          Components: Snapshot
>    Affects Versions: 2.0.0
>            Reporter: Wei-Chiu Chuang
>            Assignee: Chia-Chuan Yu
>            Priority: Major
>
> We sometimes observe this error
> {noformat}
> 2024-12-02 18:47:09,542 ERROR [OM StateMachine ApplyTransaction Thread - 0]-org.apache.hadoop.ozone.om.request.key.OMKeyPurgeRequest: Error occurred while performing OmKeyPurge.
> INVALID_REQUEST org.apache.hadoop.ozone.om.exceptions.OMException: Snapshot validation failed. Expected previous snapshotId : null but was e639ca0c-73a1-4a60-8da5-c21ed9634210
>     at org.apache.hadoop.ozone.om.snapshot.SnapshotUtils.validatePreviousSnapshotId(SnapshotUtils.java:303)
>     at org.apache.hadoop.ozone.om.request.key.OMKeyPurgeRequest.validateAndUpdateCache(OMKeyPurgeRequest.java:83)
>     at org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequest(OzoneManagerRequestHandler.java:378)
>     at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.runCommand(OzoneManagerStateMachine.java:560)
>     at org.apache.hadoop.ozone.om.ratis.OzoneManagerStateMachine.lambda$1(OzoneManagerStateMachine.java:353)
>     at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:834)
> {noformat}
> According to the analysis made by [~swamirishi] and [~sshenoy] this exception is acceptable.
> bq. This is kind of a race condition with snapshotCreate and keyPurge happening on the background service. Purge operation takes a lesser priority over snapshot create. So this kind of error can be ignored.
> I am aware we'll be implementing locking for the snapshot operations so race condition shouldn't happen after that. I'd suggest to reduce the log level from ERROR to WARN.
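
For illustration only, the suggested change (log the benign snapshot-validation race at WARN instead of ERROR) could look roughly like the sketch below. This is not the Ozone code. SketchOMException, PurgeLogLevelSketch, levelFor and logPurgeFailure are hypothetical names made up for this example, and matching on the message prefix is an assumption rather than how the real fix identifies the case. The sketch uses java.util.logging to stay self-contained, so WARNING and SEVERE stand in for the WARN and ERROR levels seen in the log above.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for OMException; NOT the real Ozone class. */
class SketchOMException extends Exception {
    enum ResultCode { INVALID_REQUEST, INTERNAL_ERROR }

    private final ResultCode result;

    SketchOMException(String message, ResultCode result) {
        super(message);
        this.result = result;
    }

    ResultCode getResult() {
        return result;
    }
}

/** Hypothetical helper that picks a log level for a failed key purge. */
class PurgeLogLevelSketch {
    private static final Logger LOG =
        Logger.getLogger(PurgeLogLevelSketch.class.getName());

    // Assumption: the benign race is recognised by result code plus message prefix.
    private static final String RACE_PREFIX = "Snapshot validation failed";

    /** Returns WARNING for the expected snapshot race, SEVERE for anything else. */
    static Level levelFor(Exception e) {
        if (e instanceof SketchOMException) {
            SketchOMException ome = (SketchOMException) e;
            boolean invalidRequest =
                ome.getResult() == SketchOMException.ResultCode.INVALID_REQUEST;
            String msg = ome.getMessage();
            if (invalidRequest && msg != null && msg.startsWith(RACE_PREFIX)) {
                return Level.WARNING;
            }
        }
        return Level.SEVERE;
    }

    /** Logs the purge failure at the level chosen by levelFor. */
    static void logPurgeFailure(Exception e) {
        LOG.log(levelFor(e), "Error occurred while performing OmKeyPurge.", e);
    }

    public static void main(String[] args) {
        // The race reported in this issue: logged at WARNING in this sketch.
        logPurgeFailure(new SketchOMException(
            "Snapshot validation failed. Expected previous snapshotId : null"
                + " but was e639ca0c-73a1-4a60-8da5-c21ed9634210",
            SketchOMException.ResultCode.INVALID_REQUEST));
        // Any other failure keeps the higher severity.
        logPurgeFailure(new IllegalStateException("unexpected failure"));
    }
}
```

Only the one known-benign case is downgraded here, so any other purge failure would still surface at the higher level.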