[ https://issues.apache.org/jira/browse/IGNITE-25237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Bessonov updated IGNITE-25237:
-----------------------------------
    Priority: Critical  (was: Major)

> ItMetaStorageWatchTest.testReplayUpdates failed
> -----------------------------------------------
>
>                 Key: IGNITE-25237
>                 URL: https://issues.apache.org/jira/browse/IGNITE-25237
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Ivan Bessonov
>            Assignee: Ivan Bessonov
>            Priority: Critical
>              Labels: ignite-3
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/9081157?logFilter=debug&logView=flowAware]
>
> Root cause:
> {code:java}
> @Override
> public void startWatches(long startRevision, WatchEventHandlingCallback callback) {
>     assert startRevision > 0 : startRevision;
>
>     long currentRevision;
>
>     synchronized (watchProcessorMutex) {
>         watchProcessor.setWatchEventHandlingCallback(callback);
>
>         currentRevision = rev;
> --------------------------------------------------------------------------------
> | Here we already fixed the revision that we will be using for reading from
> | the storage. All updates after it should go into watch processor.
> --------------------------------------------------------------------------------
>         // We update the recovery status under the read lock in order to avoid races between starting watches and applying a snapshot
>         // or concurrent writes. Replay of events can be done outside of the read lock relying on RocksDB snapshot isolation.
>         if (currentRevision == 0) {
>             recoveryStatus.set(RecoveryStatus.DONE);
>         } else {
>             // If revision is not 0, we need to replay updates that match the existing data.
>             recoveryStatus.set(RecoveryStatus.IN_PROGRESS);
>         }
>     }
> {code}
> {code:java}
> private void queueWatchEvent() {
>     if (recoveryStatus.get() == RecoveryStatus.INITIAL) {
> --------------------------------------------------------------------------------
> | Here is the race. "currentRevision = rev" might already be executed, but
> | status is still INITIAL. We have to synchronize the access
> --------------------------------------------------------------------------------
>         // Watches haven't been enabled yet, no need to queue any events, they will be replayed upon recovery.
>         updatedEntries.clear();
>     } else {
>         notifyWatchProcessor(updatedEntries.toNotifyWatchProcessorEvent(rev));
>     }
> }
> {code}
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
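
The race described under "Root cause" can be modeled in isolation. The following is only a minimal sketch of one way to synchronize the access, not the fix that was actually committed for this issue. The class WatchRaceSketch, the method applyUpdate and the list notifiedRevisions are hypothetical names introduced for illustration; only watchProcessorMutex, rev, recoveryStatus and RecoveryStatus mirror identifiers from the snippets above. The sketch assumes that the writer can take the same mutex that startWatches holds, so that "capture the revision and change the status" and "bump the revision and check the status" can never interleave.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of the storage; not the actual Ignite classes. */
final class WatchRaceSketch {
    enum RecoveryStatus { INITIAL, IN_PROGRESS, DONE }

    private final Object watchProcessorMutex = new Object();

    // In this sketch both fields are guarded by watchProcessorMutex.
    private RecoveryStatus recoveryStatus = RecoveryStatus.INITIAL;
    private long rev;

    /** Revisions that were handed to the (hypothetical) watch processor. */
    final List<Long> notifiedRevisions = new ArrayList<>();

    /** Captures the replay revision and changes the status in one critical section. */
    long startWatches() {
        synchronized (watchProcessorMutex) {
            long currentRevision = rev;

            recoveryStatus = currentRevision == 0
                    ? RecoveryStatus.DONE
                    : RecoveryStatus.IN_PROGRESS;

            return currentRevision;
        }
    }

    /** Applies one update and decides about its watch event under the same mutex. */
    void applyUpdate() {
        synchronized (watchProcessorMutex) {
            rev++;

            if (recoveryStatus == RecoveryStatus.INITIAL) {
                // Watches are not started yet. Because startWatches takes the same
                // mutex, it will observe this revision and the replay will cover it.
                return;
            }

            // Watches are started, so this update is past the captured revision
            // and has to reach the watch processor directly.
            notifiedRevisions.add(rev);
        }
    }
}
```

With this arrangement, two calls to applyUpdate followed by startWatches give a replay revision of 2, and a third applyUpdate is delivered to the watch processor as revision 3. In the original code the status check happens without the mutex, so an update could land between "currentRevision = rev" and the status change, be dropped as "will be replayed", and then never be replayed.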