[ https://issues.apache.org/jira/browse/IGNITE-25237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ivan Bessonov updated IGNITE-25237:
-----------------------------------
Fix Version/s: 3.1
> ItMetaStorageWatchTest.testReplayUpdates failed
> -----------------------------------------------
>
> Key: IGNITE-25237
> URL: https://issues.apache.org/jira/browse/IGNITE-25237
> Project: Ignite
> Issue Type: Bug
> Reporter: Ivan Bessonov
> Assignee: Ivan Bessonov
> Priority: Critical
> Labels: ignite-3
> Fix For: 3.1
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/9081157?logFilter=debug&logView=flowAware]
>
> Root cause:
> {code:java}
> @Override
> public void startWatches(long startRevision, WatchEventHandlingCallback callback) {
>     assert startRevision > 0 : startRevision;
>
>     long currentRevision;
>
>     synchronized (watchProcessorMutex) {
>         watchProcessor.setWatchEventHandlingCallback(callback);
>
>         currentRevision = rev;
>         // [Ticket note] At this point we have already fixed the revision that will be used
>         // for reading from the storage. All updates after it should go into the watch processor.
>
>         // We update the recovery status under the read lock in order to avoid races between
>         // starting watches and applying a snapshot or concurrent writes. Replay of events can
>         // be done outside of the read lock relying on RocksDB snapshot isolation.
>         if (currentRevision == 0) {
>             recoveryStatus.set(RecoveryStatus.DONE);
>         } else {
>             // If revision is not 0, we need to replay updates that match the existing data.
>             recoveryStatus.set(RecoveryStatus.IN_PROGRESS);
>         }
>     } {code}
> {code:java}
> private void queueWatchEvent() {
>     if (recoveryStatus.get() == RecoveryStatus.INITIAL) {
>         // [Ticket note] Here is the race: "currentRevision = rev" might already have been
>         // executed while the status is still INITIAL. This access has to be synchronized.
>
>         // Watches haven't been enabled yet, no need to queue any events,
>         // they will be replayed upon recovery.
>         updatedEntries.clear();
>     } else {
>         notifyWatchProcessor(updatedEntries.toNotifyWatchProcessorEvent(rev));
>     }
> } {code}
>
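The ticket does not show the actual fix, only that the access must be synchronized. As a minimal sketch of one way to close the race, assume the status check in queueWatchEvent() runs under the same mutex that startWatches() holds while it fixes the revision and changes the status. The class below is a hypothetical, heavily simplified model (WatchRecoverySketch, write() and notifiedRevisions() are invented for illustration and are not Ignite APIs). It only demonstrates the invariant: every revision is either covered by the replay or delivered to the watch processor.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model of the storage; not the actual Ignite class. */
class WatchRecoverySketch {
    enum RecoveryStatus { INITIAL, IN_PROGRESS, DONE }

    private final Object watchProcessorMutex = new Object();

    // All fields below are guarded by watchProcessorMutex in this sketch.
    private RecoveryStatus recoveryStatus = RecoveryStatus.INITIAL;
    private long rev;
    private final List<String> updatedEntries = new ArrayList<>();
    private final List<Long> notifiedRevisions = new ArrayList<>();

    /** Fixes the replay revision and changes the status in one critical section. */
    long startWatches() {
        synchronized (watchProcessorMutex) {
            long currentRevision = rev;
            recoveryStatus = currentRevision == 0 ? RecoveryStatus.DONE : RecoveryStatus.IN_PROGRESS;
            return currentRevision;
        }
    }

    /** Assumed write path: bumps the revision and queues the event under the same mutex. */
    void write(String key) {
        synchronized (watchProcessorMutex) {
            rev++;
            updatedEntries.add(key);
            queueWatchEvent();
        }
    }

    // Called with watchProcessorMutex held, so the status cannot change
    // between "currentRevision = rev" and this check.
    private void queueWatchEvent() {
        if (recoveryStatus != RecoveryStatus.INITIAL) {
            // Stand-in for notifyWatchProcessor(...): record the delivered revision.
            notifiedRevisions.add(rev);
        }
        // In INITIAL the events are dropped here and replayed upon recovery.
        updatedEntries.clear();
    }

    List<Long> notifiedRevisions() {
        synchronized (watchProcessorMutex) {
            return new ArrayList<>(notifiedRevisions);
        }
    }

    public static void main(String[] args) {
        WatchRecoverySketch storage = new WatchRecoverySketch();
        storage.write("a");
        long replayUpTo = storage.startWatches();
        storage.write("b");
        System.out.println("replay up to " + replayUpTo + ", notified " + storage.notifiedRevisions());
    }
}
```

With this ordering, a write either completes before startWatches() reads rev and is then covered by the replay, or it runs afterwards, observes a status other than INITIAL, and is sent to the watch processor. Whether the real storage should reuse watchProcessorMutex or its existing read/write lock for this is a design decision the ticket leaves open.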
--
This message was sent by Atlassian Jira
(v8.20.10#820010)