[ https://issues.apache.org/jira/browse/IGNITE-25464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Roman Puchkovskiy updated IGNITE-25464:
---------------------------------------
    Description: 
TxCleanupRequestHandler sends WriteIntentSwitchReplicaRequest to the replica. If the replica responds synchronously (not via a Raft future), a ClassCastException occurs, because TxCleanupRequestHandler expects a ReplicaResult with a WriteIntentSwitchReplicatedInfo inside, but instead gets a WriteIntentSwitchReplicatedInfo that is not wrapped in a ReplicaResult.

When TxCleanupRequestHandler gets an exception of any kind, it retries the WI switch request. Because the first request actually succeeded, the retry results in 2 WI switches.

The original exception is returned in the response to TxCleanupRequestSender, but the sender ignores it and treats any response as an indication of success.

To sum up:
# We need a test that verifies that, in different scenarios (only reads from a partition in an RW transaction, writes to a partition), just 1 WI switch is made on transaction finish
# The ClassCastException should be eliminated
# If an exception happens, it should not be silently swallowed. We probably need to log it at WARN, indicating that this is just a cleanup problem and that it does not influence the transaction outcome

  was:
TxCleanupRequestHandler sends WriteIntentSwitchReplicaRequest to the replica. If the replica responds synchronously (not via a Raft future), a ClassCastException occurs, because TxCleanupRequestHandler expects a ReplicaResult with a WriteIntentSwitchReplicatedInfo inside, but instead gets a WriteIntentSwitchReplicatedInfo that is not wrapped in a ReplicaResult.

> Double write intent switch
> --------------------------
>
>                 Key: IGNITE-25464
>                 URL: https://issues.apache.org/jira/browse/IGNITE-25464
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Roman Puchkovskiy
>            Assignee: Roman Puchkovskiy
>            Priority: Major
>
> TxCleanupRequestHandler sends WriteIntentSwitchReplicaRequest to the replica.
> If the replica responds synchronously (not via a Raft future), a
> ClassCastException occurs, because TxCleanupRequestHandler expects a
> ReplicaResult with a WriteIntentSwitchReplicatedInfo inside, but instead
> gets a WriteIntentSwitchReplicatedInfo that is not wrapped in a
> ReplicaResult.
> When TxCleanupRequestHandler gets an exception of any kind, it retries the WI
> switch request. Because the first request actually succeeded, the retry
> results in 2 WI switches.
> The original exception is returned in the response to TxCleanupRequestSender,
> but the sender ignores it and treats any response as an indication of success.
> To sum up:
> # We need a test that verifies that, in different scenarios (only reads from
> a partition in an RW transaction, writes to a partition), just 1 WI switch is
> made on transaction finish
> # The ClassCastException should be eliminated
> # If an exception happens, it should not be silently swallowed. We probably
> need to log it at WARN, indicating that this is just a cleanup problem and
> that it does not influence the transaction outcome



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
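
The three points in the summary can be illustrated with a minimal, self-contained sketch. It is not Ignite code: SwitchInfo, Wrapped, sendSwitch, unwrap, cleanup and switchCount are hypothetical stand-ins (for WriteIntentSwitchReplicatedInfo, ReplicaResult and the handler logic), and java.util.logging is used only to keep the example runnable. The idea is to accept both response shapes instead of casting blindly, to log any failure at WARN rather than swallowing it, and to count switch requests so that a test can assert that exactly 1 was made.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative sketch only: every type here is a hypothetical stand-in, not Ignite code.
class WriteIntentSwitchSketch {
    /** Stand-in for WriteIntentSwitchReplicatedInfo. */
    static final class SwitchInfo {
        final String txId;

        SwitchInfo(String txId) {
            this.txId = txId;
        }
    }

    /** Stand-in for ReplicaResult: a wrapper around the actual payload. */
    static final class Wrapped {
        final Object payload;

        Wrapped(Object payload) {
            this.payload = payload;
        }
    }

    static final Logger LOG = Logger.getLogger(WriteIntentSwitchSketch.class.getName());

    /** Counts WI switch requests that reached the fake replica. */
    static final AtomicInteger switchCount = new AtomicInteger();

    /** Fake replica: responds wrapped (as if via Raft) or unwrapped (synchronous path). */
    static CompletableFuture<Object> sendSwitch(String txId, boolean wrapped) {
        switchCount.incrementAndGet();
        SwitchInfo info = new SwitchInfo(txId);
        Object response = wrapped ? new Wrapped(info) : info;
        return CompletableFuture.completedFuture(response);
    }

    /** Accepts both response shapes instead of casting blindly. */
    static SwitchInfo unwrap(Object response) {
        Object payload = response instanceof Wrapped ? ((Wrapped) response).payload : response;
        if (payload instanceof SwitchInfo) {
            return (SwitchInfo) payload;
        }
        throw new IllegalStateException("Unexpected response type: " + payload);
    }

    /** Sends one switch request; a failure is logged at WARN and still propagated, with no retry. */
    static CompletableFuture<SwitchInfo> cleanup(String txId, boolean wrapped) {
        return sendSwitch(txId, wrapped)
            .thenApply(WriteIntentSwitchSketch::unwrap)
            .whenComplete((info, err) -> {
                if (err != null) {
                    LOG.log(Level.WARNING,
                        "Cleanup failed for tx " + txId + "; the transaction outcome is not affected", err);
                }
            });
    }
}
```

Calling cleanup("tx1", false) and cleanup("tx1", true) exercises the synchronous and Raft-like response shapes respectively. In both cases switchCount grows by exactly 1, which is the property the requested test would check.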