Roman Puchkovskiy created IGNITE-19655:
------------------------------------------
Summary: Distributed Sql keeps mapping query fragments to a node that has already left
Key: IGNITE-19655
URL: https://issues.apache.org/jira/browse/IGNITE-19655
Project: Ignite
Issue Type: Bug
Reporter: Roman Puchkovskiy
Assignee: Maksim Zhuravkov
Fix For: 3.0.0-beta2

There are two test failures: [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7271211?expandCode+Inspection=true&expandBuildProblemsSection=true&hideProblemsFromDependencies=false&expandBuildTestsSection=true&hideTestsFromDependencies=false] and [https://ci.ignite.apache.org/buildConfiguration/ApacheIgnite3xGradle_Test_RunAllTests/7272905?hideProblemsFromDependencies=false&hideTestsFromDependencies=false&expandCode+Inspection=true&expandBuildProblemsSection=true&expandBuildChangesSection=true&expandBuildTestsSection=true] (org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.entriesKeepAppendedAfterSnapshotInstallation and org.apache.ignite.internal.raftsnapshot.ItTableRaftSnapshotsTest.snapshotInstallTimeoutDoesNotBreakSubsequentInstallsWhenSecondAttemptIsIdenticalToFirst, respectively).

In both cases, the test code creates a table with 3 replicas on a cluster of 3 nodes, then stops the last node and tries to perform an insert using one of the 2 remaining nodes. The RAFT majority (2 of 3) is still preserved, so the insert should succeed. The insert might be issued before the remaining nodes learn that the third node has left, so a retry mechanism is in place: it makes up to 5 attempts over almost 8 seconds in total.

But in both failed runs, all 5 attempts failed because a fragment of the INSERT query was mapped to the missing node. This seems to be bad luck (the tests pass most of the time; the failure rate is about 2.5%), but in any case the SQL engine does not seem to take into account that the node has already left.
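To illustrate the kind of check that appears to be missing, here is a minimal sketch of fragment mapping that filters out departed nodes. It is not Ignite code: the class `TopologyAwareMapper`, its methods, and the node names are hypothetical, and the real mapping logic in the SQL engine is certainly more involved. The only assumption is that some listener is notified when a node joins or leaves the topology.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper (not an Ignite API): remembers which nodes are currently in the topology. */
final class TopologyAwareMapper {
    /** Names of the nodes currently believed to be alive. */
    private final Set<String> liveNodes = ConcurrentHashMap.newKeySet();

    TopologyAwareMapper(Collection<String> initialNodes) {
        liveNodes.addAll(initialNodes);
    }

    /** Assumed to be called by a topology listener when a node joins. */
    void onNodeJoined(String nodeName) {
        liveNodes.add(nodeName);
    }

    /** Assumed to be called by a topology listener when a node leaves. */
    void onNodeLeft(String nodeName) {
        liveNodes.remove(nodeName);
    }

    /** Returns the candidate nodes that are still in the topology, preserving their order. */
    List<String> mapFragment(List<String> candidates) {
        List<String> result = new ArrayList<>();
        for (String candidate : candidates) {
            if (liveNodes.contains(candidate)) {
                result.add(candidate);
            }
        }
        if (result.isEmpty()) {
            throw new IllegalStateException("No live node available for fragment: " + candidates);
        }
        return result;
    }
}
```

With such a filter, once the departure of the third node has been observed, a fragment whose candidates are all three nodes would be mapped only to the two remaining ones, so a retry issued after that point would no longer target the missing node.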
Probably, the SQL engine should track the Logical Topology events and avoid mapping query fragments to the missing nodes.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)