Hi, Igniters! There is a known issue that ScanQuery on unstable topology returns incorrect results: duplicates data [1] or fails with an exception [2]. The reason is ScanQuery doesn't reserve partitions.
IndexQuery shares the same query processing as ScanQuery, and then it is also affected by unstable topology. I want to fix it for IndexQuery. IndexQuery should provide the same stability as SQL queries do - no occasional failures or data duplication. I dived into the SQL query processing and found that we can unify logic for SQL queries and cache queries (ScanQuery, IndexQuery, TextQuery): 1. Currently, cache queries use `GridCacheQueryAdapter#nodes` to find nodes to run a query. From the other side, SQL processing uses for the same goal `ReducePartitionMapper#nodesForPartitions`. It looks like those methods have some slight differences, but in the common, they do the same. So, I propose to make a single method for both cases. 2. SQL processing uses `PartitionReservationManager` for reserve partitions on the map side. I propose to move it to the ignite-core module and start using it for the cache queries. 3. Implement retries for the cache queries in case we failed to reserve partitions on the map side. Currently, I see a downside of reserving partitions for cache queries: cache queries are lazy. And the time of partition reservation depends on a user's application code (how fast a cursor is iterated and closed). AFAIU, it's not very good to have a partition in reserve too long. Please, correct me if I'm wrong here. But from the other side, Ignite reserves partitions for ScanQuery when a partition has been specified as a ScanQuery parameter, and Ignite reserves partitions for SQL with the flag lazy=true. Also: - IndexQuery: I expect simple queries that will return a relatively small amount of data. Then partitions wouldn't be reserved too much time. - the same is for TextQuery - it returns a limited amount of data (due to the Lucene logic). - full ScanQuery it's not in use much, AFAIK. So, it's by default a pretty heavy operation. So, I think it's safe to reserve partitions in any case. But there could be an alternative optimistic approach, smth like that: 1. Check that topology is stable and waiting while it's not stabilized. 2. Run a query on a stable cluster. 3. In cases when a cluster becomes unstable during query execution - try to reserve partitions at runtime (query should be aware of topology changes) and fail in case of reservation failure (if a user already fetched some data). I don't like the idea of this optimistic approach because in case a user got some data, we don't have a better solution than to fail a query in case of cluster instability and reservation failure. Igniters, WDYT? [1] https://issues.apache.org/jira/browse/IGNITE-12591 [2] https://issues.apache.org/jira/browse/IGNITE-16031