[DISCUSSION] Reserve partitions for CacheQueries

Maksim Timonin Wed, 08 Dec 2021 09:12:12 -0800

Hi, Igniters!

There is a known issue that ScanQuery on unstable topology returns
incorrect results: duplicates data [1] or fails with an exception [2]. The
reason is ScanQuery doesn't reserve partitions.


IndexQuery shares the same query processing as ScanQuery, and then it is
also affected by unstable topology. I want to fix it for IndexQuery.
IndexQuery should provide the same stability as SQL queries do - no
occasional failures or data duplication.

I dived into the SQL query processing and found that we can unify logic for
SQL queries and cache queries (ScanQuery, IndexQuery, TextQuery):

1. Currently, cache queries use `GridCacheQueryAdapter#nodes` to find nodes
to run a query. From the other side, SQL processing uses for the same goal
`ReducePartitionMapper#nodesForPartitions`. It looks like those methods
have some slight differences, but in the common, they do the same. So, I
propose to make a single method for both cases.

2. SQL processing uses `PartitionReservationManager` for reserve partitions
on the map side. I propose to move it to the ignite-core module and start
using it for the cache queries.

3. Implement retries for the cache queries in case we failed to reserve
partitions on the map side.

Currently, I see a downside of reserving partitions for cache queries:
cache queries are lazy. And the time of partition reservation depends on a
user's application code (how fast a cursor is iterated and closed). AFAIU,
it's not very good to have a partition in reserve too long. Please, correct
me if I'm wrong here.

But from the other side, Ignite reserves partitions for ScanQuery when a
partition has been specified as a ScanQuery parameter, and Ignite reserves
partitions for SQL with the flag lazy=true. Also:
- IndexQuery: I expect simple queries that will return a relatively small
amount of data. Then partitions wouldn't be reserved too much time.
- the same is for TextQuery - it returns a limited amount of data (due to
the Lucene logic).
- full ScanQuery it's not in use much, AFAIK. So, it's by default a pretty
heavy operation.

So, I think it's safe to reserve partitions in any case. But there could be
an alternative optimistic approach, smth like that:
1. Check that topology is stable and waiting while it's not stabilized.
2. Run a query on a stable cluster.
3. In cases when a cluster becomes unstable during query execution - try to
reserve partitions at runtime (query should be aware of topology changes)
and fail in case of reservation failure (if a user already fetched some
data).

I don't like the idea of this optimistic approach because in case a user
got some data, we don't have a better solution than to fail a query in case
of cluster instability and reservation failure.

Igniters, WDYT?

[1] https://issues.apache.org/jira/browse/IGNITE-12591
[2] https://issues.apache.org/jira/browse/IGNITE-16031

[DISCUSSION] Reserve partitions for CacheQueries

Reply via email to