[ https://issues.apache.org/jira/browse/IGNITE-21277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Maksim Zhuravkov updated IGNITE-21277: -------------------------------------- Ignite Flags: (was: Docs Required,Release Notes Required) > Sql. Partition pruning. Port partitionExtractor from AI2. > --------------------------------------------------------- > > Key: IGNITE-21277 > URL: https://issues.apache.org/jira/browse/IGNITE-21277 > Project: Ignite > Issue Type: Improvement > Components: sql > Affects Versions: 3.0.0-beta2 > Reporter: Maksim Zhuravkov > Priority: Major > Labels: ignite-3 > > In order to prune unnecessary partitions we need to obtain information that > includes possible "values" of colocation key columns from filter expressions > for every scan operators prior to statement execution. This can be > accomplished by traversing an expression tree of scan's filter and collecting > expressions with colocation key columns (this data is called partition > pruning metadata for simplicity). > 1. Implement a component that takes a physical plan and analyses filter > expressions of every scan operator and creates (if possible) an expression > that includes all colocated columns. (The PartitionExtractor from patch > (https://github.com/apache/ignite/pull/10928/files) can be used a reference > implementation). > Expression types to analyze: > * AND > * EQUALS > * IS_FALSE > * IS_NOT_DISTINCT_FROM > * IS_NOT_FALSE > * IS_NOT_TRUE > * IS_TRUE > * NOT > * OR > * SEARCH (operation that tests whether a value is included in a certain > range) > 2. Update QueryPlan to include partition pruning metadata for every scan > operator (source_id = <partition pruning metadata>). > Basic examples: > {code:java} > Statement: > SELECT * FROM t WHERE pk = 7 OR pk = 42 > Partition metadata: > t's source_id = [pk=7 || pk = 42] // Assuming colocation key is equal to > primary key, || denotes OR operation > Statement: > SELECT * FROM t WHERE pk = 7 OR col1 = 1 > Partition metadata: [] // Empty, because col1 is not part of a colocation key. > Statement: > SELECT * FROM t_colo_key1_colo_key2 WHERE colo_key1= 42 > Partition metadata: [] // Empty, because colo_key2 is missing > {code} > — > *Additional examples - partition pruning is possible* > Dynamic parameters: > {code:java} > SELECT * FROM t WHERE pk = ?1 > Partition pruning metadata: t = [ pk = ?1 ] > {code} > Colocation columns reside inside a nested expression: > {code:java} > SELECT * FROM t WHERE col1 = col2 AND (col2 = 100 AND pk = 2) > Partition pruning metadata: t = [ pk = 2 ] > {code} > Multiple keys: > {code:java} > SELECT * FROM t WHERE col_c1 = 1 AND col_c2 = 2 > Partition pruning metadata: t = [ (col_c1 = 1, col_c2 = 2) ] > {code} > Complex expression with multiple keys: > {code:java} > SELECT * FROM t WHERE (col_col1 = 100 and col_col2 = 4) OR (col_col1 = 4 and > col_col2 = 100) > Partition pruning metadata: t = [ (col_col1 = 100, col_col2 = 4) || (col_col1 > = 4, col_col2 = 100) ] > {code} > Multiple tables, assuming that filter b_id = 42 is pushed into scan b, > because a_id = b_id: > {code:java} > SELECT * FROM a JOIN b WHERE a_id = b_id AND a_id = 42 > Partition pruning metadata: a= [ a_id=42 ], b=[ b_id=42 ] > {code} > --- > *Additional examples - partition pruning is not possible* > Columns named col* are not part of colocation key: > {code:java} > SELECT * FROM t WHERE col1 = 10 > // Filter does not use colocation key columns. > Partition pruning metadata: [] > {code} > {code:java} > SELECT * FROM t WHERE col1 = col2 OR pk = 42 > // We need to scan all partitions to figure out which tuples have ‘col1 = > col2’ > Partition pruning metadata: [] > {code} > {code:java} > SELECT * FROM t WHERE col_col1 = 10 AND col_col2 OR col_col1 = 42 > // Although first expression uses all colocation key columns the second one > only uses some. > Partition pruning metadata: [] > {code} > {code:java} > SELECT * FROM t WHERE col_c1 = 1 OR col_c2 = 2 > // We need to scan all partitions to figure out which tuples have col_c1 = 1 > OR col_c2 = 2. > Partition pruning metadata: [] > {code} > {code:java} > SELECT * FROM t WHERE col_col1 = col_col2 OR col_col2 = 42 > // We need to scan all partitions to figure out which tuples have ‘col_col1 = > col_col2’ > Partition pruning metadata: [] > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)