[jira] [Updated] (IGNITE-21277) Sql. Partition pruning. Extract partition pruning information from scan operations

Maksim Zhuravkov (Jira) Tue, 16 Jan 2024 23:21:04 -0800


     [ 
https://issues.apache.org/jira/browse/IGNITE-21277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Maksim Zhuravkov updated IGNITE-21277:
--------------------------------------
    Affects Version/s: 3.0.0-beta2

> Sql. Partition pruning. Extract partition pruning information from scan 
> operations
> ----------------------------------------------------------------------------------
>
>                 Key: IGNITE-21277
>                 URL: https://issues.apache.org/jira/browse/IGNITE-21277
>             Project: Ignite
>          Issue Type: Improvement
>          Components: sql
>    Affects Versions: 3.0.0-beta2
>            Reporter: Maksim Zhuravkov
>            Priority: Major
>              Labels: ignite-3
>
> In order to prune unnecessary partitions we need to obtain information that 
> includes possible "values" of colocation key columns from filter expressions 
> for every scan operators (such as IgniteTableScan, IgniteIndexScan and 
> IgniteSystemViewScan - or simply subclasses of all 
> ProjectableFilterableTableScan) prior to statement execution. This can be 
> accomplished by traversing an expression tree of scan's filter and collecting 
> expressions with colocation key columns (This data is called partition 
> pruning metadata for simplicity).
> 1. Implement a component that takes a physical plan and analyses filter 
> expressions of every scan operator and creates (if possible) an expression 
> that includes all colocated columns. (The PartitionExtractor from patch can 
> be used a reference implementation).
> Basic example:
>  
> {code:java}
> Statement: 
> SELECT * FROM t WHERE pk = 7 OR pk = 42
> Partition metadata: 
> t's source_id = [pk=7 || pk = 42] // Assuming colocation key is equal to 
> primary key, || denotes OR operation 
> {code}
>  
> If some colocation key columns are missing from then filter, then partition 
> pruning is not possible for such operation.
> Expression types to analyze:
>  * AND
>  * EQUALS
>  * IS_FALSE
>  * IS_NOT_DISTINCT_FROM
>  * IS_NOT_FALSE
>  * IS_NOT_TRUE
>  * IS_TRUE
>  * NOT
>  * OR
>  * SEARCH (operation that tests whether a value is included in a certain 
> range)
> 2. Update QueryPlan to include partition pruning metadata for every scan 
> operator (source_id = <expression>).
> —
> *Additional examples - partition pruning is possible*
> Dynamic parameters:
> {code:java}
> SELECT * FROM t WHERE pk = ?1 
> Partition pruning metadata: t = [ pk = ?1 ]
> {code}
> Colocation columns reside inside a nested expression:
> {code:java}
> SELECT * FROM t WHERE col1 = col2 AND (col2 = 100 AND pk = 2) 
> Partition pruning metadata: t = [ pk = 2 ]
> {code}
> Multiple keys:
> {code:java}
> SELECT * FROM t WHERE col_c1 = 1 AND col_c2 = 2 
> Partition pruning metadata:  t = [ (col_c1 = 1, col_c2 = 2) ]
> {code}
> Complex expression with multiple keys:
> {code:java}
> SELECT * FROM t WHERE (col_col1 = 100 and col_col2 = 4) OR (col_col1 = 4 and 
> col_col2 = 100)
> Partition pruning metadata: t = [ (col_col1 = 100, col_col2 = 4) || (col_col1 
> = 4, col_col2 = 100) ]
> {code}
> Multiple tables, assuming that filter b_id = 42 is pushed into scan b, 
> because a_id = b_id:
> {code:java}
> SELECT * FROM a JOIN b WHERE a_id = b_id AND a_id = 42 
> Partition pruning metadata: a= [ a_id=42 ], b=[ b_id=42 ]
> {code}
> ---
> *Partition pruning is not possible*
> Columns named col* are not part of colocation key:
> {code:java}
> SELECT * FROM t WHERE col1 = 10 
> Partition pruning metadata: [] // (empty) because filter does not use 
> colocation key columns.
> {code}
> {code:java}
> SELECT * FROM t WHERE col1 = col2 OR pk = 42 
> // Pruning is not possible because we need to scan all partitions to figure 
> out which tuples have ‘col1 = col2’
> Partition pruning metadata: [] 
> {code}
> {code:java}
> SELECT * FROM t WHERE col_col1 = 10 AND col_col2 OR col_col1 = 42
> // Although first expression uses all colocation key columns the second one 
> only uses some.
> Partition pruning metadata: [] 
> {code}
> {code:java}
> SELECT * FROM t WHERE col_c1 = 1 OR col_c2 = 2 
> // Empty partition pruning metadata:  need to scan all partitions to figure 
> out which tuples have col_c1 = 1 OR col_c2 = 2.
> Partition pruning metadata: [] 
> {code}
> {code:java}
> SELECT * FROM t WHERE col_col1 = col_col2 OR col_col2 = 42 
> // Empty partition pruning metadata: need to scan all partitions to figure 
> out which tuples have ‘col_col1 = col_col2’
> Partition pruning metadata: [] 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Updated] (IGNITE-21277) Sql. Partition pruning. Extract partition pruning information from scan operations

Reply via email to