[ 
https://issues.apache.org/jira/browse/IGNITE-28199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maksim Zhuravkov updated IGNITE-28199:
--------------------------------------
    Description: 
The current implementation of the partition pruning (PP) metadata collection 
algorithm does not collect metadata for DML statements that reference multiple 
sources (see examples) or have nested queries. This limitation is a result of 
the current implementation, which has two separate paths for traversing rel 
node trees: a path for queries (PartitionPruningMetadataExtractor, which is 
also a visitor) and a path for DMLs (ModifyNodeVisitor). The path for DMLs is 
very conservative and rejects many valid cases. 

{noformat}
-- These statements have two sources each: one for ModifyNode and another for 
ScanNode. FunctionScan `breaks` the traversal that collects metadata, so the 
resulting metadata is absent:

--- expected: t(UPDATE)={id=1}, t(SELECT)={id=1}
UPDATE t SET c1=100 WHERE id=1 and c2 IN (SELECT * FROM SYSTEM_RANGE(1, 100))
--- expected: t(DELETE)={id=1}, t(SELECT)={id=1}
DELETE FROM t WHERE id=1 and c2 IN (SELECT * FROM SYSTEM_RANGE(1, 100))

--- Does not capture metadata because it has a nested query:

--- expected: t(UPDATE)={id=1}, t(SELECT)={id=1}, t2={id=42}
UPDATE t SET c1=100 WHERE id=1 and c2 IN (SELECT * FROM t2 WHERE id = 42)
{noformat}

*Proposed solution*:
Implement a unified bottom-up traversal that, in addition to collecting 
metadata from ScanNodes, propagates the collected metadata up to ModifyNodes.
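
The proposed traversal can be illustrated with a minimal, language-neutral 
sketch in Python. It is not the Ignite implementation: RelNode, collect, the 
plan shape, and the rule "propagate only when exactly one scan of the target 
table is found below the modify node" are all assumptions made for 
illustration. The sketch only shows the idea of visiting children first, 
recording metadata for scans, and reusing it at the modify node, while a node 
without metadata (such as a function scan) no longer stops the walk.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Predicate = Dict[str, int]  # key column -> constant, e.g. {"id": 1}


@dataclass
class RelNode:
    """Hypothetical, simplified rel node; not the Ignite class hierarchy."""
    source_id: int                         # unique id of the node within the plan
    kind: str                              # "modify", "scan", "function_scan", "join", ...
    table: Optional[str] = None            # set for "scan" and "modify" nodes
    predicate: Optional[Predicate] = None  # key condition, set for "scan" nodes only
    inputs: List["RelNode"] = field(default_factory=list)


def collect(node: RelNode, out: Dict[int, Tuple[str, Predicate]]) -> List[RelNode]:
    """Post-order walk. Records metadata in `out` and returns the prunable
    scans found in the subtree so that a parent modify node can reuse them."""
    scans: List[RelNode] = []
    for child in node.inputs:
        # Children first: a node without metadata (e.g. a function scan)
        # contributes nothing but does not abort the walk.
        scans.extend(collect(child, out))

    if node.kind == "scan" and node.predicate is not None:
        out[node.source_id] = (node.table, node.predicate)
        scans.append(node)
    elif node.kind == "modify":
        same_table = [s for s in scans if s.table == node.table]
        # Assumption: propagate only when exactly one scan of the target
        # table sits below the modify node; otherwise stay conservative.
        if len(same_table) == 1:
            out[node.source_id] = (node.table, same_table[0].predicate)
    return scans


# Plan shape assumed for:
# UPDATE t SET c1=100 WHERE id=1 and c2 IN (SELECT * FROM t2 WHERE id = 42)
plan = RelNode(1, "modify", table="t", inputs=[
    RelNode(2, "join", inputs=[
        RelNode(3, "scan", table="t", predicate={"id": 1}),
        RelNode(4, "scan", table="t2", predicate={"id": 42}),
    ]),
])

metadata: Dict[int, Tuple[str, Predicate]] = {}
collect(plan, metadata)
```

With this assumed plan, metadata ends up with entries for the modify node on 
t, the scan of t, and the scan of t2, which corresponds to the expected result 
of the last example. Cases where the target table is scanned more than once 
below the modify node are deliberately left without DML metadata in this 
sketch; the real algorithm would need a more precise rule.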




  was:
The current implementation of the partition pruning (PP) metadata collection 
algorithm does not collect metadata for DML statements that reference multiple 
sources (see examples) or have nested queries. This limitation is a result of 
the current implementation, which has two separate paths for traversing rel 
node trees: a path for queries (PartitionPruningMetadataExtractor, which is 
also a visitor) and a path for DMLs (ModifyNodeVisitor). The path for DMLs is 
very conservative and rejects many valid cases. 

{noformat}
-- These statements have two sources each: one for ModifyNode and another for 
ScanNode. FunctionScan `breaks` the traversal that collects metadata, so the 
resulting metadata is absent:

--- expected: t(UPDATE)={id=1}, t(SELECT)={id=1}
UPDATE t SET c1=100 WHERE id=1 and c2 IN (SELECT * FROM SYSTEM_RANGE(1, 100))
--- expected: t(DELETE)={id=1}, t(SELECT)={id=1}
DELETE FROM t WHERE id=1 and c2 IN (SELECT * FROM SYSTEM_RANGE(1, 100))

--- Does not capture metadata because it has a nested query:

--- expected: t(UPDATE)={id=1}, t(SELECT)={id=1}, t2={id=42}
UPDATE t SET c1=100 WHERE id=1 and c2 IN (SELECT * FROM t2 WHERE id = 42)
{noformat}

*Proposed solution*:
Implement a unified bottom-up traversal that, in addition to collecting 
metadata from ScanNodes, propagates that up to ModifyNodes.





> Sql. Partition pruning. Single bottom-up traversal for both queries and DMLs
> ----------------------------------------------------------------------------
>
>                 Key: IGNITE-28199
>                 URL: https://issues.apache.org/jira/browse/IGNITE-28199
>             Project: Ignite
>          Issue Type: Improvement
>          Components: sql ai3
>            Reporter: Maksim Zhuravkov
>            Priority: Major
>              Labels: ignite-3
>
> The current implementation of the partition pruning (PP) metadata collection 
> algorithm does not collect metadata for DML statements that reference 
> multiple sources (see examples) or have nested queries. This limitation is a 
> result of the current implementation, which has two separate paths for 
> traversing rel node trees: a path for queries 
> (PartitionPruningMetadataExtractor, which is also a visitor) and a path for 
> DMLs (ModifyNodeVisitor). The path for DMLs is very conservative and rejects 
> many valid cases. 
> {noformat}
> -- These statements have two sources each: one for ModifyNode and another 
> for ScanNode. FunctionScan `breaks` the traversal that collects metadata, so 
> the resulting metadata is absent:
> --- expected: t(UPDATE)={id=1}, t(SELECT)={id=1}
> UPDATE t SET c1=100 WHERE id=1 and c2 IN (SELECT * FROM SYSTEM_RANGE(1, 100))
> --- expected: t(DELETE)={id=1}, t(SELECT)={id=1}
> DELETE FROM t WHERE id=1 and c2 IN (SELECT * FROM SYSTEM_RANGE(1, 100))
> --- Does not capture metadata because it has a nested query:
> --- expected: t(UPDATE)={id=1}, t(SELECT)={id=1}, t2={id=42}
> UPDATE t SET c1=100 WHERE id=1 and c2 IN (SELECT * FROM t2 WHERE id = 42)
> {noformat}
> *Proposed solution*:
> Implement a unified bottom-up traversal that, in addition to collecting 
> metadata from ScanNodes, propagates the collected metadata up to ModifyNodes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
