Hi all,

I would like to open a discussion on FLIP-248: Introduce dynamic
partition pruning.

 Currently, Flink supports static partition pruning: the conditions in
the WHERE clause are analyzed
to determine in advance which partitions can be safely skipped in the
optimization phase.
Another common scenario: the partitions information is not available
in the optimization phase but in the execution phase.
That's the problem this FLIP is trying to solve: dynamic partition
pruning, which could reduce the partition table source IO.

The query pattern looks like:
select * from store_returns, date_dim where sr_returned_date_sk =
d_date_sk and d_year = 2000

We will introduce a mechanism for detecting dynamic partition pruning
patterns in optimization phase
and performing partition pruning at runtime by sending the dimension
table results to the SplitEnumerator
of fact table via existing coordinator mechanism.

You can find more details in FLIP-248 document[1].
Looking forward to your any feedback.

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-248%3A+Introduce+dynamic+partition+pruning
[2] POC: https://github.com/godfreyhe/flink/tree/FLIP-248


Best,
Godfrey

Reply via email to