Hi all, I would like to open a discussion on FLIP-248: Introduce dynamic partition pruning.
Currently, Flink supports static partition pruning: the conditions in the WHERE clause are analyzed to determine in advance which partitions can be safely skipped in the optimization phase. Another common scenario: the partitions information is not available in the optimization phase but in the execution phase. That's the problem this FLIP is trying to solve: dynamic partition pruning, which could reduce the partition table source IO. The query pattern looks like: select * from store_returns, date_dim where sr_returned_date_sk = d_date_sk and d_year = 2000 We will introduce a mechanism for detecting dynamic partition pruning patterns in optimization phase and performing partition pruning at runtime by sending the dimension table results to the SplitEnumerator of fact table via existing coordinator mechanism. You can find more details in FLIP-248 document[1]. Looking forward to your any feedback. [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-248%3A+Introduce+dynamic+partition+pruning [2] POC: https://github.com/godfreyhe/flink/tree/FLIP-248 Best, Godfrey