Re: [Spark Core] Adaptive dynamic partition pruning

2022-11-11 Thread Jie Han
Hmm… Sorry, I don't have an idea. Maybe we can try a subquery? I'm not sure whether it would work :( . We need help from other members of the community. > On 12 Nov 2022, at 00:10, hajyoussef amine wrote: > > Hi Jie, > Let's suppose we have ((dimension_table Join fact_table1) join fact_table2). > In the cas

Re: [Spark Core] Adaptive dynamic partition pruning

2022-11-11 Thread hajyoussef amine
Hi Jie, Let's suppose we have ((dimension_table JOIN fact_table1) JOIN fact_table2). In the case where (dimension_table JOIN fact_table1) is small enough, the result can ideally be treated as another dimension table and thus used to prune fact_table2. I don't find an easy way to implement it th
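Editor's sketch, not part of the thread: one manual way to get the effect described above is to collect the distinct partition keys of the small intermediate join and apply them as an explicit filter on the second fact table. The function names, the `part_col` parameter, and the `max_keys` threshold are all hypothetical. The sketch assumes PySpark, an inner join, and that `part_col` is the partition column of the second fact table. It runs the small join eagerly, so caching `small_df` first may be worthwhile.

```python
def collect_partition_keys(rows, max_keys=1000):
    """Return sorted distinct non-null keys, or None if there are too many to inline."""
    keys = sorted({r[0] for r in rows if r[0] is not None})
    return keys if len(keys) <= max_keys else None


def join_with_manual_pruning(small_df, fact2_df, part_col, max_keys=1000):
    """Join small_df to fact2_df, pre-filtering fact2_df on the keys found in small_df.

    small_df stands for the (dimension_table JOIN fact_table1) result and
    fact2_df for fact_table2; both are assumed to be Spark DataFrames.
    """
    from pyspark.sql import functions as F  # imported lazily; requires PySpark

    # Fetch at most max_keys + 1 distinct keys so an oversized key set is detected cheaply.
    rows = small_df.select(part_col).distinct().limit(max_keys + 1).collect()
    keys = collect_partition_keys(rows, max_keys)
    if keys is not None:
        # A literal IN filter on the partition column lets Spark skip non-matching partitions.
        fact2_df = fact2_df.where(F.col(part_col).isin(keys))
    return small_df.join(fact2_df, on=part_col)
```

When the key set exceeds `max_keys`, the sketch falls back to a plain join without the extra filter.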

Re: [Spark Core] Adaptive dynamic partition pruning

2022-11-11 Thread Jie Han
FYI, https://medium.com/@prabhakaran.electric/spark-3-0-feature-dynamic-partition-pruning-dpp-to-avoid-scanning-irrelevant-data-1a7bbd006a89 This blog may be

Re: [Spark Core] Adaptive dynamic partition pruning

2022-11-11 Thread hajyoussef amine
Hi Jie, Thank you for the response. Dynamic pruning works for the first join but not the second one. So in the example I shared above, big_table is partition-pruned but bigger_table is not. Here's the result of running explain extended on the following query: Select * FROM jlee_ntm.tt_om_po
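Editor's sketch, not part of the thread: a quick way to see which scans received a dynamic pruning filter is to search the EXPLAIN EXTENDED output for the pruning marker. The helper names are hypothetical, and the marker string `dynamicpruning` is an assumption based on how Spark 3.x typically prints these filters, so it should be checked against your own plan output.

```python
def scans_with_dpp(plan_text):
    """Return the plan lines that mention a dynamic pruning filter."""
    return [ln.strip() for ln in plan_text.splitlines() if "dynamicpruning" in ln.lower()]


def explain_extended(spark, query):
    """Return the EXPLAIN EXTENDED text for a SQL query (requires a SparkSession)."""
    # EXPLAIN returns a single row whose only column holds the plan text.
    return spark.sql("EXPLAIN EXTENDED " + query).collect()[0][0]
```

Calling `scans_with_dpp(explain_extended(spark, query))` lists the matching lines. If a line for big_table appears but none for bigger_table, that matches the behavior reported above.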

Re: [Spark Core] Adaptive dynamic partition pruning

2022-11-10 Thread Jie Han
Which version are you using? I tested it in Spark 3.2.1 and am sure that dynamic pruning works in queries with multiple joins. BTW, could you execute 'explain extended <your sql>'? > On 10 Nov 2022, at 02:10, hajyoussef amine wrote: > > Hello everyone, > > Let me take the following spark sql example to demonstrate