The scenario I am observing: I am running TPC-DS query 4 on both partitioned and non-partitioned data. In this query the same CTE, year_total, is used twice, say as first_year_total and second_year_total, and both instances are filtered on the year using the d_year column of date_dim, which restricts each instance to one year's data. Let me first write out the joins that happen here: for each instance there is a broadcast hash join with the date table, and the two instances are then joined with each other, let's say, via a shuffle hash join.
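For concreteness, here is a minimal Scala sketch of the join shape I mean. It is not the real query 4, just its skeleton: the table and column names follow the standard TPC-DS schema, and the two years are placeholders.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sum

val spark = SparkSession.builder().appName("q4-shape").getOrCreate()
import spark.implicits._

// Simplified stand-in for the year_total CTE: the fact table joined to date_dim
// and aggregated per customer for a single year.
def yearTotal(year: Int) = {
  val sales = spark.table("store_sales")
  val dates = spark.table("date_dim").filter($"d_year" === year)
  sales
    .join(dates, $"ss_sold_date_sk" === $"d_date_sk") // small dimension side -> broadcast hash join
    .groupBy($"ss_customer_sk")
    .agg(sum($"ss_net_paid").as("year_total"))
}

// The CTE is instantiated twice with different year filters ...
val firstYearTotal  = yearTotal(1999) // placeholder year
val secondYearTotal = yearTotal(2000) // placeholder year

// ... and the two instances are joined on the customer key,
// which is the join I see planned as a shuffle hash join.
val joined = firstYearTotal.join(secondYearTotal, "ss_customer_sk")
joined.explain("formatted")
```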
In the case of the partitioned data, where the data is laid out on disk with the date as the partitioning key, Spark uses dynamic partition pruning and scans only the relevant data, and then applies the shuffle hash join and the broadcast hash join. In the case of the data stored non-partitioned, it first scans the tables in full, which I can observe from the Spark SQL plan for the query, and only after scanning does it do the shuffle hash join and then the broadcast hash join. So here I have two questions:

1. In the case of partitioned data, since Spark is already doing the pruning at runtime, why does it still do the broadcast hash join with the date table?
2. The second question, which is the main one I want to ask: predicate pushdown says Spark will do the filtering as close to the source as possible. So in the non-partitioned case, why is Spark not applying the filter right after the scan, so that the later shuffle data is reduced? I am seeing twice as much shuffle data in the non-partitioned case compared to the partitioned one. Why this behaviour?
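In case it helps, this is a minimal sketch of how the relevant settings and the two scans can be compared. The table names are placeholders (one copy of the fact data assumed partitioned by ss_sold_date_sk, one flat copy); the config keys and the explain mode are the standard Spark 3.x ones.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("q4-plan-check").getOrCreate()
import spark.implicits._

// Both settings are on by default in Spark 3.x; printing them just confirms
// what the planner is allowed to do.
println(spark.conf.get("spark.sql.optimizer.dynamicPartitionPruning.enabled"))
println(spark.conf.get("spark.sql.autoBroadcastJoinThreshold"))

// Placeholder tables: the same fact data registered once partitioned by the
// date key and once flat.
val dates = spark.table("date_dim").filter($"d_year" === 1999)

val partitionedPlan = spark.table("store_sales_partitioned")
  .join(dates, $"ss_sold_date_sk" === $"d_date_sk")
val flatPlan = spark.table("store_sales_flat")
  .join(dates, $"ss_sold_date_sk" === $"d_date_sk")

// In the formatted plan, the partitioned scan lists the runtime pruning under
// "PartitionFilters" (a dynamicpruning expression), while the flat scan only
// lists "PushedFilters" for predicates on its own columns.
partitionedPlan.explain("formatted")
flatPlan.explain("formatted")
```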