Re: Predicate Push Down Vs On Clause

2019-04-28 Thread Gopal Vijayaraghavan
> Yes, both of these are valid ways of filtering data before the join in Hive. This has several implementation specifics attached to it. If you're looking at Hive 1.1 or earlier, it might not work the way Vineet described. In older versions Calcite rewrites weren't triggered, which prevented so

Re: Predicate Push Down Vs On Clause

2019-04-28 Thread Vineet Garg
Hi Varun, Yes, both of these are valid ways of filtering data before the join in Hive. As long as the join is not outer and the ON condition is not on the non-null-generating side of the join, the Hive planner will try to push the predicate down to the table scan. In fact, Hive goes one step further and also generates IS

Predicate Push Down Vs On Clause

2019-04-28 Thread Varun Rao
When performing a join in Hive and then filtering the output with a WHERE clause, the Hive compiler will try to filter data before the tables are joined. This is known as predicate pushdown (http://allabouthadoop.net/what-is-predicate-pushdown-in-hive/). For example: SELECT * FROM a JOIN b ON a.s
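
The inner-versus-outer distinction raised in this thread can be checked on result sets alone. The following is a minimal sketch that uses SQLite from Python's standard library as a stand-in for Hive. The tables `a` and `b`, their columns, and the sample rows are hypothetical and do not come from the original messages. It shows only that the two placements of a filter return the same rows for an inner join and different rows for a left outer join. It says nothing about how the Hive planner actually rewrites either query.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Hive (assumption: only the
# SQL result semantics are illustrated, not Hive's query planning).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER, val TEXT);     -- hypothetical table
CREATE TABLE b (id INTEGER, flag INTEGER); -- hypothetical table
INSERT INTO a VALUES (1, 'x'), (2, 'y'), (3, 'z');
INSERT INTO b VALUES (1, 1), (2, 0), (3, 1);
""")

def rows(sql):
    """Run a query and return all result rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Inner join: filter in WHERE versus filter in ON.
inner_where = rows(
    "SELECT a.id, b.flag FROM a JOIN b ON a.id = b.id "
    "WHERE b.flag = 1 ORDER BY a.id")
inner_on = rows(
    "SELECT a.id, b.flag FROM a JOIN b ON a.id = b.id AND b.flag = 1 "
    "ORDER BY a.id")

# Left outer join: the same two placements no longer agree, because rows of
# `a` without a qualifying match in `b` are kept and padded with NULL.
left_where = rows(
    "SELECT a.id, b.flag FROM a LEFT JOIN b ON a.id = b.id "
    "WHERE b.flag = 1 ORDER BY a.id")
left_on = rows(
    "SELECT a.id, b.flag FROM a LEFT JOIN b ON a.id = b.id AND b.flag = 1 "
    "ORDER BY a.id")

print("inner, WHERE:", inner_where)
print("inner, ON:   ", inner_on)
print("left,  WHERE:", left_where)
print("left,  ON:   ", left_on)
```

For the inner join both queries return the rows with ids 1 and 3. For the left outer join, the ON form also keeps id 2 with a NULL flag, while the WHERE form discards it after the join, which is why a planner cannot treat the two placements as interchangeable for outer joins.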