Re: How to make Spark-sql join using HashJoin

2014-10-06 Thread Liquan Pei
In Spark 1.0, outer joins are resolved to BroadcastNestedLoopJoin. You can use Spark 1.1, which resolves outer joins to hash joins. Hope this helps! Liquan

On Mon, Oct 6, 2014 at 4:20 PM, Benyi Wang wrote: > I'm using CDH 5.1.0 with Spark-1.0.0. There is spark-sql-1.0.0 in > Cloudera's maven reposit
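To see why the plan Liquan mentions matters, here is a minimal sketch of the two strategies in plain Scala (no Spark). This is illustrative only, not Spark's actual implementation: a nested-loop join compares every pair of rows (what a CartesianProduct plan effectively does before filtering), while a hash join builds a table on one side and probes it with the other.

```scala
object JoinSketch {
  type Row = (Int, String) // (joinKey, value)

  // Nested-loop join: compares every (left, right) pair, O(n * m).
  def nestedLoopJoin(left: Seq[Row], right: Seq[Row]): Seq[(Int, String, String)] =
    for {
      (lk, lv) <- left
      (rk, rv) <- right
      if lk == rk // filter applied after forming the full cross product
    } yield (lk, lv, rv)

  // Hash join: build a hash table on one side, probe with the other, O(n + m).
  def hashJoin(left: Seq[Row], right: Seq[Row]): Seq[(Int, String, String)] = {
    val table = right.groupBy(_._1) // build phase
    left.flatMap { case (lk, lv) =>  // probe phase
      table.getOrElse(lk, Seq.empty).map { case (_, rv) => (lk, lv, rv) }
    }
  }
}
```

Both return the same matches; only the amount of work differs, which is why getting the optimizer to pick a hash join matters on large tables.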

How to make Spark-sql join using HashJoin

2014-10-06 Thread Benyi Wang
I'm using CDH 5.1.0 with Spark-1.0.0. There is spark-sql-1.0.0 in Cloudera's maven repository. After putting it into the classpath, I can use spark-sql in my application. One issue is that I couldn't make the join a hash join. It gives CartesianProduct when I join two SchemaRDDs as follows: scal
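For readers hitting the same CartesianProduct plan: the optimizer can only pick a hash join when the join condition is an equality on column(s) from each side; a missing or non-equi condition forces a Cartesian plan regardless of version. In Spark 1.1+, whether an equi-join becomes a *broadcast* hash join is additionally governed by the `spark.sql.autoBroadcastJoinThreshold` setting (a size in bytes; tables smaller than it may be broadcast). A hedged sketch, assuming a `sqlContext` is in scope:

```scala
// Assumption for illustration: Spark 1.1+, with an existing sqlContext.
// Broadcast tables up to ~10 MB (value is in bytes).
sqlContext.sql("SET spark.sql.autoBroadcastJoinThreshold=10485760")

// An equi-join condition (a.id = b.id) is what allows a hash join plan;
// a condition like a.id < b.id cannot be hashed.
val joined = sqlContext.sql(
  "SELECT a.id, b.value FROM tableA a JOIN tableB b ON a.id = b.id")
```

This is a configuration fragment, not a runnable program; table names and the threshold value here are placeholders.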