Hi Alex,
Can you attach the output of sql("explain extended ").collect.foreach(println)?
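For reference, in a 1.2-era spark-shell that would look roughly like this (the tables t1/t2 and the join condition are only placeholders; substitute the query you are actually debugging):

    // sql() here is the one bound to the predefined sqlContext in spark-shell.
    // t1, t2, and the equi-join condition are hypothetical placeholders.
    val plan = sql("explain extended SELECT * FROM t1 JOIN t2 ON t1.k = t2.k")
    plan.collect().foreach(println)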
Thanks,
Yin
On Fri, Jan 16, 2015 at 1:54 PM, Alessandro Baretta wrote:
Reynold,
The source file you are directing me to is a little too terse for me to
understand what exactly is going on. Let me tell you what I'm trying to do
and what problems I'm encountering, so that you might be able to better
direct my investigation of the SparkSQL codebase.
I am computing the
To: Reynold Xin
Cc: Alessandro Baretta; dev@spark.apache.org
Subject: Re: Join implementation in SparkSQL
What Reynold is describing is a performance optimization in implementation,
but the semantics of the join (cartesian product plus relational algebra
filter) should be the same and produce the same results.
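Concretely, with two hypothetical registered tables t1 and t2, these two formulations must return exactly the same rows, even if the planner executes them very differently:

    // Equi-join written with an explicit ON clause:
    val joined = sql("SELECT * FROM t1 JOIN t2 ON t1.k = t2.k")
    // The relational-algebra reading: a filter over a cartesian product:
    val filtered = sql("SELECT * FROM t1, t2 WHERE t1.k = t2.k")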
On Thu, Jan 15, 2015 at 1:36 PM, Reynold Xin wrote:
It's a bunch of strategies defined here:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
In most common use cases (e.g. inner equi join), filters are pushed below
the join or into the join. Doing a cartesian product followed by a filter
would be prohibitively expensive.
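One way to see this for yourself, again with hypothetical tables t1 and t2, is to compare the physical plans that explain prints for an equi-join and for a non-equi join:

    // Equi-join: the equality condition lets the planner pick a hash-based
    // join, with remaining filters pushed below or into the join.
    sql("explain SELECT * FROM t1 JOIN t2 ON t1.k = t2.k").collect().foreach(println)
    // Non-equi inner join: with no equality to hash on, the plan typically
    // falls back to a CartesianProduct with a Filter on top.
    sql("explain SELECT * FROM t1 JOIN t2 ON t1.k < t2.k").collect().foreach(println)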
Hello,
Where can I find docs about how joins are implemented in SparkSQL? In
particular, I'd like to know whether they are implemented according to
their relational algebra definition as filters on top of a cartesian
product.
Thanks,
Alex