It's a bunch of strategies defined here: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
In most common use cases (e.g. an inner equi-join), filters are pushed below the join or into the join itself. Doing a cartesian product followed by a filter is too expensive.

On Thu, Jan 15, 2015 at 7:39 AM, Alessandro Baretta <alexbare...@gmail.com> wrote:
> Hello,
>
> Where can I find docs about how joins are implemented in SparkSQL? In
> particular, I'd like to know whether they are implemented according to
> their relational algebra definition as filters on top of a cartesian
> product.
>
> Thanks,
>
> Alex
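To make the cost difference concrete, here is a small self-contained Scala sketch (illustrative only, not Spark's actual implementation) contrasting the relational-algebra definition of an equi-join, a cartesian product followed by a filter, with a hash join, the kind of strategy a planner prefers when the join condition is an equality:

```scala
// Illustrative sketch, not Spark code: two ways to compute an equi-join.
object JoinSketch {
  type Row = (Int, String) // (join key, payload)

  // Relational-algebra definition: build every pair (O(n * m)),
  // then filter on the join condition afterwards.
  def cartesianThenFilter(left: Seq[Row], right: Seq[Row]): Seq[(Row, Row)] =
    (for (l <- left; r <- right) yield (l, r))
      .filter { case (l, r) => l._1 == r._1 }

  // Hash join (expected O(n + m)): build a hash table on one side and
  // probe with the other, so the equality condition is folded into the
  // lookup instead of being applied after a full cross product.
  def hashJoin(left: Seq[Row], right: Seq[Row]): Seq[(Row, Row)] = {
    val table = left.groupBy(_._1)
    for {
      r <- right
      l <- table.getOrElse(r._1, Seq.empty)
    } yield (l, r)
  }

  def main(args: Array[String]): Unit = {
    val left  = Seq((1, "a"), (2, "b"), (3, "c"))
    val right = Seq((2, "x"), (3, "y"), (4, "z"))
    // Both strategies produce the same relation, at very different cost.
    assert(cartesianThenFilter(left, right).toSet == hashJoin(left, right).toSet)
    println(hashJoin(left, right))
  }
}
```

The planner's job is exactly this kind of rewrite: recognizing that a Filter over a cartesian product with an equality predicate is equivalent to a cheaper join operator.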